Wednesday, May 29, 2013

False Negative Detection Support in IronWASP

NOTE: Before reading this post I would strongly recommend reading the introduction post that covers the basics of how web security scanners function and what False Negatives are.

'False Negative' is not a term you hear often in the security industry, and for good reason too. False Negatives eventually go on to become 0-days; they are the manifestation of the limitations in our knowledge, tools, techniques, processes and skills.

Every single day this limitation comes under attack, whether in the form of a researcher coming up with a new technique, a tool developer introducing a better algorithm or something similar. But the unknown is infinite, and for that reason False Negatives shall remain an incurable condition.
Having said that, we can get better results against False Negatives by casting a wider net; the wider, the better. This is where Anomaly detection comes into play: it's as wide as it gets. An Anomaly is simply a deviation from the normal. The basis of Anomaly-based detection is that any factor that causes an anomaly to occur is potentially a problem that requires being looked into.
In the context of web security scanning, an anomaly is the application behaving differently than it normally would. This could be in the form of the application returning a page with different text than usual, or taking a longer or shorter time to respond than usual, etc.
A web security scanner sends a wide range of payloads as part of its many vulnerability checks; these payloads cover almost the entire spectrum of what is typically considered bad input. There is a very high probability, then, that these payloads will trigger anomalies in sections of the application that are vulnerable. If these anomalies are identified and manually investigated, the tester could potentially identify vulnerabilities that the signature-based scanner misses.
As you can see, the concept of Anomaly detection is fairly straightforward and I am sure I am not the only scanner developer to think of this. In fact, during a random conversation with @skeptic_fx I discovered that he had been quietly toying with this idea in his head. This then raises the question: why has nobody implemented this already? The answer to that lies in how existing scanners are designed. They are designed to provide only a Boolean output as the result of a vulnerability scan - 'I found a vulnerability' or 'I didn't find a vulnerability'. Because of this, users are also conditioned to expect only this kind of result from the scanner. Anomaly detection does not naturally fit into this design, and retrofitting it would negatively affect the user's perception of the scanner. The user might look at the thousands of anomalies reported and declare that the scanner produces too many False Positives!

IronWASP takes a completely different approach to what results a scan must produce. Of course you have the regular list of identified vulnerabilities, like all the other scanners, but it doesn't end there. The scanner also has a treasure trove of information about how the application behaved for the different payloads it sent. This information can take away the need for any additional manual or automated fuzzing and bring down the testing effort and time by a significant margin. IronWASP has been the only scanner to make this information available to the user in a structured form suitable for analysis; the others have been wasting this data like cheese manufacturers throwing away whey!

Right from its very first version, IronWASP has been logging this information under the Scan Trace section, along with every single request and response associated with the scan. With every new version, progressive improvements have been made to the Scan Trace section; the next natural step in this trajectory is automated anomaly detection from the Scan Trace data.
Let's now see how this is implemented.

IronWASP's False Negative Detection Support:

The latest version of IronWASP has a feature named 'Payload Effect Analyzer'. Once an automated scan of an application is completed, the Payload Effect Analyzer can be launched to start an automated analysis of the scan trace data. For each payload that was sent by the scanner, the analyzer compares the corresponding response (the 'payload response') against the response received for the same request without the payload (the 'normal response'). Any variations between these two responses are most likely caused by the payload sent by the scanner. The analyzer identifies all these variations, or anomalies, from the scan trace data and lists them for manual analysis by the user.
The analyzer looks for variations in the following factors between the Normal Response and the Payload Response:

1) Response Code: Checks if the HTTP Status Codes of both the responses are different
2) Response Content: Checks if the 'payload response' has text that is missing from the 'normal response'. There is a possibility that this new text in the 'payload response' might be some kind of an exception detail or error message triggered by the injected payload, so these are picked up by the analyzer.
By default there must be at least 20 characters of text found exclusively in the 'payload response' for it to be reported. This number can however be modified by the user before starting the analysis.
3) Response Time: Checks if there is a significant difference in the time taken for the two responses to be received from the server.
By default a significant difference is one of these two:
  • The difference in the response times of the two responses is more than 1000 milliseconds
  • The response time of one response is more than 10 times the response time of the other.
These are the default settings, they can be modified by the user before starting the analysis.
4) Response Headers: Checks if any HTTP headers present in one response are missing from the other response.
5) Response Set-Cookie values: Checks if any cookie values set by one response are different from the other response.

In addition to these, the analyzer can also detect the presence of certain keywords in the payload response that were not present in the normal response. By default the keywords that are searched for are 'error', 'exception', 'not allowed', 'unauthorized', 'blocked', 'filtered', 'attack', 'unexpected', 'sql', 'database' and 'failed'. The user can update this list before starting the analysis.
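The comparison logic described above can be sketched in Python roughly as follows. This is only an illustrative sketch of the technique, not IronWASP's actual implementation; the function names are invented and the thresholds mirror the defaults described in the text.

```python
# Illustrative sketch of the per-payload anomaly checks described above.
# NOT IronWASP's actual code; names are invented, thresholds are the
# documented defaults.

MIN_NEW_CHARS = 20      # minimum new text in the payload response
TIME_DIFF_MS = 1000     # absolute response time difference threshold
TIME_RATIO = 10         # relative response time difference threshold
KEYWORDS = ["error", "exception", "not allowed", "unauthorized", "blocked",
            "filtered", "attack", "unexpected", "sql", "database", "failed"]

def new_text(normal_body, payload_body):
    # Naive stand-in for a real diff: words found only in the payload response.
    normal_words = set(normal_body.split())
    return " ".join(w for w in payload_body.split() if w not in normal_words)

def analyze(normal, payload):
    """Compare a normal response against a payload response.

    Both arguments are dicts with 'code', 'body' and 'time_ms' keys."""
    anomalies = []
    if normal["code"] != payload["code"]:
        anomalies.append("Code Variation")
    extra = new_text(normal["body"], payload["body"])
    if len(extra) >= MIN_NEW_CHARS:
        anomalies.append("Body Variation (%d chars)" % len(extra))
    inserted = [k for k in KEYWORDS
                if k in payload["body"].lower() and k not in normal["body"].lower()]
    if inserted:
        anomalies.append("Keywords Inserted: " + ", ".join(inserted))
    slow = max(normal["time_ms"], payload["time_ms"])
    fast = min(normal["time_ms"], payload["time_ms"])
    if slow - fast > TIME_DIFF_MS or (fast > 0 and slow / fast > TIME_RATio if False else fast > 0 and slow / fast > TIME_RATIO):
        anomalies.append("Time Variation (%d ms)" % (slow - fast))
    return anomalies
```

A payload that flips the status code, injects an SQL error message and slows the server down would trip the code, body, keyword and time checks all at once, which is exactly the kind of entry the analyzer surfaces for manual review.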
Now that we have looked at the theory behind all this, let's see how this feature can be used.
As mentioned earlier, before the Payload Effect Analyzer can be launched, an automated vulnerability scan of the target application must be performed. This article explains how to start an automated scan using IronWASP.
After the scan is complete, head to the 'Scan Trace' section located inside the 'Automated Scanning' section and click on the 'Launch Payload Effect Analyzer' button. This will open the analyzer in a new window.
Clicking the 'Launch Payload Effect Analyzer' button opens the Payload Effect Analyzer in a new window

The scan trace selection settings available in the Payload Effect Analyzer window; the default settings cover the entire scan trace database
The analyzer window gives a few options to narrow down which sections of the scan trace must be included in the analysis. The user can provide the range of scan trace IDs to analyze and/or specify which vulnerability checks' scan traces must be included. If the user is not satisfied with the payloads that the existing vulnerability checks send, then a new vulnerability check can be created that sends payloads provided by the user. Creating this new check does not require any programming knowledge and can be done in less than a minute using the 'Active Plugin Creation Assistant' utility. This utility can be accessed from the 'Coding Assistant' section of the 'Dev Tools' menu.

If the user wants to change the default configuration settings of the analyzer before starting the analysis, it can be done by clicking on the 'Show Config Panel' link.
Options to configure the Payload Effect Analyzer
The analysis can now be started by clicking on the 'Start Analysis' button. As the analysis proceeds, any anomalies detected are immediately reported to the user.
Analysis is in progress, anomalies identified so far are already reported
The results list contains brief information about why a particular entry was added. It provides the ID of the scan trace on which the anomaly was detected, and the following six fields explain why it was reported.
1) Code Variation: this field indicates if any of the payloads caused the application to return a response with a different HTTP status code than the normal response.
2) Keywords Inserted: this field indicates if the payloads caused the application to return a response that contained any of the keywords in the analysis configuration.
3) Body Variation: if any of the payloads caused a significant change in the response content then this field is populated with the number of characters of difference between the payload response and normal response. If more than one payload caused a significant difference in response content then the maximum change is shown here.
4) Set-Cookie Variations: this field indicates whether any of the payloads caused a difference in the Set-Cookie HTTP header section of the responses.
The differences that are picked up are:
  • The normal response did not have a Set-Cookie header but the payload response does
  • The normal response had a Set-Cookie header but the payload response does not
  • Both the normal response and the payload response have Set-Cookie headers but their values are different
5) Headers Variation: this field indicates whether any of the payloads caused a difference in the HTTP header section of the responses.
The differences that are picked up are:
  • A header present in the normal response is missing from the payload response
  • A header that was not present in the normal response is present in the payload response
6) Time Variation: if any of the payloads caused a significant change in the time taken for the response to be returned then this field is populated with the number of milliseconds of difference between the payload response and normal response. If more than one payload caused a significant difference in response time then the maximum change is shown here.
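The header and Set-Cookie comparisons listed above can be sketched as follows. Again this is only an illustrative sketch with invented names, not IronWASP's actual code; it assumes the headers and cookies have already been parsed into name/value dicts.

```python
# Illustrative sketch of the header and Set-Cookie variation checks
# described above. NOT IronWASP's actual code.

def header_variations(normal_headers, payload_headers):
    """Both arguments are dicts of header name -> value."""
    variations = []
    for name in normal_headers:
        if name not in payload_headers:
            variations.append("Header '%s' missing from payload response" % name)
    for name in payload_headers:
        if name not in normal_headers:
            variations.append("Header '%s' new in payload response" % name)
    return variations

def cookie_variations(normal_cookies, payload_cookies):
    """Both arguments are dicts of cookie name -> value (from Set-Cookie)."""
    variations = []
    for name, value in payload_cookies.items():
        if name not in normal_cookies:
            variations.append("Cookie '%s' set only in payload response" % name)
        elif normal_cookies[name] != value:
            variations.append("Cookie '%s' has a different value" % name)
    for name in normal_cookies:
        if name not in payload_cookies:
            variations.append("Cookie '%s' set only in normal response" % name)
    return variations
```

In the login-bypass example discussed later in this post, the two session cookies appearing only in the payload response are precisely the kind of variation the second function would flag.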
Clicking on any of these result entries shows more details about the selected entry. Let's pick one and analyze it further.
Summary of selected result
The summary section explains what anomalies were discovered in this particular trace entry. In this case there are variations in the response code, content, headers, cookie values and response times. Also, occurrences of two of the keywords have been detected.
Let's probe further into one of these anomalies. The summary section says that some payload caused the application to return a response that contained two new cookie values: 'amUserInfo' and 'amUserId'. To learn more, let's go to the 'Set-Cookie Variations' tab.
Set-Cookie variations section of the selected result. This section shows the exact payload sent and the corresponding Set-Cookie variation
Here you can see the exact payloads that triggered the introduction of these new cookies. The Log ID of the corresponding request and response is also shown here so we can take a look at them. The logs associated with this scan trace entry can be viewed by clicking on the 'Load Selected Trace Entry in Trace Viewer' button; this opens the selected trace in a separate window.
Scan Trace Viewer showing all the requests and responses associated with this scan
In the Trace Viewer we can locate the exact log using the Log ID observed earlier. Clicking on it shows the associated request and response below. By looking at the request we learn that this is actually a login request (the application being scanned in this case is a test site maintained by the IBM AppScan team).
The associated response shows that the application is redirecting us to the welcome page, which is only accessible after successful authentication, and the cookies being set are to maintain our new authenticated session. Based on this we can deduce that it is possible to bypass authentication on the login page using SQL Injection. IronWASP does report that the login section is vulnerable to SQL Injection, but it does not detect the authentication bypass. By using the Payload Effect Analyzer we are able to pick this up quite easily.
If you want to check how different this particular response is from the responses received for some of the other payloads, you can do that in three simple steps.
Step 1: Change the 'Click Action' to 'Select Log', this enables the selection of multiple logs.
Step 2: Click on the two logs which must be compared. A check mark will be placed on the selected logs.
Step 3: After selecting exactly two logs click on the 'Diff Selected Sessions' button. This will open a new window that will highlight all the differences in the requests and responses between the selected logs.
Once 'Click Action' is set to 'Select Log' it is possible to select multiple logs and do a diff on the requests and responses.
Differences in the requests from the selected logs are highlighted.
Differences in the responses from the selected logs are highlighted
This was a simple scenario just to illustrate how the analyzer works. In the real world this feature can help you identify anomalies that you might miss even during manual testing. And if used properly, it can remove the monotonous, repetitive manual payload injection work from your tests, reserving manual injection for only those cases where the analyzer detects anomalies.

False Positive Detection Support in IronWASP

NOTE: Before reading this post I would strongly recommend reading the introduction post that covers the basics of how web security scanners function and what False Positives are.

When a scanner reports a vulnerability, the user is left with the responsibility of determining if the reported vulnerability actually exists. How the user performs this task depends on how deeply the user understands web security and the vulnerability in question. Most non-security users (Functionality Testers/Developers/QA etc.) are left scratching their heads at this point.

Even for a skilled penetration tester this task isn't exactly a walk in the park. Now would you believe me if I said that for a penetration tester, writing off a reported issue as a false positive can at times be trickier than discovering a similar vulnerability manually? Let me explain why that is the case.
When trying to manually discover a vulnerability, a tester would perform a series of probes and observe how the application behaves. If the probes elicited a favourable behaviour then the tester does more tests to confirm the presence of a vulnerability. If the probes did not create an impact on the application's behaviour then that section of the site is termed secure and the tester starts probing another section.

But when it comes to testing for false positives the tester has to perform an additional step. If the manual probes don't indicate the presence of the reported vulnerability then the tester has to come up with a reason for what might have caused the scanner to report it incorrectly. Unless this can be done there hangs a cloud of suspicion around the issue. What if the tester did not know about a new or lesser known technique by which this vulnerability was reported by the scanner? What if the tester is wrong and the scanner is right!

To do this additional step the tester has to know exactly how the scanner detected the issue and why it reported it. The tester can only get this information from the vulnerability summary provided by the scanner and from the request and response pairs that are usually included along with the vulnerability summary. How useful and clear this information turns out to be depends on the scanner and the reported issue. Most black-box security scanners aspire to be black-boxes themselves, so they don't try to be generous with information about the detection techniques used.

In my observation, the vulnerability summary information of Burp Scanner and Netsparker, two scanners whose authors have been penetration testers themselves, did a good job of explaining how a reported issue was detected.

IronWASP's False Positive Detection Support:

IronWASP helps the user with this process by doing two things:
  1. Explaining exactly how IronWASP detected this issue and why it was reported.
  2. Giving instructions on how to manually test and determine if the reported issue is a False Positive.
The following screenshot shows the information included in the description of a Command Injection vulnerability detected by IronWASP on a test site.

When checking for a vulnerability IronWASP typically uses more than one technique; in fact, for detecting SQL Injection IronWASP uses 5 different techniques. This is done to ensure coverage, so that if one technique fails to identify an issue then another technique might pick it up.

When a vulnerability is reported from a scan, the reported issue has a list of reasons based on which IronWASP determined there was a vulnerability. Each detection technique that succeeded provides its own reason here. The reason section has detailed information on what payload was sent, information about the payload, what analysis was done on the response that came back, and how IronWASP inferred the presence of a vulnerability. This description is given in simple and clear language so that it makes sense even to non-security users.

There are no blanks to fill or dots to connect; a penetration tester reading this would know beyond doubt exactly why the issue was reported, and this makes it easy for them to reason about why the scanner might have made a wrong detection in a particular case. This might not be a ground-breaking difference from what the other scanners do, but the real-world benefit it provides is surprisingly significant. The incremental benefit of this approach would accumulate into hours of testing time saved per assessment.

Now for the first time non-security users have a realistic shot at picking out false positives reported by a scanner. This is because each reason section also contains simple and precise instructions on how to manually test and determine if the reported issue is a False Positive as per the detection technique used.

If the instructions for False Positive Check require the user to resend the same payload again or send a payload with modifications then it can be done from the Manual Testing section of IronWASP. The following screenshot shows how the user can use one of the requests sent by the scanner as a starting point for this process.

In addition to these, users with a greater appetite for information might find the scan trace section to be an added bonus. This section contains brief information about all the individual tests performed (successful as well as unsuccessful). To get a better idea, please refer to the screenshot below of the test trace of the same Command Injection vulnerability.


IronWASP v0.9.6.0 with False-Positive and False-Negative Detection Support

The newest version of IronWASP (v0.9.6.0) comes with many improvements and features, like support for CLI-based modules and the Module Creation Assistant. There is also a new module included with this version: OWASP Skanda - SSRF Exploitation Framework.
But what makes this version very special is that it comes with two exclusive features that set it apart from all other web security scanners available today.

They are:
  • False Positive Detection Support
  • False Negative Detection Support

* Please note that in the current version these features don't apply to Cross-site Scripting vulnerabilities.

The False Positive Detection Support is provided by the scanner giving precise and detailed information on how a vulnerability was detected and why it was reported along with instructions on how to test if it is a False Positive.
The False Negative Detection Support is made possible through Anomaly detection. This is most likely the first time the Anomaly detection technique has been used in the context of web security scanning.
Details on how these systems function and achieve their claimed goals are available below. But before that, if you are not very familiar with how web security scanners work and why False Positives and False Negatives occur, the next section will bring you up to speed.
The Basics:
False Positives and False Negatives are an unfortunate reality with web vulnerability scanners. Before we delve into the details let's clarify the terminology first.

False Positive:
When a scanner reports that a particular vulnerability is present on the scanned application but in reality this vulnerability does not exist in the application, it is called a False Positive.

False Positives occur when a scanner incorrectly determines that a vulnerability is present in an application.

False Negative:
When a vulnerability is actually present in an application but a scanner fails to detect its presence, it is called a False Negative.

False Negatives occur when a scanner fails to detect an actual vulnerability present in an application.
All automated web security scanners available right now produce false positives and false negatives. As a matter of fact, if Einstein, Tesla and Hawking sat down together and wrote a scanner, even that would have false positives and false negatives.
To understand why that is the case we must first look at how web security scanners work. Scanners have a store of vulnerability signatures inside them. They send payloads to the application, observe the responses that come back and check if the way the application behaved matched any of the stored signatures. When you dig beneath all the buzzwords, jargon and marketing hype, this is what you will find at the core of any open source or commercial scanner.
A simple signature could be that if the application returned the string 'Incorrect syntax near' when the ' (single quote) character was sent as a payload, then it indicates the presence of Error-based SQL Injection.
Signatures can be orders of magnitude more complicated than the example cited above, but all of them are ultimately very similar in principle. A signature, no matter how complex, is a predetermined template defining how an application with a particular vulnerability would behave.
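As an illustration, the simple signature above could be expressed roughly like this. The `send_request` helper is hypothetical; this is a sketch of the idea, not any scanner's real code.

```python
import re

# Toy error-based SQL Injection signature, per the example above.
# Illustrative only; real scanners maintain large signature databases.
PAYLOAD = "'"
SIGNATURE = re.compile(r"Incorrect syntax near", re.IGNORECASE)

def check_error_based_sqli(send_request, parameter):
    """send_request(parameter, value) is a hypothetical helper that
    injects `value` into `parameter` and returns the response body."""
    body = send_request(parameter, PAYLOAD)
    return SIGNATURE.search(body) is not None
```

The check passes or fails purely on whether the predetermined string appears in the response, which is exactly why signatures are templates rather than proofs.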

Notice the emphasis on the word predetermined; that is where the trouble begins. Web applications are not made from a static mould. Each application is designed differently and behaves differently. Everything from the content, the way it is laid out, the logical flow, to the way errors are handled is different in each application. Given this factor, it is not possible to come up with a template that could accurately apply to all web applications.

Making the template very strict and specific could reduce the number of False Positives reported by the scanner but doing that would result in a lot of False Negatives. Making the template relaxed might reduce the possibilities for False Negatives but would greatly increase the False Positive count. So in general, when creating the signatures, the scanner developers settle on an optimal point between the two extremes explained above.
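The trade-off can be made concrete with two hypothetical signatures for the same vulnerability class. The patterns below are invented for illustration and not taken from any real scanner.

```python
import re

# Strict: matches one exact MSSQL error string. Few False Positives,
# but misses errors from other databases (False Negatives).
strict = re.compile(r"Incorrect syntax near '.*'")

# Relaxed: matches any mention of "error". Catches more databases,
# but also flags harmless pages (False Positives).
relaxed = re.compile(r"error", re.IGNORECASE)

# A real MySQL injection error: the strict MSSQL-only signature misses it.
mysql_error = "You have an error in your SQL syntax near ''1''"

# Harmless help text: the relaxed signature wrongly flags it.
help_page = "Contact the DBA if you see a database connection error."
```

Here the strict pattern produces a False Negative on the MySQL error while the relaxed pattern produces a False Positive on the help page, which is the tension every signature author has to settle.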
Now that we have dabbled a bit in the basics of how web security scanners work, we can look at how the newest version of IronWASP helps users deal with the inevitable problems of False Positives and False Negatives.

I have explained this in two separate posts: