Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set match_limit and match_limit_recursion values to regex variables #2071

Closed

Conversation

airween
Copy link
Member

@airween airween commented Apr 16, 2019

Today there were some regex related issues under owasp-modsecurity-crs, especially:

SpiderLabs/owasp-modsecurity-crs#1354
SpiderLabs/owasp-modsecurity-crs#1356
SpiderLabs/owasp-modsecurity-crs#1357
SpiderLabs/owasp-modsecurity-crs#1358
SpiderLabs/owasp-modsecurity-crs#1359

These issues affected libmodsecurity3, because some rule can triggers the catastrophic backtrace.

The "bug" exists because there aren't set up the match_limit and match_limit_recursion members and flags on the pcre_study object. The result in practice will an overloaded HTTP daemon, which uses libmodsecurity3, eg. Nginx.

As a quick fix, I've set up the default values as 1500 for both values.

ModSecurity2 can avoid the overloaded state with SecPcreMatchLimit and SecPcreMatchLimitRecursion

@zimmerle
Copy link
Contributor

Hi @airween,

ModSecurity shares memory space within the NGINX, therefore setting this on ModSecurity will impact on NGINX behavior which has its own constraints. That used to be a problem with ModSecurity v2 especially on php was handling its own limits. That was the main reason why this was not taken into consideration on v3. This subject was discussed back on 2013.

To illustrate: #56, #267, #1176, #1481, and #1290

Setting those values is no guarantee that it will solve the problem. Yet can impact very badly in other modules that use PCRE. The PCRE itself already contains hardcoded limits that can be changed in compilation time. Apparently, those are not enough and lead to the recursion problem. We cannot guarantee that any other module or NGINX itself will not change those values. Potentially leading the less experienced user to believe that he/she is protected when in reality they are not.

We try to abstract complexity for those who may write the rules, but I am afraid that we won't be able to help in this one. I understand that you want to protect ModSecurity instance from particular regular expressions that may not be written in an elegant shape. Or am I wrong?

I am wondering if there is another way to solve that without setting a limit that we don't have any control and potentially lead to unexpected behavior. I am closing the pull request for the reasons explained and opening a new issue to discuss that matter.

From pcre manual

"The default value for the limit can be set when PCRE is built; the default default is 10 million, which
handles all but the most extreme cases. You can override the default by suppling pcre_exec() with a 
pcre_extra block in which match_limit is set, and PCRE_EXTRA_MATCH_LIMIT is set in the flags
field. If the limit is exceeded, pcre_exec() returns PCRE_ERROR_MATCHLIMIT."

Limiting the recursion depth limits the amount of stack that can be used, or, when PCRE has been 
compiled to use memory on the heap instead of the stack, the amount of heap memory that can be
used.

The default value for match_limit_recursion can be set when PCRE is built; the default default is the
same value as the default for match_limit. You can override the default by suppling pcre_exec() with
a pcre_extra block in which match_limit_recursion is set, and PCRE_EXTRA_MATCH_LIMIT_RECURSION is set in the flags field. If the limit is exceeded,
pcre_exec() returns PCRE_ERROR_RECURSIONLIMIT.

@zimmerle zimmerle closed this Apr 17, 2019
@zimmerle zimmerle self-assigned this Apr 17, 2019
@zimmerle zimmerle self-requested a review April 17, 2019 00:11
@zimmerle zimmerle added 3.x Related to ModSecurity version 3.x workaround available The issue has either a temporary or permanent workaround available labels Apr 17, 2019
@zimmerle
Copy link
Contributor

@s0md3v -- mentioning so you got a notification, as you are the reporter of the issues cited by @airween. Again, thank you for your work. I am sure a lot of people will be benefited from it.

@fgsch
Copy link

fgsch commented Apr 17, 2019

@zimmerle I'm not sure I understand. Those limits are specific to the pcre_exec() call (the extra argument). How changing the limits in modsecurity will affect nginx, or anything else for the matter?

@s0md3v
Copy link

s0md3v commented Apr 18, 2019

I think PCRE limits should be set but if you believe that it breaks other components of a web server (which it shouldn't), you can do the following:

  1. Add (x+)+y as a rule
  2. Add PCRE limits
  3. Make a request with the string `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz
  4. Does it cause problems in other components of the web server?
    • Yes, it breaks stuff. Don't merge this PR.
    • No, everything works as intended. Merge this PR.

I haven't looked into the code but it's really crucial what you do after the timeout. Ideally, the request should be blocked if the regex matching times out otherwise it can be used to bypass the WAF itself.

@airween
Copy link
Member Author

airween commented Apr 18, 2019

Hi @zimmerle,

thanks for detailed response.

Sorry to tell you, but I'm not sure that understand you completely.

You've listes some issues, I've reviwed them, but can't compare anyone with this one.

#56:
issue: Nginx eats cpu
solution: PR #67, added response body filter, plus PR #96, which also modified only Nginx

#267
issue: same problem like @s0md3v found
solution: added SecPcreMatchLimit and SecPcreMatchLimitRecursion (didn't found PR)

#1176
issue: the request triggers "PCRE limit exceeded" messages
solution: no solution, "It worth to check." :)

#1481
issue: ModSecurity: SecPcreMatchLimit not allowed in VirtualHost
solution: SecPcreMatchLimit and SecPcreMatchLimitRecursion are globals, can't use them in VirtualHost. This is a strict rule, I agree.

#1290
issue: SecPcreMatchLimit and SecPcreMatchLimitRecursion not follwed in modsecurity.conf
solution: not sure that there was any solution, @bostrt wrote that needs to build modsec with macros.

Note for #1176: I think that this message in the log is important and relevant. It informs the admin, who runs ModSec. Till we don't know, what casuses this problem, we can't speak about PHP or any other components modify those limits. Till we don't know any other about the cause, we can't imagine the real problem (I mean it would be good to see the other logs, where exceeds the limit...)

I think that from the issues above only one affects same problem like here, and the solution was added SecPcreMatchLimit and SecPcreMatchLimitRecursion options. May be I don't see somthing, or don't understand some important thing... really sorry. If you will have some free time, please help me to explain them.

I understand and agree, that this modification should breaks another modules, we don't know that now. In case of v2 and Apache, I didn't found any issue, which confirms this idea, but can't close out, of course. (And this is a rather weak argument... :))

May be we should try it in v3, discuss with PCRE devs about hardcoded limits, which can cause any effect.

In some cases you as the library's main developer can decide that from this point you will not follow the old hacks - I totally agree. But where should users know that these options aren't supported? From an outdated documentation? :)

I think, we have to solve this issue somehow. My suggestion:

  • add these two configuration directives to v3, but only if the user enables them in compilation time; the default should "not supported"
  • if directives added, I would NOT let user to add default value, eg. in compilation time also
  • if directives added, but not used in config, then they will not have effect
  • if directives added, and user set them with explicit value, then they will have effect (including logs the reach of limit recursion...)
  • start to make the documentation for v3 (and v3.1... :)), add extra highligts to differences between v2 and v3 (and v3 and v3.1)

What do you think? :)

@airween
Copy link
Member Author

airween commented Apr 18, 2019

@fgsch
here is a very ugly example:
https://gist.github.com/airween/b3a158bb21938a4e1c8a41c1568ed9c7

Consider that the ugly() function is a small part of a library, and it has an argument, what you can use to set up a 'limit'. The main() function calls it few times, passes the argument, and you think that you've set up the necessary value. You can see, that the value will not change if you pass 0.

I think this is a very wrong example, I don't think that there are any similar solution eg. in PHP, or other modules... but you can't exclude the case, when somebody forgot something - we can call it as 'bug' :)

HTH.

@airween
Copy link
Member Author

airween commented Apr 18, 2019

@s0md3v

   * No, everything works as intended. Merge this PR.

it's not that simple. :)

There should be many other case, what you can't cause with only one test. For example, you don't know all nginx module (what does it mean all module?), and you can't test the combinations of (theoretically all) modules, and all available inputs...

@zimmerle
Copy link
Contributor

@zimmerle I'm not sure I understand. Those limits are specific to the pcre_exec() call (the extra argument). How changing the limits in modsecurity will affect nginx, or anything else for the matter?

Hi @fgsch

It is wrong to think that there isn’t a limit. There is. The limit is set by the PCRE library itself. Maybe in the case that you have tested these limits were not low enough. Because it depends on the conjunction of two different things: Input and Regex. Occasionally, the value may be too low depending on the factors listed. Of course all that depends on the factor that you have to accept a regex that might not take the ReDoS scenario into account.

In ModSecurity v2, the limits could be set in different manners:

  • Considers the default values set on pcre;
  • Considers the values informed by the user during the pcre compilation;
  • Considers the value informed ModSecurity compilation;
  • Considers the value informed during the rules load.

In ModSecurity v3 the user can set the limits by:

  • Considering the values set on pcre;
  • Considering the values informed by the user during the pcre compilation.

We have all those different options to set the limits, however, even having those options, there aren't "correct values" per se. Is more about an educated guess. A value that a inexperienced user may not have a clue about - not to mention that we need to support: ARM, Windows, HPUX, Solaris, etc….. That kind of configuration forces the user to consult an expert, it is not too userfriendly. Depending on the scenario, could lead to a circumstance where the user thinks he is protected but in the reality he is not.

As soon as I have some time next week I'll try to dig the historical reasons behind this rationale but we had users complaining that setting pcre limits on PHP was affecting ModSecurity as well. Generally, those limits raise doubt and questions from users.

IMHO we should take the heavy lifting to ourselves and implement something innovative that will make these settings friendlier to our users. We can look at v2 to find what to do and what not to do and take this opportunity to think how we can make it better :)

@zimmerle
Copy link
Contributor

I think PCRE limits should be set but if you believe that it breaks other components of a web server (which it shouldn't), you can do the following:

  1. Add (x+)+y as a rule

  2. Add PCRE limits

  3. Make a request with the string `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz

  4. Does it cause problems in other components of the web server?

    • Yes, it breaks stuff. Don't merge this PR.
    • No, everything works as intended. Merge this PR.

I haven't looked into the code but it's really crucial what you do after the timeout. Ideally, the request should be blocked if the regex matching times out otherwise it can be used to bypass the WAF itself.

I am afraid I have to disagree with you. I think we should not accept regex that could lead to ReDoS at all. For varied reasons:

  • Have a quantization for the limit is an equation that is based on an educated guess that demands skills and previous knowledge. That goes against one of the v3 objectives: to be more user-friendly.
  • Considering that we have a value to be set, what should be the default value? How to guarantee that this same value is ok regardless of platform? There won’t be a regex and/or input value that will be an exception to this value?
  • A rules load problem treated on run time: At my perspective, we should avoid loading a rule that contains a potentially dangerous regular expression. Instead, we should inform the user to write the regex in an elegant shape. Having this validation in run time creates an additional step which is to treat the circumstances where the rule was blocked due to reaching the limits of ReDoS.
  • How HyperScan and RE2 treat it? We are close to adding support to those; the limits are an option on them?
  • We perform an educated guess on the number based on the amount of time to validate a regex; having said that, for other reasons the workers already contains an execution timeout. Right?

Having said all that, ideally we shouldn't ignore regexes that do not take into consideration REDoS risks.

@zimmerle
Copy link
Contributor

Hi @zimmerle,

thanks for detailed response.

Sorry to tell you, but I'm not sure that understand you completely.

You've listes some issues, I've reviwed them, but can't compare anyone with this one.

#56:
issue: Nginx eats cpu
solution: PR #67, added response body filter, plus PR #96, which also modified only Nginx

#267
issue: same problem like @s0md3v found
solution: added SecPcreMatchLimit and SecPcreMatchLimitRecursion (didn't found PR)

#1176
issue: the request triggers "PCRE limit exceeded" messages
solution: no solution, "It worth to check." :)

#1481
issue: ModSecurity: SecPcreMatchLimit not allowed in VirtualHost
solution: SecPcreMatchLimit and SecPcreMatchLimitRecursion are globals, can't use them in VirtualHost. This is a strict rule, I agree.

#1290
issue: SecPcreMatchLimit and SecPcreMatchLimitRecursion not follwed in modsecurity.conf
solution: not sure that there was any solution, @bostrt wrote that needs to build modsec with macros.

Note for #1176: I think that this message in the log is important and relevant. It informs the admin, who runs ModSec. Till we don't know, what casuses this problem, we can't speak about PHP or any other components modify those limits. Till we don't know any other about the cause, we can't imagine the real problem (I mean it would be good to see the other logs, where exceeds the limit...)

I think that from the issues above only one affects same problem like here, and the solution was added SecPcreMatchLimit and SecPcreMatchLimitRecursion options. May be I don't see somthing, or don't understand some important thing... really sorry. If you will have some free time, please help me to explain them.

I understand and agree, that this modification should breaks another modules, we don't know that now. In case of v2 and Apache, I didn't found any issue, which confirms this idea, but can't close out, of course. (And this is a rather weak argument... :))

May be we should try it in v3, discuss with PCRE devs about hardcoded limits, which can cause any effect.

In some cases you as the library's main developer can decide that from this point you will not follow the old hacks - I totally agree. But where should users know that these options aren't supported? From an outdated documentation? :)

I think, we have to solve this issue somehow. My suggestion:

  • add these two configuration directives to v3, but only if the user enables them in compilation time; the default should "not supported"
  • if directives added, I would NOT let user to add default value, eg. in compilation time also
  • if directives added, but not used in config, then they will not have effect
  • if directives added, and user set them with explicit value, then they will have effect (including logs the reach of limit recursion...)
  • start to make the documentation for v3 (and v3.1... :)), add extra highligts to differences between v2 and v3 (and v3 and v3.1)

What do you think? :)

As stated in the reply of other users in this thread I think we should spend efforts to have the rules that contain a possible ReDoS to be promptly displayed to the user and refused to be loaded. Meaning: Treat the problem while loading the rules, not at runtime. There are other mechanisms to kill a process for a timeout. Rules that can lead to a problem - and could be promptly identified - should not be loaded in the first place.

You mentioned that the documentation does not state that PCRE limits are not supported. Reading the Reference Manual, it shows as "To be Implemented" (TBI). The code also has comments in that regards as well. @airween I am sorry if you are disappointed given the fact that I am not accepting your PR promptly. The reason for that is so everybody has the chance to read and discuss. At the end of the day, we are not advocating on our own but to have the best WAF for our users. Sometimes it is frustrating even to me. The decision is ours after long sessions of brainstorming ideas, which is what I am doing right now.

If you have any further doubt, I will be more than glad to answer and discuss, but let's not be incisive with assumptions.

@zimmerle
Copy link
Contributor

@fgsch @airween @s0md3v

Sorry for being somewhat brief on the explanations here. Away from my keyboards due to Easter holidays Are you guys available to a quick hangout session on Monday? What do you think about promptly identifying the possible ReDoS regex?

Can anybody send me a rule and a cURL command or test case that trigger the issue on v3+nginx? Please do not publish here, send me over email. My key is 0xB11277.

Thank you.

@airween
Copy link
Member Author

airween commented Apr 21, 2019

Hi @zimmerle,

As stated in the reply of other users in this thread I think we should spend efforts to have the rules that contain a possible ReDoS to be promptly displayed to the user and refused to be loaded. Meaning: Treat the problem while loading the rules, not at runtime. There are other mechanisms to kill a process for a timeout. Rules that can lead to a problem - and could be promptly identified - should not be loaded in the first place.

yes, I think I understood this after your first note under the #2072 - see my comment

You mentioned that the documentation does not state that PCRE limits are not supported. Reading the Reference Manual, it shows as "To be Implemented" (TBI). The code also has comments in that regards as well.

yes, you're right - sorry for this mistake.

@airween I am sorry if you are disappointed given the fact that I am not accepting your PR promptly.

I was not disappointed :). I'm not aiming to flood the project with PR's, but rather to rational repair and learning. And didn't sent it that you merge it promptly, my goal was to show a possible solution, and start the discuss.

The reason for that is so everybody has the chance to read and discuss. At the end of the day, we are not advocating on our own but to have the best WAF for our users. Sometimes it is frustrating even to me. The decision is ours after long sessions of brainstorming ideas, which is what I am doing right now.

Agree.

If you have any further doubt, I will be more than glad to answer and discuss, but let's not be incisive with assumptions.

I didn't want to live with assumptions, I'm sorry if this can be interpreted. I've been looking at all the PR's that you referred to avoid the assumes.

@fgsch
Copy link

fgsch commented Apr 21, 2019

@zimmerle I'm familiar with the PCRE API (both 2 and 3), and I know there are limits at built time, but I think you are conflating several issues here.

Technically there is no reason to not expose these limits. I do agree that they might fall into the advance user category though, but they should not be hidden. Instead they should be documented thoroughly so they are understood.

We should also fix the regexen, no questions here. But to me these two things are orthogonal.

Any improvements on how this is done can be added at a later time, when the code is ready.
Waiting for the perfect solution doesn't really help anyone.

@s0md3v
Copy link

s0md3v commented Apr 21, 2019

Did you guys looked at this?
I shared a methodology to spot exploitable regular expressions.
Maybe you want to fix the existing patterns and then be careful when a new addition is made.

@fgsch
Copy link

fgsch commented Apr 21, 2019

Just to expand on my previous comment, and to answer @s0md3v, fixing CRS will only cover the rules we control. Anyone could write custom rules so the scope here is beyond the reported issues.
As such I strongly believe we should provide a mechanism to help all users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x workaround available The issue has either a temporary or permanent workaround available
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants