Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Several heap-BoF crashes from pcre2_compile related to parsed_pattern #561

Closed
stevenagy opened this issue Nov 13, 2024 · 18 comments
Closed

Comments

@stevenagy
Copy link

We found several crashing files (attached) through fuzzing the following program:

#include <stdio.h>
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>

int main(int argc, char *argv[])
{
    FILE *f = fopen(argv[1], "rb");
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);
    char *fuzzData = (char*)malloc((size_t)size+1);
    fread(fuzzData, (size_t)size, 1, f);
    fuzzData[size] = '\0';

	int v0 = 1;
	size_t v1 = 1;
	pcre2_code_8* pcre2_compile_8val1 = pcre2_compile_8((PCRE2_SPTR8)fuzzData, size, 0, &v0, &v1, NULL);
   	return 0;
}

All crashes are heap-buffer-overflows stemming from the following source lines:

  • pcre2_compile.c:4014
  • pcre2_compile.c:4030
  • pcre2_compile.c:4304
  • pcre2_compile.c:4306
  • pcre2_compile.c:4307

These all appear to involve the parsed_pattern object.

We were able to reproduce the crashes using the PCRE2 OSS-Fuzz harness here after modifying it to read data from file: https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2_fuzzsupport.c.

Any feedback on these crashes would be greatly appreciated. :)
crashes.zip

@carenas
Copy link
Contributor

carenas commented Nov 13, 2024

Any feedback on these crashes would be greatly appreciated.

just to be clear (and without looking).

are all those crashes and their line numbers associated with a specific revision of the code?, how about the latest released version?

@stevenagy
Copy link
Author

We originally found them on version a67878318cd08eb92e7f3afa2a15b55d46d285e2.

@stevenagy stevenagy reopened this Nov 13, 2024
@stevenagy
Copy link
Author

(sorry, accidentally hit close)

I just tested on the latest version and they appear there too. Happy to provide any other info on request.

@carenas
Copy link
Contributor

carenas commented Nov 13, 2024

crashes on unreleased code are to be expected (indeed I know at least of another one I had yet to fix), it would be very important to know though if you can crash a released version of the code.

@NWilson
Copy link
Member

NWilson commented Nov 13, 2024

I have looked into this, because the code mentioned by @stevenagy is in the area I've been working on recently.

I reproduce with:

  • a67878318cd08eb92e7f3afa2a15b55d46d285e2 (reported by @stevenagy), which is today's master
  • f1c2406adf628abd402cda7113cfa4c028e7974e (my current Perl extended classes branch)

It does not reproduce with 6ae58beca071f13ccfed31d03b3f479ab520639b (PCRE2 10.44 latest release). What a relief!

@stevenagy
Copy link
Author

Interesting! Thanks for confirming. Does a crash on the latest master suggest this may be a regression? We're admittedly unfamiliar with PCRE2's policies on how the mainline repo differs from official release versions. :)

In our fuzzing campaigns we by default just pulled libraries' latest master versions. I'd be happy to do a re-test with PCRE2's most recent official release if you would find that more helpful!

@NWilson
Copy link
Member

NWilson commented Nov 13, 2024

The policy is (I believe) that unreleased code, on master branch, is a work-in-progress. We make no guarantees, we do not recommend it to be used in production, and we do not publish security advisories if there are issues found on master.

The released versions however are in production use, and any issues there would be critical.

I'm not sure which is preferable for external contributors to fuzz - perhaps both master and latest release? If one or the other, then I suppose master is more useful to us. The released version has already been quite extensively fuzzed, and further effort there is likely to be wasted compute (although there's a small probability a new issue could be uncovered).

@NWilson
Copy link
Member

NWilson commented Nov 13, 2024

@PhilipHazel There appears to be an issue in the new (unreleased) scan_substring feature. It writes out more units to uint32_t* parsed_pattern than it consumes from ptr, which violates the assumption that the writes to parsed_pattern are strictly bounded by the input size.

@carenas
Copy link
Contributor

carenas commented Nov 13, 2024

scan_substring feature

@zherczeg: ping just in case you missed this somehow and are not in the middle of pushing a fix.

@zherczeg
Copy link
Collaborator

Thank you!

It writes 3 numbers into the stream:
https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2_compile.c#L4304

The i value should be moved into the meta code.

@NWilson
Copy link
Member

NWilson commented Nov 13, 2024

But PUTOFFSET write 2×uint32_t, so currently the code is writing 4 units. The worst-case pattern is "(scs:(1,1,1,1,1,1,1,...) )". You can only write 2× uint32_t in total, so the offset won't fit, at all. Problem.

The same issue appears in the read_name() branch, but it's fixable there. For '','','','',.... you get 3× uint32_t for each item, and the namelen is always less than MAX_NAME_SIZE, so it fits in the META. You'd get META_SCS_NEXT_NAME | namelen followed by the offset, for a total of 3×uint32_t.

@zherczeg
Copy link
Collaborator

True. Btw, names cannot be empty string. Anyway a relative offset could help. If the next item is < 65535 characters away from the previous item, we encode a relative diff. Otherwise we have a huge buffer area.

@NWilson
Copy link
Member

NWilson commented Nov 13, 2024

I assume you're working on a fix (or will do). I won't make a PR - but if you're busy I'd be happy to.

@carenas
Copy link
Contributor

carenas commented Nov 13, 2024

AFAIK we are still far from an RC, so we should have plenty of time to get a fix out (we are all mostly volunteers after all).

@NWilson: if you want to work on some fixes from your own code instead there is #562, which is also pending and was mentioned above.

@zherczeg
Copy link
Collaborator

I have already put something together. We should focus on landing the open PRs. as well.

@zherczeg
Copy link
Collaborator

#563

@NWilson
Copy link
Member

NWilson commented Nov 15, 2024

After Zoltan's PR, I can do a teeny extra PR to add an assertion. This bug slipped in, because we check that parsed_pattern < end, but our unit tests will never hit that. We actually want an additional ifdef PCRE2_DEBUG block which checks that parsed_pattern - parsed_pattern_last <= ptr - ptr_last. We don't want to risk making this mistake in future.

@NWilson
Copy link
Member

NWilson commented Nov 20, 2024

This can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants