Config to lower level of heuristics EXTRACT.28 and EXTRACT.13 #245

kam193 · 2024-08-12T10:23:23Z

Is your feature request related to a problem? Please describe.
When analysing Python packages, I often come across false positives from these two heuristics:

EXTRACT.28 (Extractable _RDATA section found)
EXTRACT.13 (Single Executable Inside Archive File)

Apparently, it's not uncommon for Python packages compiled for Windows to have executable parts in _RDATA. I also came across multiple DLLs triggering this heuristic (e.g. fe27c4c07c0cfbb2ee28c8409e5a8db89d86c6c2d76c6e3b79ab31979138b215).

The second isn't uncommon either, and it also seems to have trouble handling binary files from which an executable was extracted. I've noticed it's sometimes triggered when an exe is extracted from another exe, or when it manages to decompile code from a PYC file (but not always):

(__decompiled_source.py comes from my service; Extractor didn't see it)

Describe the solution you'd like

A config option to make EXTRACT.28 and EXTRACT.13 informative.
Improvements to EXTRACT.13 so that it is triggered only when an executable is extracted from an archive, not from another executable or from decompiled code.
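
To illustrate the first point, here is a minimal sketch of how such an option might behave. Everything in it is hypothetical: `INFORMATIVE_HEURISTICS`, `HeuristicHit` and `apply_informative_option` are illustrative names, not existing Extract service options or Assemblyline types.

```python
from dataclasses import dataclass, replace

# Hypothetical option: heuristic IDs an administrator wants reported without a score.
INFORMATIVE_HEURISTICS = frozenset({"EXTRACT.13", "EXTRACT.28"})


@dataclass(frozen=True)
class HeuristicHit:
    """Stand-in for a raised heuristic; not the real Assemblyline result type."""
    heur_id: str
    score: int


def apply_informative_option(hit, informative_ids=INFORMATIVE_HEURISTICS):
    """Keep the hit visible but zero its score when it is configured as informative."""
    if hit.heur_id in informative_ids:
        return replace(hit, score=0)
    return hit
```

With this sketch, a hit for EXTRACT.28 would still be reported but with a score of 0, while heuristics not listed in the option keep their score.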

Describe alternatives you've considered

An option to disable the heuristics entirely - but it would just remove information that could be useful.
Hardcoding a lower score for those heuristics - definitely not, it would break use cases other than mine.
Generic option to override the default/maximum score of a heuristic by an administrator - this could be interesting and allow adjusting an AL instance to the local use case, but requires much more work.

Additional context
Extract service already has multiple options for adjusting some heuristics, but not those.


gdesmar · 2024-11-07T19:31:06Z

This is another issue that I missed a while ago.
A couple months ago we modified the score of EXTRACT.28 to reduce it to 200 points instead of 300. That should stop it from causing a suspicious result section.
Regarding EXTRACT.13 (and possibly EXTRACT.16), I just updated the Extract service to not flag any Python script coming out of executables. I also added a new structure for false positive detection based on file extension, just in case our identification of Python scripts is not perfect... 😅
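
A rough sketch of the kind of check described above, not the actual Extract service code. The function name is hypothetical, and the type strings (`executable/...`, `code/python`) and the `.py` extension fallback are assumptions made for illustration.

```python
def is_expected_python_child(parent_type: str, child_type: str, child_name: str) -> bool:
    """Return True when an extracted file looks like a Python script taken from an executable.

    Hypothetical helper: a caller would skip the heuristic when this returns True.
    """
    # Only children of executables are considered here (assumed type prefix).
    if not parent_type.startswith("executable/"):
        return False
    # Primary signal: the file was identified as a Python script (assumed type string).
    if child_type == "code/python":
        return True
    # Fallback: rely on the file extension in case identification missed it.
    return child_name.lower().endswith(".py")
```

For example, a file named `__decompiled_source.py` extracted from an executable would be treated as expected even if it were identified as plain text, while an executable extracted from another executable would not.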
You should be able to test v4.5.0.stable48 and see if that solves your issues! Thanks for the report!

kam193 · 2024-11-08T18:24:47Z

Thank you! I think it should almost entirely solve my issue. In exchange, I've uploaded new misidentified files 😇 #284

kam193 added the assess (We still haven't decided if this will be worked on or not) and enhancement (New feature or request) labels Aug 12, 2024

cccs-rs assigned cccs-rs and gdesmar Aug 12, 2024

gdesmar added the service-extract and accepted (This issue was accepted, we will work on this at some point) labels and removed the assess label Nov 6, 2024

kam193 closed this as completed Nov 8, 2024
