proposal on the rules #8
Comments
Hi Arnim, I appreciate the feedback and all the detailed suggestions. I usually don’t run these YARA rules on entire drives; I target directories of collected artifacts and logs and automatically ingest the JSON results into Splunk for analysis. I know it picks up a lot (even strings from Notepad...), but that broad coverage with extensive triage was my intention. That said, you're not the first to mention this, so I’ll set up a dedicated directory or another branch this week with all your modifications to make it more usable! Thanks again for the input 🙏 Best regards.
Update: after testing across full disks on multiple operating systems, I identified several bad keywords that shouldn’t be present. I'm implementing adjustments.
Thanks. You could easily create two different rulesets: one for hunting, with more false positives, and a stricter one that might be usable in e.g. yara-forge.
Alright, a new ruleset is available here: https://github.com/mthcht/ThreatHunting-Keywords-yara-rules/tree/main/yara_rules_binaries_strict @ruppde I would love to hear your feedback on this one.
Cool, lots better. You could still improve these 3 warnings at YARA startup:
rule_cobaltstrike_offensive_tool_keyword and rule_nmap_offensive_tool_keyword probably just contain too many strings. I've run the rule on a bunch of Cobalt Strike samples, and these are the only 4 strings found:
OK, that doesn't mean the rest are useless; they might be found in some BOF or whatever. Some strings are redundant, e.g. maybe just remove string19_DynastyPersist_offensive_tool_keyword if YARA is slowed down by the \s (it can't find any 4-byte atom for Aho-Corasick, see https://github.com/Neo23x0/YARA-Performance-Guidelines#1-compiling-the-rules).
Given the remaining false positives, it might be better to just remove the rules for such common Linux tools as whoami, dd and wireshark. binwalk is a tool that is mostly used by researchers, and no attacker would use it on a victim's machine. So unless someone really gets their hands on a real attacker machine, there will be many, many false positives before the rule is of any use.
There are also many regexes which could be normal strings, e.g.:
So if there's no reason to use a regex, maybe just generate:
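A minimal Python sketch of how a generator script might decide between a regex and a plain string. The helper names `regex_to_literal` and `yara_string_line` are hypothetical and not taken from create_yara_rules.py, and the sketch assumes each keyword is already available as a regex pattern. String modifiers are left out on purpose.

```python
def regex_to_literal(pattern):
    """Return the fixed text a regex matches, or None if it needs a real regex.

    Hypothetical helper: a pattern counts as a plain string when it contains
    only literal characters and escaped punctuation such as \\/ or \\.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            if i + 1 >= len(pattern):
                return None  # dangling backslash, leave it to the regex engine
            nxt = pattern[i + 1]
            if nxt.isalnum():
                return None  # \s, \d, \w, \x41 ... are classes or escapes
            out.append(nxt)  # escaped punctuation is just that character
            i += 2
        elif ch in ".^$*+?{}[]|()":
            return None  # unescaped metacharacter: a real regex is required
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def yara_string_line(identifier, pattern):
    """Build one line for a YARA strings section (hypothetical helper)."""
    literal = regex_to_literal(pattern)
    if literal is None:
        return f"${identifier} = /{pattern}/"
    # Escape backslashes and double quotes for a YARA text string.
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'${identifier} = "{escaped}"'


print(yara_string_line("string18", r"cat \/etc\/passwd"))
print(yara_string_line("string19", r"net\s+localgroup"))
```

Only the second pattern stays a regex here, because `\s+` cannot be expressed as fixed text.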
Thanks! I already planned to use regex only for patterns that require it (mainly those with wildcards in the middle). For the rest, I'll use plain strings since they're much faster! I'll remove binwalk as suggested. For the greyware tools, if you prefer not to detect them, you can use offensive_tools.yara, which only includes offensive tools. The all.yara file will contain everything: offensive tools, greyware, and additional signatures. Regarding the Linux-specific modifications, I'll first need to categorize the patterns that apply solely to Linux systems. It might take some time, but I'll get it done. As for the three problematic rules, I'll see what I can do! Thanks for pointing them out! Todo:
hi,
I did a test drive with your YARA rules, and while they find malware and nasty things, they just produce too many false positives to be usable. The ReactOS live CD has 144 hits, and the /usr/sbin of Debian has 125 hits. So scanning a complete hard drive of a system infected with maybe 3 malware files would produce something like 10,000 false positives. There's just no way to find those needles in the haystack.
proposals to bring that number down:
- In ThreatHunting-Keywords-yara-rules/_utils/create_yara_rules.py (line 110 in 1523732), change `any of them` to `2 of them`. For example, there is string18_cat_greyware_tool_keyword: `cat /etc/passwd`, which will also be in lots of legitimate scripts. But if there's also string31_net_greyware_tool_keyword: `net localgroup admin` in the same file, that's rather unusual.
- Add `filesize < 10MB` to avoid matching on huge legitimate files, which just contain many, many strings.
- (Linux ELF is `uint16(0) == 0x457f`, macOS `( uint32be(0) == 0x7f454c46 or uint16(0) == 0xfeca or uint16(0) == 0xfacf or uint32(0) == 0xbebafeca )`)
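The proposals above could be combined in the generator roughly like this. It is only a sketch: `build_condition`, `MAGIC_CHECKS` and their parameters are hypothetical names, not the actual code in create_yara_rules.py, and only the Linux header check quoted above is wired in. Falling back to `any of them` for single-string rules is also an assumption, made so that such rules can still match.

```python
# Header checks per target OS. Only the Linux ELF check from the issue text is
# included; other platforms could be added the same way (assumption).
MAGIC_CHECKS = {
    "linux": "uint16(0) == 0x457f",
}


def build_condition(string_count, min_matches=2, max_size="10MB", target_os=None):
    """Build a YARA condition string (hypothetical helper).

    Cheap checks (file header, file size) come first so the string-count
    check is only reached for files that pass them.
    """
    parts = []
    if target_os is not None:
        parts.append(MAGIC_CHECKS[target_os])
    if max_size:
        parts.append(f"filesize < {max_size}")
    # A rule with a single string can never satisfy "2 of them".
    needed = max(1, min(min_matches, string_count))
    parts.append("any of them" if needed == 1 else f"{needed} of them")
    return " and ".join(parts)


print(build_condition(string_count=40))
print(build_condition(string_count=40, target_os="linux"))
print(build_condition(string_count=1))
```

Raising `min_matches` trades recall for fewer false positives, so it may make sense to keep it configurable per ruleset (hunting versus strict).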
Some repos for testing:
malware:
https://github.com/Flangvik/SharpCollection
https://github.com/tennc/webshell
goodware:
ReactOS LiveCD: https://reactos.org/download/
any linux live DVD
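To compare hit counts between such malware and goodware sets, one option is to tally the scanner output per rule. The sketch below assumes the default output of the yara command-line tool (one `rule_name file_path` pair per line, with no extra flags that change the layout); `count_hits` and the sample paths are made up for illustration.

```python
from collections import Counter


def count_hits(yara_output):
    """Tally matches per rule and collect the matched files.

    Assumes one "rule_name file_path" pair per line (hypothetical helper).
    """
    per_rule = Counter()
    files = set()
    for line in yara_output.splitlines():
        line = line.strip()
        if not line:
            continue
        rule, _, path = line.partition(" ")
        if not path:
            continue  # not a "rule path" line, skip it
        per_rule[rule] += 1
        files.add(path)
    return per_rule, files


# Made-up sample output for a goodware directory.
sample = """\
rule_cat_greyware_tool_keyword /usr/sbin/example_a
rule_net_greyware_tool_keyword /usr/sbin/example_a
rule_cat_greyware_tool_keyword /usr/sbin/example_b
"""
per_rule, files = count_hits(sample)
print(per_rule.most_common())
print(len(files), "files with at least one hit")
```

Running the same tally on a goodware set before and after a rule change gives a quick view of which rules cause most of the false positives.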
Sorry, that's a bunch of work, but I think it's really needed to make this project usable.
best regards
arnim