diff --git a/README.md b/README.md index bd2784ae9a..2e24a53492 100644 --- a/README.md +++ b/README.md @@ -246,7 +246,7 @@ anything below 9 is not recommended. +--------------------------------+ | if you want to instrument only | -> use afl-gcc-fast and afl-gcc-fast++ | parts of the target | see [gcc_plugin/README.md](gcc_plugin/README.md) and - +--------------------------------+ [gcc_plugin/README.instrument_file.md](gcc_plugin/README.instrument_file.md) + +--------------------------------+ [gcc_plugin/README.instrument_list.md](gcc_plugin/README.instrument_list.md) | | if not, or if you do not have a gcc with plugin support | @@ -290,12 +290,18 @@ selectively only instrument parts of the target that you are interested in: create a file with all the filenames of the source code that should be instrumented. For afl-clang-lto and afl-gcc-fast - or afl-clang-fast if either the clang - version is < 7 or the CLASSIC instrumentation is used - just put one - filename per line, no directory information necessary, and set - `export AFL_LLVM_INSTRUMENT_FILE=yourfile.txt` - see [llvm_mode/README.instrument_file.md](llvm_mode/README.instrument_file.md) + version is below 7 or the CLASSIC instrumentation is used - just put one + filename or function per line (no directory information necessary for + filenames9, and either set `export AFL_LLVM_ALLOWLIST=allowlist.txt` **or** + `export AFL_LLVM_DENYLIST=denylist.txt` - depending on if you want per + default to instrument unless noted (DENYLIST) or not perform instrumentation + unless requested (ALLOWLIST). + **NOTE:** In optimization functions might be inlined and then not match! + see [llvm_mode/README.instrument_list.md](llvm_mode/README.instrument_list.md) For afl-clang-fast > 6.0 or if PCGUARD instrumentation is used then use the llvm sancov allow-list feature: [http://clang.llvm.org/docs/SanitizerCoverage.html](http://clang.llvm.org/docs/SanitizerCoverage.html) + The llvm sancov format works with the allowlist/denylist feature of afl++ + however afl++ is more flexible in the format. There are many more options and modes available however these are most of the time less effective. See: diff --git a/TODO.md b/TODO.md index 999cb9d3cd..e81b82a342 100644 --- a/TODO.md +++ b/TODO.md @@ -2,7 +2,6 @@ ## Roadmap 2.67+ - - expand on AFL_LLVM_INSTRUMENT_FILE to also support sancov allowlist format - AFL_MAP_SIZE for qemu_mode and unicorn_mode - CPU affinity for many cores? There seems to be an issue > 96 cores diff --git a/docs/Changelog.md b/docs/Changelog.md index ae7377f2eb..f98f8b9bbb 100644 --- a/docs/Changelog.md +++ b/docs/Changelog.md @@ -22,6 +22,10 @@ sending a mail to . - fixed a bug in redqueen for strings - llvm_mode: - now supports llvm 12! + - support for AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST (previous + AFL_LLVM_WHITELIST and AFL_LLVM_INSTRUMENT_FILE are deprecated and + are matched to AFL_LLVM_ALLOWLIST). The format is compatible to llvm + sancov, and also supports function matching! - fixes for laf-intel float splitting (thanks to mark-griffin for reporting) - LTO: autodictionary mode is a default diff --git a/docs/FAQ.md b/docs/FAQ.md index c15cd484b9..33ce49e65b 100644 --- a/docs/FAQ.md +++ b/docs/FAQ.md @@ -117,7 +117,7 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation! Identify which source code files contain the functions that you need to remove from instrumentation. - Simply follow this document on how to do this: [llvm_mode/README.instrument_file.md](llvm_mode/README.instrument_file.md) + Simply follow this document on how to do this: [llvm_mode/README.instrument_list.md](llvm_mode/README.instrument_list.md) If PCGUARD is used, then you need to follow this guide (needs llvm 12+!): [http://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation](http://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation) diff --git a/docs/env_variables.md b/docs/env_variables.md index 811c5658ea..f0ae0b6c41 100644 --- a/docs/env_variables.md +++ b/docs/env_variables.md @@ -202,14 +202,15 @@ Then there are a few specific features that are only available in llvm_mode: See llvm_mode/README.laf-intel.md for more information. -### INSTRUMENT_FILE +### INSTRUMENT LIST (selectively instrument files and functions) This feature allows selectively instrumentation of the source - - Setting AFL_LLVM_INSTRUMENT_FILE with a filename will only instrument those - files that match the names listed in this file. + - Setting AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST with a filenames and/or + function will only instrument (or skip) those files that match the names + listed in the specified file. - See llvm_mode/README.instrument_file.md for more information. + See llvm_mode/README.instrument_list.md for more information. ### NOT_ZERO @@ -241,7 +242,7 @@ Then there are a few specific features that are only available in the gcc_plugin - Setting AFL_GCC_INSTRUMENT_FILE with a filename will only instrument those files that match the names listed in this file (one filename per line). - See gcc_plugin/README.instrument_file.md for more information. + See gcc_plugin/README.instrument_list.md for more information. ## 3) Settings for afl-fuzz diff --git a/docs/perf_tips.md b/docs/perf_tips.md index 7a690b776b..731dc23884 100644 --- a/docs/perf_tips.md +++ b/docs/perf_tips.md @@ -67,7 +67,7 @@ to get to the important parts in the code. If you are only interested in specific parts of the code being fuzzed, you can instrument_files the files that are actually relevant. This improves the speed and -accuracy of afl. See llvm_mode/README.instrument_file.md +accuracy of afl. See llvm_mode/README.instrument_list.md Also use the InsTrim mode on larger binaries, this improves performance and coverage a lot. diff --git a/gcc_plugin/GNUmakefile b/gcc_plugin/GNUmakefile index 4a4f0dcd64..f10a6c1ded 100644 --- a/gcc_plugin/GNUmakefile +++ b/gcc_plugin/GNUmakefile @@ -163,7 +163,7 @@ install: all install -m 755 ../afl-gcc-fast $${DESTDIR}$(BIN_PATH) install -m 755 ../afl-gcc-pass.so ../afl-gcc-rt.o $${DESTDIR}$(HELPER_PATH) install -m 644 -T README.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.md - install -m 644 -T README.instrument_file.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md + install -m 644 -T README.instrument_list.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md clean: rm -f *.o *.so *~ a.out core core.[1-9][0-9]* test-instr .test-instr0 .test-instr1 .test2 diff --git a/gcc_plugin/Makefile b/gcc_plugin/Makefile index f720112f80..c088b61cd2 100644 --- a/gcc_plugin/Makefile +++ b/gcc_plugin/Makefile @@ -152,7 +152,7 @@ install: all install -m 755 ../afl-gcc-fast $${DESTDIR}$(BIN_PATH) install -m 755 ../afl-gcc-pass.so ../afl-gcc-rt.o $${DESTDIR}$(HELPER_PATH) install -m 644 -T README.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.md - install -m 644 -T README.instrument_file.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md + install -m 644 -T README.instrument_list.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md clean: rm -f *.o *.so *~ a.out core core.[1-9][0-9]* test-instr .test-instr0 .test-instr1 .test2 diff --git a/include/envs.h b/include/envs.h index 7153ed476a..96ae91ba50 100644 --- a/include/envs.h +++ b/include/envs.h @@ -62,6 +62,9 @@ static char *afl_environment_variables[] = { "AFL_REAL_LD", "AFL_LD_PRELOAD", "AFL_LD_VERBOSE", + "AFL_LLVM_ALLOWLIST", + "AFL_LLVM_DENYLIST", + "AFL_LLVM_BLOCKLIST", "AFL_LLVM_CMPLOG", "AFL_LLVM_INSTRIM", "AFL_LLVM_CTX", diff --git a/llvm_mode/README.instrument_file.md b/llvm_mode/README.instrument_file.md deleted file mode 100644 index 46e45ba25b..0000000000 --- a/llvm_mode/README.instrument_file.md +++ /dev/null @@ -1,81 +0,0 @@ -# Using afl++ with partial instrumentation - - This file describes how you can selectively instrument only the source files - that are interesting to you using the LLVM instrumentation provided by - afl++ - - Originally developed by Christian Holler (:decoder) . - -## 1) Description and purpose - -When building and testing complex programs where only a part of the program is -the fuzzing target, it often helps to only instrument the necessary parts of -the program, leaving the rest uninstrumented. This helps to focus the fuzzer -on the important parts of the program, avoiding undesired noise and -disturbance by uninteresting code being exercised. - -For this purpose, I have added a "partial instrumentation" support to the LLVM -mode of AFLFuzz that allows you to specify on a source file level which files -should be compiled with or without instrumentation. - -Note: When using PCGUARD mode - and have llvm 12+ - you can use this instead: -https://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation - -## 2) Building the LLVM module - -The new code is part of the existing afl++ LLVM module in the llvm_mode/ -subdirectory. There is nothing specifically to do :) - - -## 3) How to use the partial instrumentation mode - -In order to build with partial instrumentation, you need to build with -afl-clang-fast and afl-clang-fast++ respectively. The only required change is -that you need to set the environment variable AFL_LLVM_INSTRUMENT_FILE when calling -the compiler. - -The environment variable must point to a file containing all the filenames -that should be instrumented. For matching, the filename that is being compiled -must end in the filename entry contained in this the instrument file list (to avoid breaking -the matching when absolute paths are used during compilation). - -For example if your source tree looks like this: - -``` -project/ -project/feature_a/a1.cpp -project/feature_a/a2.cpp -project/feature_b/b1.cpp -project/feature_b/b2.cpp -``` - -and you only want to test feature_a, then create a the instrument file list file containing: - -``` -feature_a/a1.cpp -feature_a/a2.cpp -``` - -However if the instrument file list file contains only this, it works as well: - -``` -a1.cpp -a2.cpp -``` - -but it might lead to files being unwantedly instrumented if the same filename -exists somewhere else in the project directories. - -The created the instrument file list file is then set to AFL_LLVM_INSTRUMENT_FILE when you compile -your program. For each file that didn't match the the instrument file list, the compiler will -issue a warning at the end stating that no blocks were instrumented. If you -didn't intend to instrument that file, then you can safely ignore that warning. - -For old LLVM versions this feature might require to be compiled with debug -information (-g), however at least from llvm version 6.0 onwards this is not -required anymore (and might hurt performance and crash detection, so better not -use -g). - -## 4) UNIX-style filename pattern matching -You can add UNIX-style pattern matching in the the instrument file list entries. See `man -fnmatch` for the syntax. We do not set any of the `fnmatch` flags. diff --git a/llvm_mode/README.instrument_list.md b/llvm_mode/README.instrument_list.md new file mode 100644 index 0000000000..b0e0cc1e99 --- /dev/null +++ b/llvm_mode/README.instrument_list.md @@ -0,0 +1,86 @@ +# Using afl++ with partial instrumentation + + This file describes how you can selectively instrument only the source files + or functions that are interesting to you using the LLVM instrumentation + provided by afl++ + +## 1) Description and purpose + +When building and testing complex programs where only a part of the program is +the fuzzing target, it often helps to only instrument the necessary parts of +the program, leaving the rest uninstrumented. This helps to focus the fuzzer +on the important parts of the program, avoiding undesired noise and +disturbance by uninteresting code being exercised. + +For this purpose, a "partial instrumentation" support en par with llvm sancov +is provided by afl++ that allows you to specify on a source file and function +level which should be compiled with or without instrumentation. + +Note: When using PCGUARD mode - and have llvm 12+ - you can use this instead: +https://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation + +the llvm sancov list format is fully supported by afl++, however afl++ has +more flexbility. + +## 2) Building the LLVM module + +The new code is part of the existing afl++ LLVM module in the llvm_mode/ +subdirectory. There is nothing specifically to do :) + +## 3) How to use the partial instrumentation mode + +In order to build with partial instrumentation, you need to build with +afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++. +The only required change is that you need to set either the environment variable +AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST set with a filename. + +That file then contains the filenames or functions that should be instrumented +(AFL_LLVM_ALLOWLIST) or should specifically NOT instrumentd (AFL_LLVM_DENYLIST). + +For matching, the function/filename that is being compiled must end in the +function/filename entry contained in this the instrument file list (to avoid +breaking the matching when absolute paths are used during compilation). + +**NOTE:** In optimization functions might be inlined and then not match! + +For example if your source tree looks like this: +``` +project/ +project/feature_a/a1.cpp +project/feature_a/a2.cpp +project/feature_b/b1.cpp +project/feature_b/b2.cpp +``` + +and you only want to test feature_a, then create a the instrument file list file containing: +``` +feature_a/a1.cpp +feature_a/a2.cpp +``` + +However if the instrument file list file contains only this, it works as well: +``` +a1.cpp +a2.cpp +``` +but it might lead to files being unwantedly instrumented if the same filename +exists somewhere else in the project directories. + +You can also specify function names. Note that for C++ the function names +must be mangled to match! + +afl++ is intelligent to identify if an entry is a filename or a function. +However if you want to be sure (and compliant to the sancov allow/blocklist +format), you can file entries like this: +``` +src: *malloc.c +``` +and function entries like this: +``` +fun: MallocFoo +``` +Note that whitespace is ignored and comments (`# foo`) supported. + +## 4) UNIX-style pattern matching +You can add UNIX-style pattern matching in the the instrument file list entries. +See `man fnmatch` for the syntax. We do not set any of the `fnmatch` flags. diff --git a/llvm_mode/README.lto.md b/llvm_mode/README.lto.md index e521ac8298..4d643324f6 100644 --- a/llvm_mode/README.lto.md +++ b/llvm_mode/README.lto.md @@ -108,15 +108,12 @@ make install Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc. -Also the instrument file listing (AFL_LLVM_INSTRUMENT_FILE -> [README.instrument_file.md](README.instrument_file.md)) and +Also the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST -> [README.instrument_list.md](README.instrument_list.md)) and laf-intel/compcov (AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work. -InsTrim (control flow graph instrumentation) is supported and recommended! - (set `AFL_LLVM_INSTRUMENT=CFG`) Example: ``` CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure -export AFL_LLVM_INSTRUMENT=CFG make ``` diff --git a/llvm_mode/README.md b/llvm_mode/README.md index 22088dfd5f..f23d7150f7 100644 --- a/llvm_mode/README.md +++ b/llvm_mode/README.md @@ -109,7 +109,7 @@ Several options are present to make llvm_mode faster or help it rearrange the code to make afl-fuzz path discovery easier. If you need just to instrument specific parts of the code, you can the instrument file list -which C/C++ files to actually instrument. See [README.instrument_file](README.instrument_file.md) +which C/C++ files to actually instrument. See [README.instrument_list](README.instrument_list.md) For splitting memcmp, strncmp, etc. please see [README.laf-intel](README.laf-intel.md) diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c index ef99e3f324..f75adf1ede 100644 --- a/llvm_mode/afl-clang-fast.c +++ b/llvm_mode/afl-clang-fast.c @@ -229,7 +229,8 @@ static void edit_params(u32 argc, char **argv, char **envp) { if (lto_mode) { if (getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL || - getenv("AFL_LLVM_WHITELIST")) { + getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") || + getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")) { cc_params[cc_par_cnt++] = "-Xclang"; cc_params[cc_par_cnt++] = "-load"; @@ -637,9 +638,13 @@ int main(int argc, char **argv, char **envp) { } - if ((getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST")) && + if ((getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL || + getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") || + getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")) && getenv("AFL_DONT_OPTIMIZE")) - FATAL("AFL_LLVM_INSTRUMENT_FILE and AFL_DONT_OPTIMIZE cannot be combined"); + WARNF( + "AFL_LLVM_ALLOWLIST/DENYLIST and AFL_DONT_OPTIMIZE cannot be combined " + "for file matching, only function matching!"); if (getenv("AFL_LLVM_INSTRIM") || getenv("INSTRIM") || getenv("INSTRIM_LIB")) { @@ -787,15 +792,17 @@ int main(int argc, char **argv, char **envp) { #if LLVM_VERSION_MAJOR <= 6 instrument_mode = INSTRUMENT_AFL; #else - if (getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST")) { + if (getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL || + getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") || + getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")) { instrument_mode = INSTRUMENT_AFL; WARNF( "switching to classic instrumentation because " - "AFL_LLVM_INSTRUMENT_FILE does not work with PCGUARD. Use " - "-fsanitize-coverage-allowlist=allowlist.txt if you want to use " - "PCGUARD. Requires llvm 12+. See " - "https://clang.llvm.org/docs/" + "AFL_LLVM_ALLOWLIST/DENYLIST does not work with PCGUARD. Use " + "-fsanitize-coverage-allowlist=allowlist.txt or " + "-fsanitize-coverage-blocklist=denylist.txt if you want to use " + "PCGUARD. Requires llvm 12+. See https://clang.llvm.org/docs/ " "SanitizerCoverage.html#partially-disabling-instrumentation"); } else @@ -846,11 +853,14 @@ int main(int argc, char **argv, char **envp) { "together"); if (instrument_mode == INSTRUMENT_PCGUARD && - (getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST"))) + (getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL || + getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") || + getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST"))) FATAL( "Instrumentation type PCGUARD does not support " - "AFL_LLVM_INSTRUMENT_FILE! Use " - "-fsanitize-coverage-allowlist=allowlist.txt instead (requires llvm " + "AFL_LLVM_ALLOWLIST/DENYLIST! Use " + "-fsanitize-coverage-allowlist=allowlist.txt or " + "-fsanitize-coverage-blocklist=denylist.txt instead (requires llvm " "12+), see " "https://clang.llvm.org/docs/" "SanitizerCoverage.html#partially-disabling-instrumentation"); diff --git a/llvm_mode/afl-llvm-common.cc b/llvm_mode/afl-llvm-common.cc index 9a884ded22..0b89c3b4da 100644 --- a/llvm_mode/afl-llvm-common.cc +++ b/llvm_mode/afl-llvm-common.cc @@ -20,7 +20,10 @@ using namespace llvm; -static std::list myInstrumentList; +static std::list allowListFiles; +static std::list allowListFunctions; +static std::list denyListFiles; +static std::list denyListFunctions; char *getBBName(const llvm::BasicBlock *BB) { @@ -87,30 +90,166 @@ bool isIgnoreFunction(const llvm::Function *F) { void initInstrumentList() { - char *instrumentListFilename = getenv("AFL_LLVM_INSTRUMENT_FILE"); - if (!instrumentListFilename) - instrumentListFilename = getenv("AFL_LLVM_WHITELIST"); + char *allowlist = getenv("AFL_LLVM_ALLOWLIST"); + if (!allowlist) allowlist = getenv("AFL_LLVM_INSTRUMENT_FILE"); + if (!allowlist) allowlist = getenv("AFL_LLVM_WHITELIST"); + char *denylist = getenv("AFL_LLVM_DENYLIST"); + if (!denylist) denylist = getenv("AFL_LLVM_BLOCKLIST"); - if (instrumentListFilename) { + if (allowlist && denylist) + FATAL( + "You can only specify either AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST " + "but not both!"); + + if (allowlist) { std::string line; std::ifstream fileStream; - fileStream.open(instrumentListFilename); - if (!fileStream) - report_fatal_error("Unable to open AFL_LLVM_INSTRUMENT_FILE"); + fileStream.open(allowlist); + if (!fileStream) report_fatal_error("Unable to open AFL_LLVM_ALLOWLIST"); getline(fileStream, line); + while (fileStream) { - myInstrumentList.push_back(line); - getline(fileStream, line); + int is_file = -1; + std::size_t npos; + std::string original_line = line; + + line.erase(std::remove_if(line.begin(), line.end(), ::isspace), + line.end()); + + // remove # and following + if ((npos = line.find("#")) != std::string::npos) + line = line.substr(0, npos); + + if (line.compare(0, 4, "fun:") == 0) { + + is_file = 0; + line = line.substr(4); + + } else if (line.compare(0, 9, "function:") == 0) { + + is_file = 0; + line = line.substr(9); + + } else if (line.compare(0, 4, "src:") == 0) { + + is_file = 1; + line = line.substr(4); + + } else if (line.compare(0, 7, "source:") == 0) { + + is_file = 1; + line = line.substr(7); + + } + + if (line.find(":") != std::string::npos) { + + FATAL("invalid line in AFL_LLVM_ALLOWLIST: %s", original_line.c_str()); + + } + + if (line.length() > 0) { + + // if the entry contains / or . it must be a file + if (is_file == -1) + if (line.find("/") != std::string::npos || + line.find(".") != std::string::npos) + is_file = 1; + // otherwise it is a function + + if (is_file == 1) + allowListFiles.push_back(line); + else + allowListFunctions.push_back(line); + getline(fileStream, line); + + } } + if (debug) + SAYF(cMGN "[D] " cRST + "loaded allowlist with %zu file and %zu function entries\n", + allowListFiles.size(), allowListFunctions.size()); + } - if (debug) - SAYF(cMGN "[D] " cRST "loaded instrument list with %zu entries\n", - myInstrumentList.size()); + if (denylist) { + + std::string line; + std::ifstream fileStream; + fileStream.open(denylist); + if (!fileStream) report_fatal_error("Unable to open AFL_LLVM_DENYLIST"); + getline(fileStream, line); + + while (fileStream) { + + int is_file = -1; + std::size_t npos; + std::string original_line = line; + + line.erase(std::remove_if(line.begin(), line.end(), ::isspace), + line.end()); + + // remove # and following + if ((npos = line.find("#")) != std::string::npos) + line = line.substr(0, npos); + + if (line.compare(0, 4, "fun:") == 0) { + + is_file = 0; + line = line.substr(4); + + } else if (line.compare(0, 9, "function:") == 0) { + + is_file = 0; + line = line.substr(9); + + } else if (line.compare(0, 4, "src:") == 0) { + + is_file = 1; + line = line.substr(4); + + } else if (line.compare(0, 7, "source:") == 0) { + + is_file = 1; + line = line.substr(7); + + } + + if (line.find(":") != std::string::npos) { + + FATAL("invalid line in AFL_LLVM_DENYLIST: %s", original_line.c_str()); + + } + + if (line.length() > 0) { + + // if the entry contains / or . it must be a file + if (is_file == -1) + if (line.find("/") != std::string::npos || + line.find(".") != std::string::npos) + is_file = 1; + // otherwise it is a function + + if (is_file == 1) + denyListFiles.push_back(line); + else + denyListFunctions.push_back(line); + getline(fileStream, line); + + } + + } + + if (debug) + SAYF(cMGN "[D] " cRST + "loaded denylist with %zu file and %zu function entries\n", + denyListFiles.size(), denyListFunctions.size()); + + } } @@ -121,42 +260,173 @@ bool isInInstrumentList(llvm::Function *F) { if (!F->size() || isIgnoreFunction(F)) return false; // if we do not have a the instrument file list return true - if (myInstrumentList.empty()) return true; + if (!allowListFiles.empty() || !allowListFunctions.empty()) { + + if (!allowListFunctions.empty()) { + + std::string instFunction = F->getName().str(); + + for (std::list::iterator it = allowListFunctions.begin(); + it != allowListFunctions.end(); ++it) { + + /* We don't check for filename equality here because + * filenames might actually be full paths. Instead we + * check that the actual filename ends in the filename + * specified in the list. We also allow UNIX-style pattern + * matching */ - // let's try to get the filename for the function - auto bb = &F->getEntryBlock(); - BasicBlock::iterator IP = bb->getFirstInsertionPt(); - IRBuilder<> IRB(&(*IP)); - DebugLoc Loc = IP->getDebugLoc(); + if (instFunction.length() >= it->length()) { + + if (fnmatch(("*" + *it).c_str(), instFunction.c_str(), 0) == 0) { + + if (debug) + SAYF(cMGN "[D] " cRST + "Function %s is in the allow function list, " + "instrumenting ... \n", + instFunction.c_str()); + return true; + + } + + } + + } + + } + + if (!allowListFiles.empty()) { + + // let's try to get the filename for the function + auto bb = &F->getEntryBlock(); + BasicBlock::iterator IP = bb->getFirstInsertionPt(); + IRBuilder<> IRB(&(*IP)); + DebugLoc Loc = IP->getDebugLoc(); #if LLVM_VERSION_MAJOR >= 4 || \ (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 7) - if (Loc) { + if (Loc) { + + DILocation *cDILoc = dyn_cast(Loc.getAsMDNode()); + + unsigned int instLine = cDILoc->getLine(); + StringRef instFilename = cDILoc->getFilename(); - DILocation *cDILoc = dyn_cast(Loc.getAsMDNode()); + if (instFilename.str().empty()) { - unsigned int instLine = cDILoc->getLine(); - StringRef instFilename = cDILoc->getFilename(); + /* If the original location is empty, try using the inlined location + */ + DILocation *oDILoc = cDILoc->getInlinedAt(); + if (oDILoc) { + + instFilename = oDILoc->getFilename(); + instLine = oDILoc->getLine(); + + } + + } - if (instFilename.str().empty()) { + /* Continue only if we know where we actually are */ + if (!instFilename.str().empty()) { - /* If the original location is empty, try using the inlined location - */ - DILocation *oDILoc = cDILoc->getInlinedAt(); - if (oDILoc) { + for (std::list::iterator it = allowListFiles.begin(); + it != allowListFiles.end(); ++it) { - instFilename = oDILoc->getFilename(); - instLine = oDILoc->getLine(); + /* We don't check for filename equality here because + * filenames might actually be full paths. Instead we + * check that the actual filename ends in the filename + * specified in the list. We also allow UNIX-style pattern + * matching */ + + if (instFilename.str().length() >= it->length()) { + + if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == + 0) { + + if (debug) + SAYF(cMGN "[D] " cRST + "Function %s is in the allowlist (%s), " + "instrumenting ... \n", + F->getName().str().c_str(), instFilename.str().c_str()); + return true; + + } + + } + + } + + } + + } + + } + +#else + if (!Loc.isUnknown()) { + + DILocation cDILoc(Loc.getAsMDNode(F->getContext())); + + unsigned int instLine = cDILoc.getLineNumber(); + StringRef instFilename = cDILoc.getFilename(); + + (void)instLine; + /* Continue only if we know where we actually are */ + if (!instFilename.str().empty()) { + + for (std::list::iterator it = allowListFiles.begin(); + it != allowListFiles.end(); ++it) { + + /* We don't check for filename equality here because + * filenames might actually be full paths. Instead we + * check that the actual filename ends in the filename + * specified in the list. We also allow UNIX-style pattern + * matching */ + + if (instFilename.str().length() >= it->length()) { + + if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == + 0) { + + return true; + + } + + } + + } + + } } } - /* Continue only if we know where we actually are */ - if (!instFilename.str().empty()) { +#endif + else { + + // we could not find out the location. in this case we say it is not + // in the the instrument file list + if (!be_quiet) + WARNF( + "No debug information found for function %s, will not be " + "instrumented (recompile with -g -O[1-3]).", + F->getName().str().c_str()); + return false; + + } + + return false; + + } + + if (!denyListFiles.empty() || !denyListFunctions.empty()) { + + if (!denyListFunctions.empty()) { - for (std::list::iterator it = myInstrumentList.begin(); - it != myInstrumentList.end(); ++it) { + std::string instFunction = F->getName().str(); + + for (std::list::iterator it = denyListFunctions.begin(); + it != denyListFunctions.end(); ++it) { /* We don't check for filename equality here because * filenames might actually be full paths. Instead we @@ -164,16 +434,16 @@ bool isInInstrumentList(llvm::Function *F) { * specified in the list. We also allow UNIX-style pattern * matching */ - if (instFilename.str().length() >= it->length()) { + if (instFunction.length() >= it->length()) { - if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == - 0) { + if (fnmatch(("*" + *it).c_str(), instFunction.c_str(), 0) == 0) { if (debug) SAYF(cMGN "[D] " cRST - "Function %s is in the list (%s), instrumenting ... \n", - F->getName().str().c_str(), instFilename.str().c_str()); - return true; + "Function %s is in the deny function list, " + "not instrumenting ... \n", + instFunction.c_str()); + return false; } @@ -183,35 +453,64 @@ bool isInInstrumentList(llvm::Function *F) { } - } + if (!denyListFiles.empty()) { -#else - if (!Loc.isUnknown()) { + // let's try to get the filename for the function + auto bb = &F->getEntryBlock(); + BasicBlock::iterator IP = bb->getFirstInsertionPt(); + IRBuilder<> IRB(&(*IP)); + DebugLoc Loc = IP->getDebugLoc(); - DILocation cDILoc(Loc.getAsMDNode(F->getContext())); +#if LLVM_VERSION_MAJOR >= 4 || \ + (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 7) + if (Loc) { - unsigned int instLine = cDILoc.getLineNumber(); - StringRef instFilename = cDILoc.getFilename(); + DILocation *cDILoc = dyn_cast(Loc.getAsMDNode()); - (void)instLine; - /* Continue only if we know where we actually are */ - if (!instFilename.str().empty()) { + unsigned int instLine = cDILoc->getLine(); + StringRef instFilename = cDILoc->getFilename(); - for (std::list::iterator it = myInstrumentList.begin(); - it != myInstrumentList.end(); ++it) { + if (instFilename.str().empty()) { - /* We don't check for filename equality here because - * filenames might actually be full paths. Instead we - * check that the actual filename ends in the filename - * specified in the list. We also allow UNIX-style pattern - * matching */ + /* If the original location is empty, try using the inlined location + */ + DILocation *oDILoc = cDILoc->getInlinedAt(); + if (oDILoc) { - if (instFilename.str().length() >= it->length()) { + instFilename = oDILoc->getFilename(); + instLine = oDILoc->getLine(); - if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == - 0) { + } - return true; + } + + /* Continue only if we know where we actually are */ + if (!instFilename.str().empty()) { + + for (std::list::iterator it = denyListFiles.begin(); + it != denyListFiles.end(); ++it) { + + /* We don't check for filename equality here because + * filenames might actually be full paths. Instead we + * check that the actual filename ends in the filename + * specified in the list. We also allow UNIX-style pattern + * matching */ + + if (instFilename.str().length() >= it->length()) { + + if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == + 0) { + + if (debug) + SAYF(cMGN "[D] " cRST + "Function %s is in the denylist (%s), not " + "instrumenting ... \n", + F->getName().str().c_str(), instFilename.str().c_str()); + return false; + + } + + } } @@ -221,23 +520,65 @@ bool isInInstrumentList(llvm::Function *F) { } - } +#else + if (!Loc.isUnknown()) { + + DILocation cDILoc(Loc.getAsMDNode(F->getContext())); + + unsigned int instLine = cDILoc.getLineNumber(); + StringRef instFilename = cDILoc.getFilename(); + + (void)instLine; + /* Continue only if we know where we actually are */ + if (!instFilename.str().empty()) { + + for (std::list::iterator it = denyListFiles.begin(); + it != denyListFiles.end(); ++it) { + + /* We don't check for filename equality here because + * filenames might actually be full paths. Instead we + * check that the actual filename ends in the filename + * specified in the list. We also allow UNIX-style pattern + * matching */ + + if (instFilename.str().length() >= it->length()) { + + if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == + 0) { + + return false; + + } + + } + + } + + } + + } + + } #endif - else { - - // we could not find out the location. in this case we say it is not - // in the the instrument file list - if (!be_quiet) - WARNF( - "No debug information found for function %s, will not be " - "instrumented (recompile with -g -O[1-3]).", - F->getName().str().c_str()); - return false; + else { + + // we could not find out the location. in this case we say it is not + // in the the instrument file list + if (!be_quiet) + WARNF( + "No debug information found for function %s, will be " + "instrumented (recompile with -g -O[1-3]).", + F->getName().str().c_str()); + return true; + + } + + return true; } - return false; + return true; // not reached } diff --git a/llvm_mode/afl-llvm-lto-instrumentlist.so.cc b/llvm_mode/afl-llvm-lto-instrumentlist.so.cc index ab7c0c580a..a7331444da 100644 --- a/llvm_mode/afl-llvm-lto-instrumentlist.so.cc +++ b/llvm_mode/afl-llvm-lto-instrumentlist.so.cc @@ -59,39 +59,9 @@ class AFLcheckIfInstrument : public ModulePass { static char ID; AFLcheckIfInstrument() : ModulePass(ID) { - int entries = 0; - if (getenv("AFL_DEBUG")) debug = 1; - char *instrumentListFilename = getenv("AFL_LLVM_INSTRUMENT_FILE"); - if (!instrumentListFilename) - instrumentListFilename = getenv("AFL_LLVM_WHITELIST"); - if (instrumentListFilename) { - - std::string line; - std::ifstream fileStream; - fileStream.open(instrumentListFilename); - if (!fileStream) - report_fatal_error("Unable to open AFL_LLVM_INSTRUMENT_FILE"); - getline(fileStream, line); - while (fileStream) { - - myInstrumentList.push_back(line); - getline(fileStream, line); - entries++; - - } - - } else - - PFATAL( - "afl-llvm-lto-instrumentlist.so loaded without " - "AFL_LLVM_INSTRUMENT_FILE?!"); - - if (debug) - SAYF(cMGN "[D] " cRST - "loaded the instrument file list %s with %d entries\n", - instrumentListFilename, entries); + initInstrumentList(); } @@ -129,120 +99,28 @@ bool AFLcheckIfInstrument::runOnModule(Module &M) { for (auto &F : M) { if (F.size() < 1) continue; - // fprintf(stderr, "F:%s\n", F.getName().str().c_str()); - if (isIgnoreFunction(&F)) continue; - - BasicBlock::iterator IP = F.getEntryBlock().getFirstInsertionPt(); - IRBuilder<> IRB(&(*IP)); - - if (!myInstrumentList.empty()) { - - bool instrumentFunction = false; - - /* Get the current location using debug information. - * For now, just instrument the block if we are not able - * to determine our location. */ - DebugLoc Loc = IP->getDebugLoc(); - if (Loc) { - - DILocation *cDILoc = dyn_cast(Loc.getAsMDNode()); - - unsigned int instLine = cDILoc->getLine(); - StringRef instFilename = cDILoc->getFilename(); - - if (instFilename.str().empty()) { - - /* If the original location is empty, try using the inlined location - */ - DILocation *oDILoc = cDILoc->getInlinedAt(); - if (oDILoc) { - - instFilename = oDILoc->getFilename(); - instLine = oDILoc->getLine(); - - } - - if (instFilename.str().empty()) { - if (!be_quiet) - WARNF( - "Function %s has no source file name information and will " - "not be instrumented.", - F.getName().str().c_str()); - continue; - - } - - } - - //(void)instLine; - - fprintf(stderr, "xxx %s %s\n", F.getName().str().c_str(), - instFilename.str().c_str()); - if (debug) - SAYF(cMGN "[D] " cRST "function %s is in file %s\n", - F.getName().str().c_str(), instFilename.str().c_str()); - - for (std::list::iterator it = myInstrumentList.begin(); - it != myInstrumentList.end(); ++it) { - - /* We don't check for filename equality here because - * filenames might actually be full paths. Instead we - * check that the actual filename ends in the filename - * specified in the list. */ - if (instFilename.str().length() >= it->length()) { - - if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) == - 0) { - - instrumentFunction = true; - break; - - } - - } - - } - - } else { - - if (!be_quiet) - WARNF( - "No debug information found for function %s, recompile with -g " - "-O[1-3]", - F.getName().str().c_str()); - continue; - - } - - /* Either we couldn't figure out our location or the location is - * not the instrument file listed, so we skip instrumentation. - * We do this by renaming the function. */ - if (instrumentFunction == true) { - - if (debug) - SAYF(cMGN "[D] " cRST "function %s is in the instrument file list\n", - F.getName().str().c_str()); - - } else { - - if (debug) - SAYF(cMGN "[D] " cRST - "function %s is NOT in the instrument file list\n", - F.getName().str().c_str()); + // fprintf(stderr, "F:%s\n", F.getName().str().c_str()); - auto & Ctx = F.getContext(); - AttributeList Attrs = F.getAttributes(); - AttrBuilder NewAttrs; - NewAttrs.addAttribute("skipinstrument"); - F.setAttributes( - Attrs.addAttributes(Ctx, AttributeList::FunctionIndex, NewAttrs)); + if (isInInstrumentList(&F)) { - } + if (debug) + SAYF(cMGN "[D] " cRST "function %s is in the instrument file list\n", + F.getName().str().c_str()); } else { - PFATAL("InstrumentList is empty"); + if (debug) + SAYF(cMGN "[D] " cRST + "function %s is NOT in the instrument file list\n", + F.getName().str().c_str()); + + auto & Ctx = F.getContext(); + AttributeList Attrs = F.getAttributes(); + AttrBuilder NewAttrs; + NewAttrs.addAttribute("skipinstrument"); + F.setAttributes( + Attrs.addAttributes(Ctx, AttributeList::FunctionIndex, NewAttrs)); }