Part 6: Exercise – Fuzz a Different Parser
In this exercise I suggest you target mime_disposition_new from mailscanner and go through the full process:
reverse-engineer it, fill in the harness skeleton, add some starter seeds, fuzz it, and analyze the results.
You’re welcome to try a different function, but this one seems similar to the one we already fuzzed so it shouldn’t be as difficult to get started with.
Background
mime_disposition_new parses Content-Disposition headers such as:
attachment; filename="document.pdf"
inline; filename="image.png"
Step 1: Find the symbols
Before you begin, you should be in /home/cisco/guides/day-2/exercise and you should run ./setup_exercise.sh.
nm -D mailscanner.so | grep mime_disposition
Note down the parse (*_new) and destroy symbol names.
Step 2: Figure out the function signature
Read the disassembly of the function:
r2 -e asm.bytes=0 -a x86 -b 32 -qc 'aaa; s sym.mime_disposition_new; pdf' mailscanner.so
It’s also worth taking a look in Ghidra at the pseudocode/disassembly for a clearer picture.
How many arguments? What types? How does it compare to the functions we’ve already looked at?
Also confirm there’s a _destroy function you’ll need to call after each iteration.
Step 3: Fill in the harness
A fuzzer skeleton is at harness_exercise.c with four TODOs:
- Function pointer type definitions (argument types)
- dlsym symbol names
- Call the parse function with the right arguments
- Call destroy on the result
Refer to ../harness/harness_ct.c for the pattern.
Step 4: Create seeds
Create some seeds in ./corpus_exercise/. Have at least one named attachment.txt that uses the attachment syntax.
If you really don’t feel like making a realistic seed, just echo whatever > corpus_exercise/attachment.txt
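A minimal sketch of seed creation, reusing the two example values from the Background section. It assumes the parser takes only the header value (no leading Content-Disposition: prefix), which you should confirm against the disassembly. The inline.txt and bare.txt names are illustrative extras, not something the exercise requires.

```shell
# Create the corpus directory if setup did not already do so.
mkdir -p corpus_exercise

# Required seed: the attachment syntax from the Background section.
printf '%s' 'attachment; filename="document.pdf"' > corpus_exercise/attachment.txt

# Optional extra seeds (hypothetical file names): the inline form and a bare type.
printf '%s' 'inline; filename="image.png"' > corpus_exercise/inline.txt
printf '%s' 'attachment' > corpus_exercise/bare.txt
```

Small, distinct seeds like these tend to give the fuzzer a cleaner starting point than one large file.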
Step 5: Create a dictionary
As with the dictionary step earlier in this guide, you can probably look up some examples of Content-Disposition headers online and get some ideas of tokens that would be good to include.
If you found some, add each one like:
"some token here"
"another\x0d\x0atoken here"
"last token"
to the dict/disposition.dict file.
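As a starting point, the sketch below writes a small dictionary in the quoted-token format shown above. The tokens are taken from the Content-Disposition syntax in general (disposition types and the filename parameters). Whether this particular parser handles each of them is an assumption, so treat the list as a guess to refine after reading the disassembly.

```shell
# Create the dictionary directory if it does not exist yet.
mkdir -p dict

# One quoted token per line; \xNN escapes are decoded by AFL, not by the shell,
# so the heredoc delimiter is quoted to keep the backslashes intact.
cat > dict/disposition.dict <<'EOF'
"attachment"
"inline"
"form-data"
"filename="
"filename*="
"\x0d\x0a"
EOF
```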
Step 6: Compile and test
./setup_exercise.sh # You should have already run this
clang -m32 -g -O2 -o harness_exercise harness_exercise.c -ldl
./harness_exercise corpus_exercise/attachment.txt # MAKE SURE you have made a corpus_exercise/attachment.txt first
echo $? # should be 0
AFL_QEMU_INST_RANGES=0x40001000-0x40002000,0x08048000-0x082da000 \
afl-showmap -Q -o /dev/null -- ./harness_exercise corpus_exercise/attachment.txt
echo $? # should be 0
Step 7: Fuzz
First, if you didn’t create a dict/disposition.dict in Step 5, remove the appropriate line from
run_fuzz_exercise.sh and from the afl-fuzz command-line below.
./run_fuzz_exercise.sh
Or manually:
AFL_QEMU_INST_RANGES=0x40001000-0x40002000,0x08048000-0x082da000 \
afl-fuzz -Q -c 0 \
-i corpus_exercise \
-o findings_exercise \
-x dict/disposition.dict \
-t 500 -m none -V 300 \
-- ./harness_exercise @@
Step 8: Analyze
After 5 minutes:
- What did your exec speed look like?
- How many corpus items did AFL find?
- Any crashes?
If you found crashes, reproduce them:
AFL_QEMU_INST_RANGES=0x40001000-0x40002000,0x08048000-0x082da000 \
afl-showmap -Q -o /dev/null -- ./harness_exercise findings_exercise/default/crashes/id:000000,...
Also try reproducing them with GDB attached!
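If there are several crash files, a small loop saves some typing. The sketch below replays each one natively (outside QEMU, the same way Step 6 runs the harness) and prints the exit status. The replay_crashes helper is a hypothetical name introduced here, and the crash directory assumes the findings_exercise/default/crashes layout used above.

```shell
# Replay each saved crash and print the harness exit status.
# A status above 128 means the process died from a signal
# (for example 139 = 128 + 11, SIGSEGV).
replay_crashes() {
    harness=$1
    crash_dir=$2
    for f in "$crash_dir"/id:*; do
        [ -e "$f" ] || continue   # nothing matched: no crashes were saved
        "$harness" "$f" >/dev/null 2>&1
        echo "$f -> exit $?"
    done
}

replay_crashes ./harness_exercise findings_exercise/default/crashes

# To inspect a single crash interactively, pick a file from the output above:
#   gdb --args ./harness_exercise "<path to crash file>"
```

A crash that reproduces under afl-showmap -Q but not natively is still worth noting, since the two runs do not execute in the same environment.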
Going further
Feel free to apply what we learned about fuzzing Qdecode and mime_content_new_from_string in previous pages to enhance and optimize your fuzzer!
- Try fuzzing with QASan (AFL_PATH=... AFL_USE_QASAN=1)
- Apply persistent mode and stdin input reading (see Part 5: Optimizations) to your exercise harness for a 2-5x speedup
- Try fuzzing another mailscanner function such as encodedWordToUtf (calls Qdecode and Bdecode, so it’s a higher-level function) to see if you get more crashes