Part 6: Exercise – Fuzz a Different Parser

In this exercise I suggest you target mime_disposition_new from mailscanner and go through the full process: reverse-engineer it, fill in the harness skeleton, add some starter seeds, fuzz it, and analyze the results.

You’re welcome to try a different function, but this one is similar to the one we already fuzzed, so it shouldn’t be as difficult to get started with.

Background

mime_disposition_new parses Content-Disposition headers such as:

attachment; filename="document.pdf"
inline; filename="image.png"

Step 1: Find the symbols

Before you begin, you should be in /home/cisco/guides/day-2/exercise and you should run ./setup_exercise.sh.

nm -D mailscanner.so | grep mime_disposition

Note down the parse (*_new) and destroy symbol names.

Step 2: Figure out the function signature

Read the disassembly of the function:

r2 -e asm.bytes=0 -a x86 -b 32 -qc 'aaa; s sym.mime_disposition_new; pdf' mailscanner.so

It’s also worth taking a look in Ghidra at the pseudocode/disassembly for a clearer picture.

How many arguments? What types? How does it compare to the functions we’ve already looked at?

Also confirm there’s a _destroy function you’ll need to call after each iteration.

Step 3: Fill in the harness

A fuzzer skeleton is at harness_exercise.c with four TODOs:

  1. Function pointer type definitions (argument types)
  2. dlsym symbol names
  3. Call the parse function with the right arguments
  4. Call destroy on the result

Refer to ../harness/harness_ct.c for the pattern.

Step 4: Create seeds

Create some seeds in ./corpus_exercise/. Include at least one named attachment.txt that uses the attachment syntax, because the test commands in Step 6 read that file. If you really don’t feel like making a realistic seed, just run echo whatever > corpus_exercise/attachment.txt
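As a starting point, the seeds could be created like the sketch below. The first two values come straight from the Background section. The file names other than attachment.txt are arbitrary, and the last seed uses the extended filename*= syntax from the Content-Disposition RFCs, which this parser may or may not understand (that is an assumption worth testing, not a known feature).

```shell
# Create the corpus directory and a handful of small, distinct seeds.
mkdir -p corpus_exercise

# printf '%s' writes the value without a trailing newline.
printf '%s' 'attachment; filename="document.pdf"' > corpus_exercise/attachment.txt
printf '%s' 'inline; filename="image.png"'        > corpus_exercise/inline.txt

# A disposition type with no parameters at all (hypothetical edge case).
printf '%s' 'attachment'                          > corpus_exercise/bare.txt

# Extended parameter syntax; support in mailscanner is an assumption.
printf '%s' "attachment; filename*=UTF-8''na%C3%AFve.txt" > corpus_exercise/extended.txt

ls -l corpus_exercise
```

Small seeds that each exercise a different shape of input tend to be more useful to AFL than many near-duplicates.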

Step 5: Create a dictionary

As with the seeds, you can look up some examples of Content-Disposition headers online to get ideas for tokens that would be good to include. If you find some, add each one like:

"some token here"
"another\x0d\x0atoken here"
"last token"

to the dict/disposition.dict file.
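For illustration, a small dictionary could be written as in the sketch below. The token choice is an assumption based on common Content-Disposition parameters (filename, size, creation-date) rather than on anything known about this parser, so treat it as a first guess to refine after reading the disassembly.

```shell
# Write a starter dictionary, one quoted token per line.
mkdir -p dict

# The quoted heredoc delimiter keeps backslashes and quotes literal,
# so \x0d\x0a reaches AFL as an escape sequence (CR LF).
cat > dict/disposition.dict <<'EOF'
"attachment"
"inline"
"filename="
"filename*="
"size="
"creation-date="
"UTF-8''"
"; "
"\x0d\x0a"
EOF

cat dict/disposition.dict
```

If strings inside mailscanner.so turn out to contain other keywords near the parser, those are good candidates to append.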

Step 6: Compile and test

./setup_exercise.sh        # You should have already run this
clang -m32 -g -O2 -o harness_exercise harness_exercise.c -ldl
./harness_exercise corpus_exercise/attachment.txt  # MAKE SURE you have made a corpus_exercise/attachment.txt first
echo $?                    # should be 0
AFL_QEMU_INST_RANGES=0x40001000-0x40002000,0x08048000-0x082da000 \
afl-showmap -Q -o /dev/null -- ./harness_exercise corpus_exercise/attachment.txt
echo $?                    # should be 0

Step 7: Fuzz

First, if you didn’t create a dict/disposition.dict in Step 5, remove the appropriate line from run_fuzz_exercise.sh and from the afl-fuzz command-line below.

./run_fuzz_exercise.sh

Or manually:

AFL_QEMU_INST_RANGES=0x40001000-0x40002000,0x08048000-0x082da000 \
afl-fuzz -Q -c 0 \
    -i corpus_exercise \
    -o findings_exercise \
    -x dict/disposition.dict \
    -t 500 -m none -V 300 \
    -- ./harness_exercise @@
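If you would rather not edit the command by hand, the dictionary option can be added only when the file exists. The helper below is a hypothetical convenience (afl_args is not part of the course scripts); it only prints the argument list used above and does not start the fuzzer itself.

```shell
# Print the afl-fuzz arguments, including -x only if the dictionary exists.
afl_args() {
    dict=$1
    args="-Q -c 0 -i corpus_exercise -o findings_exercise"
    if [ -f "$dict" ]; then
        args="$args -x $dict"
    fi
    echo "$args -t 500 -m none -V 300 -- ./harness_exercise @@"
}

# Show what would be passed to afl-fuzz.
afl_args dict/disposition.dict
```

To launch with it, prefix the same AFL_QEMU_INST_RANGES setting as above and run afl-fuzz $(afl_args dict/disposition.dict), leaving the substitution unquoted so the shell splits it into separate arguments.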

Step 8: Analyze

After 5 minutes:

  • What did your speed look like?
  • How many corpus items did AFL find?
  • Any crashes?
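The answers to these questions are also recorded in the fuzzer_stats file in the output directory, so they can be read back after the run ends. The sketch below assumes the field names used by recent AFL++ releases (execs_per_sec, corpus_count, saved_crashes); older releases call the last two paths_total and unique_crashes, so adjust the pattern if nothing matches. The show_stats name is just a helper for this example.

```shell
# Print speed, corpus size and crash count from an AFL++ output directory.
show_stats() {
    stats="$1/fuzzer_stats"
    if [ -f "$stats" ]; then
        grep -E '^(execs_per_sec|corpus_count|saved_crashes)[[:space:]]*:' "$stats"
    else
        echo "no fuzzer_stats found in $1" >&2
        return 1
    fi
}

# "default" is the instance directory AFL++ creates under -o.
show_stats findings_exercise/default || true
```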

If you found crashes, reproduce them:

AFL_QEMU_INST_RANGES=0x40001000-0x40002000,0x08048000-0x082da000 \
afl-showmap -Q -o /dev/null -- ./harness_exercise findings_exercise/default/crashes/id:000000,...
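With more than a couple of crashes, replaying them in a loop saves typing. The sketch below runs the harness natively on each saved input, the same way Step 6 does, and prints the exit status next to the file name. replay_crashes is a hypothetical helper, and a crash that only shows up under QEMU may not reproduce this way.

```shell
# Replay every saved crash input and report the harness exit status.
replay_crashes() {
    dir=$1
    harness=$2
    for f in "$dir"/id:*; do
        # Skip the unexpanded pattern when the directory has no crashes.
        [ -e "$f" ] || continue
        "$harness" "$f" >/dev/null 2>&1
        rc=$?
        # A status above 128 means the process died from signal (rc - 128),
        # e.g. 139 for SIGSEGV or 134 for SIGABRT.
        echo "$rc $f"
    done
}

replay_crashes findings_exercise/default/crashes ./harness_exercise
```

The id:* glob deliberately skips the README.txt that AFL places in the crashes directory.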

Also try reproducing them with GDB attached!

Going further

Feel free to apply what we learned about fuzzing Qdecode and mime_content_new_from_string in previous pages to enhance and optimize your fuzzer!

  • Try fuzzing with QASan (AFL_PATH=... AFL_USE_QASAN=1)
  • Apply persistent mode and stdin input reading (see Part 5: Optimizations) to your exercise harness for a 2-5x speedup
  • Try fuzzing another mailscanner function such as encodedWordToUtf (calls Qdecode and Bdecode, so it’s a higher-level function) to see if you get more crashes