8 releases
0.2.4 | Jul 19, 2024 |
---|---|
0.2.3 | Jun 21, 2024 |
0.2.2 | May 24, 2024 |
0.2.1 | Apr 13, 2024 |
0.1.2 | Dec 26, 2023 |
#255 in Testing
97KB
2K
SLoC
This project is a reimplementation of Gramatron that is
- performant: 4x higher throughput than Gramatron or LibAFL's Gramatron implementation
- versatile: usable with LibAFL, libfuzzer, in a custom AFL++ mutator or standalone
- easy to use: no more orchestration of different scripts to get the fuzzing campaign running, everything is batteries-included
- extendable: at its core, peacock is a library that you can use at your leisure to customize every step of the grammar fuzzing process
- backwards compatible: it works with grammars that you have already written for other tools
How to use it
Clone the repo and execute
cargo build --release
This creates 5 ready-to-use tools:
peacock-fuzz
: A coverage-guided fuzzer that can fuzz any binary compiled with AFL++'s compilers or anything that speaks AFL's forkserver protocolpeacock-dump
: peacock-fuzz saves crashes and queue items in a raw, binary format to disk. Use this tool to get a human readable output from any such file. All these binary files have the prefixpeacock-raw-
peacock-compile
: Takes a grammar and compiles it to C codepeacock-merge
: Merge multiple grammar files into one or convert a grammar file from one format into anotherpeacock-gen
: Generate individual inputs from a grammar
If you want more fine-grained control you can use the crate peacock_fuzz
, which is the backbone of all the tools from above.
See the documentation at docs.rs in order to get started with peacock as a library.
How it works
Peacock is a fuzzer that implements so-called "grammar-based mutations". This means that it will mutate its inputs in such a way that they will always adhere to a given grammar.
The way mutations work is the same as in Gramatron. A grammar is converted to a PDA such that an input can be represented as a walk through the automaton. Then, a mutation of an input is simply a modification of an automaton walk. We cut off the walk at a random point and let it find a new random path through the automaton from there.
While Gramatron and LibAFL realize the automaton as an adjacency matrix, peacock generates C code that encodes the automaton in its control flow. This saves us a lot of memory accesses and makes the mutation procedure faster.
The generated C code exposes a certain API that can be used by any application, e.g. a libfuzzer harness, an AFL++ custom mutator or even Rust code.
Peacock also ships a ready to use fuzzer that can fuzz any binary that has been compiled with AFL++'s compilers or implements an AFL-style forkserver.
How to write grammars
Peacock accepts its context-free grammars in JSON format. A context-free grammar has production rules of the form:
A -> X Y Z ...
where A
must be a non-terminal and X
,Y
,Z
can be non-terminals or terminals. The right-hand-side must contain at least one symbol.
Non-terminals are enclosed in <>
, so the non-terminal A
would be represented as <A>
. Terminals are enclosed in ''
.
The set of rules
A -> a B
A -> a
B -> b B
B -> Ɛ
would be written as
{
// Comments are also possible :)
"<A>": [
["'a'", "<B>"],
["'a'"]
],
"<B>": [
["'b'", "<B>"],
["''"] // Ɛ = ''
]
}
and corresponds to the regular expression a(b*)
.
Peacock also supports the Gramatron format, which is a bit different and does not allow for comments.
The non-terminal <ENTRYPOINT>
is the entrypoint of the grammar.
C API Documentation
-
void seed_generator (size_t new_seed)
Supply a seed for the RNG of the mutator. -
size_t unparse_sequence (size_t* seq_buf, size_t seq_capacity, unsigned char* input, size_t input_len)
Given an input that adheres to the grammar, find the corresponding automaton walk. This function may be slow, use outside of hot loop.seq_buf
: Automaton walk will be written into this bufferseq_capacity
: Maximum number of elements thatseq_buf
can hold (not number of bytes)input
: User input adhering to grammarinput_len
: Length ofinput
Returns the number of elements written to
seq_buf
or 0 if input does not adhere to grammar. -
size_t mutate_sequence (size_t* buf, size_t len, size_t capacity)
Given an automaton walk, create a random mutant of the walk.buf
: Pointer to array that holds automaton walklen
: Number of items inbuf
(not number of bytes)capacity
: Maximum number of items thatbuf
can hold (not number of bytes)
Returns the length of the new walk.
-
size_t serialize_sequence (size_t* seq, size_t seq_len, unsigned char* out, size_t out_len)
Given an automaton walk, create the corresponding output.seq
: Pointer to automaton walkseq_len
: Number of items inseq
(not number of bytes)out
: Output will be written into that bufferout_len
: Number of bytes inout
Returns how many bytes have been written to
out
.
Macros:
MAKE_THREAD_SAFE
: Define this to make the mutator completely thread-safeMAKE_VISIBLE
: Define this to explicitly set the visibility of the functions from above to "default"STATIC_SEED=<your seed>
: Compile-time seed for the RNGDISABLE_rand
: Don't include the internalrand
function and use an external one with the signaturesize_t rand (void)
DISABLE_seed_generator
: Don't include the functionseed_generator
Dependencies
~13–41MB
~578K SLoC