5 releases
Version | Released |
---|---|
0.1.10-75501d4 | Feb 22, 2020 |
0.1.9-bb1f021 | Feb 3, 2020 |
0.1.7-35bfaa1 | Feb 1, 2020 |
0.1.6-a180978 | Jan 5, 2020 |
0.1.5-6e5ec74 | Jan 5, 2020 |
#2101 in Parser implementations
32KB
566 lines
Fakelogs
Fakelogs is a random log generator. It can be used for load testing of log parsers.
It is written in Rust and is mostly a toy project to ramp up on the language. It might, however, be useful. Use at your own risk.
Status
Current version is 0.1.10.
Install
No install target yet. Copy the fakelogs binary to a directory in your $PATH if you wish, that's all.
A few commands which may prove useful:
cargo build # build debug binary in ./target/debug/
cargo build --release # build release binary in ./target/release/
cargo test # launch tests
rustfmt src/*.rs # format code
./docker-build.sh # build Docker image with version tag
./bump-version.sh # bump minor version number
Usage
Simply launch:
cargo run
Or just run the binary directly:
./target/debug/fakelogs
./target/release/fakelogs
Alternatively, using docker:
docker run ufoot/fakelogs
To pass options:
cargo run -- --csv -100
By default, the generated lines follow the Apache Common Log Format, so they look like:
127.0.0.1 - james [09/May/2018:16:00:39 +0000] "GET /report HTTP/1.0" 200 123
127.0.0.1 - jill [09/May/2018:16:00:41 +0000] "GET /api/user HTTP/1.0" 200 234
127.0.0.1 - frank [09/May/2018:16:00:42 +0000] "POST /api/user HTTP/1.0" 200 34
127.0.0.1 - mary [09/May/2018:16:00:42 +0000] "POST /api/user HTTP/1.0" 503 12
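Since the tool is meant to feed log parsers, here is a minimal sketch of how such lines could be split into fields using only the Rust standard library. The names `LogEntry` and `parse_common_log` are hypothetical and not part of fakelogs. The sketch assumes the size field is always numeric, as in the sample above.

```rust
/// One parsed line in Apache Common Log Format (hypothetical helper type).
struct LogEntry {
    host: String,
    user: String,
    timestamp: String,
    request: String,
    status: u16,
    size: u64,
}

/// Parse a line such as:
/// 127.0.0.1 - james [09/May/2018:16:00:39 +0000] "GET /report HTTP/1.0" 200 123
/// Returns None when the line does not have that shape.
fn parse_common_log(line: &str) -> Option<LogEntry> {
    // Everything before " [" holds host, identity and user.
    let (head, rest) = line.split_once(" [")?;
    let mut head_parts = head.split(' ');
    let host = head_parts.next()?;
    let _identity = head_parts.next()?;
    let user = head_parts.next()?;
    if head_parts.next().is_some() {
        return None;
    }

    // The timestamp ends at "] " followed by the opening quote of the request.
    let (timestamp, rest) = rest.split_once("] \"")?;
    // The request ends at the closing quote followed by a space.
    let (request, rest) = rest.split_once("\" ")?;
    // What remains is "<status> <size>".
    let (status, size) = rest.trim_end().split_once(' ')?;

    Some(LogEntry {
        host: host.to_string(),
        user: user.to_string(),
        timestamp: timestamp.to_string(),
        request: request.to_string(),
        status: status.parse().ok()?,
        size: size.parse().ok()?,
    })
}

fn main() {
    let line = r#"127.0.0.1 - james [09/May/2018:16:00:39 +0000] "GET /report HTTP/1.0" 200 123"#;
    match parse_common_log(line) {
        Some(e) => println!(
            "host={} user={} time={} request={} status={} size={}",
            e.host, e.user, e.timestamp, e.request, e.status, e.size
        ),
        None => println!("unparseable line"),
    }
}
```

Returning `Option` rather than panicking lets a parser under test skip lines that do not fit the format and keep going.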
There's a -c or --csv option. If you call fakelogs -c you get an alternate custom CSV format:
"10.0.0.4","-","apache",1549573860,"GET /api/user HTTP/1.0",200,1234
If you pass an integer after a dash, it defines the average number of lines per second. The default is 1000. The maximum is 1000000. For example, to change the output to 10000 lines per second:
fakelogs -10000
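To check the rate a consumer actually receives, one option is a small program reading from a pipe, for instance `fakelogs -10000 | ./consumer`. The following is only a sketch: the `consumer` binary, the `count_lines` helper and the sample size of 100000 lines are assumptions made for illustration, not something fakelogs ships.

```rust
use std::io::{self, BufRead};
use std::time::Instant;

/// Count lines from any buffered reader, stopping after `limit` lines
/// or at end of input, whichever comes first.
fn count_lines<R: BufRead>(reader: R, limit: usize) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines().take(limit) {
        line?; // propagate I/O or invalid UTF-8 errors
        count += 1;
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // Arbitrary sample size, chosen for illustration only.
    const LIMIT: usize = 100_000;

    let start = Instant::now();
    let stdin = io::stdin();
    let count = count_lines(stdin.lock(), LIMIT)?;
    let secs = start.elapsed().as_secs_f64();

    if secs > 0.0 {
        println!(
            "{} lines in {:.3}s ({:.0} lines/s)",
            count,
            secs,
            count as f64 / secs
        );
    }
    Ok(())
}
```

Because the output rate changes in bursts by default, a short sample may land far from the requested average. A longer sample, or the --no-burst option, should give a steadier reading.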
Other standard options include:
- -h, --help: display a short help.
- -v, --version: display version.
- --no-high-card: disable high cardinality, the random 4-letter sections are replaced by xxxx.
- --no-time-skew: disable time skewing, all logs look, on average, as if they are from right now, and not 30 minutes old.
- --no-time-jitter: disable time jittering, all logs have strictly increasing time.
- --no-header: skip the header line.
- --no-junk: no random junk lines.
- --no-burst: no random burst behavior, allows output at a constant rate.
Logs content
The logs may look random, but they follow a few patterns:
- IPs are chosen in a constant, finite list
- users are chosen in a constant, finite list
- HTTP codes are distributed with:
- 50% of 2XXs
- 25% of 3XXs
- 20% of 4XXs
- 5% of 5XXs
- request methods are distributed with:
- 60% of GETs
- 20% of POSTs
- 20% of HEADs
- the URLs are of the form /section/XXXX-file.ext or /XXXX/file.txt, with XXXX being totally random, where section can be:
- 50% of yolo (eg: /yolo/wE5d-index.html)
- 15% of foo/bar
- 15% of bar/foo
- 15% of "no section" (so URL of the form /w3QL/secret.txt)
- 5% of pizzapino
- size is uniformly distributed between 100 bytes and 19.9k (average is 10k).
- generally, timestamps are generated to match the generation time, minus 30 minutes, so logs appear, on average, to be from half an hour ago.
- but... 10% of the time, the timestamp is shifted into the past or the future by up to 2 minutes, with an average of 1 minute. This means timestamps are not always increasing, so order is not preserved.
- every 5 seconds, the rate may change, it can either be just one line per second (slow output) or 2500 lines per second (fast output). The ratio is:
- 40% of fast output
- 60% of slow output
- on average (including the slow output) the throughput should be slightly above 1000 lines per second.
- when the default output of 1000 lines per second is changed, all numbers above are scaled, but the slow output is always one line per second.
- once out of 1000, an invalid line containing "Your attention please, this is a hack!" pops out.
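The numbers in the list above can be checked with a short calculation. The sketch below is not taken from the fakelogs source. It assumes that the fast rate scales linearly with the requested target while the slow rate stays at one line per second, and it picks a status class from a roll in 0..100 using cumulative thresholds. The names `expected_rate` and `status_class` are hypothetical.

```rust
/// Share of 5-second windows spent in fast mode, from the list above.
const FAST_SHARE: f64 = 0.40;
/// Fast rate (lines per second) at the default target of 1000.
const FAST_RATE_AT_DEFAULT: f64 = 2500.0;
/// Slow rate, which stays fixed whatever the target is.
const SLOW_RATE: f64 = 1.0;
const DEFAULT_TARGET: f64 = 1000.0;

/// Long-run average rate for a requested target.
/// Assumption: the fast rate scales linearly with the target.
fn expected_rate(target: f64) -> f64 {
    let fast_rate = FAST_RATE_AT_DEFAULT * target / DEFAULT_TARGET;
    FAST_SHARE * fast_rate + (1.0 - FAST_SHARE) * SLOW_RATE
}

/// Map a roll in 0..100 to an HTTP status class (2 for 2XX, and so on)
/// using cumulative thresholds: 50% 2XX, 25% 3XX, 20% 4XX, 5% 5XX.
fn status_class(roll: u32) -> u32 {
    match roll % 100 {
        0..=49 => 2,
        50..=74 => 3,
        75..=94 => 4,
        _ => 5,
    }
}

fn main() {
    println!("target 1000:  {:.1} lines/s", expected_rate(1000.0));
    println!("target 10000: {:.1} lines/s", expected_rate(10000.0));

    // Count how many of the 100 possible rolls fall in each class.
    let mut counts = [0u32; 4];
    for roll in 0..100 {
        counts[(status_class(roll) - 2) as usize] += 1;
    }
    println!("2XX/3XX/4XX/5XX rolls out of 100: {:?}", counts);
}
```

With the default target this works out to 0.4 × 2500 + 0.6 × 1 = 1000.6 lines per second, which matches the "slightly above 1000" figure. The same reasoning gives the average size: a uniform draw between 100 and 19900 bytes has a mean of 10000 bytes.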
License
Fakelogs is licensed under the MIT license.
Dependencies
~1.5MB
~28K SLoC