1 unstable release
0.1.0 (Feb 10, 2022)
26KB
587 lines
logreduce-tokenizer
Build:
python setup.py build
export PYTHONPATH=$(pwd)/build/lib
Bench:
python benches/bench.py
Test perf:
RUSTFLAGS="-C target-cpu=native" cargo build --release
Build CLI:
RUSTFLAGS="-C target-cpu=native" cargo build --example logreduce-tokenizer-cli --release
lib.rs:
This library provides a tokenizer function for the logreduce project.
The goal is to replace varying words with fixed tokens (e.g. sha256://... is converted to %HASH).
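The idea of replacing varying words with fixed tokens can be illustrated with a minimal sketch. This is not the crate's implementation or API: the function name replace_varying_words, the HASH_WORD pattern, and the rule for what counts as a hash are assumptions made for illustration. Only the %HASH token comes from the description above.

```python
import re

# Assumed pattern, not taken from the crate: a word with a "sha256:" prefix,
# or a bare run of 32 or more hexadecimal digits.
HASH_WORD = re.compile(r"^(?:sha256:\S+|[0-9a-fA-F]{32,})$")


def replace_varying_words(line: str) -> str:
    """Hypothetical helper: swap hash-like words for the fixed token %HASH."""
    words = line.split()
    return " ".join("%HASH" if HASH_WORD.match(w) else w for w in words)


# Two log lines that differ only in the digest value.
first = replace_varying_words("pulling sha256://0123abcd done")
second = replace_varying_words("pulling sha256://89efcdab done")
print(first)  # pulling %HASH done
```

Both inputs reduce to the same string, so lines that differ only in a hash value compare as equal after tokenization.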
Dependencies: ~5.5MB, ~100K SLoC