3 releases
0.1.2 | May 28, 2020 |
---|---|
0.1.1 | Apr 5, 2019 |
0.1.0 | Apr 3, 2019 |
#2523 in Data structures
11KB
92 lines
flit
BloomFilter
is a probabilistic data structure that can quickly and definitively conclude that it does
not contain an item. On the other hand, it can only conclude that it probably contains an
item, i.e., the data structure has an inherent false-positive rate greater than 0%.
Items can be added to the Bloom filter, but cannot be removed - this would introduce false negative cases. If this is required, an alternative might be to use a Counting Bloom filter (not (yet) implemented in this crate).
This implementation is backed by a Rust implementation of the xxHash hashing algorithm, twox-hash.
References
- Less Hashing, Same Performance: Building a Better Bloom Filter
- Wikipedia article
- Bloom Filters by Example
- Bloom Filter Calculator
Example
use flit::BloomFilter;
let mut filter = BloomFilter::new(0.01, 10000);
filter.add(&"Hello, world!");
assert_eq!(filter.might_contain(&"Hello, world!"), true); // probably true
assert_eq!(filter.might_contain(&"Dogs are cool!"), false); // definitely false!
Benchmarks
Benchmarking is done using criterion
.
Simple benchmarks are provided for adding 100, 1000 and 10000 u32
s to a BloomFilter. To run the benchmarks, run:
cargo bench
add 100 time: [22.454 us 22.477 us 22.507 us]
add 1000 time: [224.44 us 224.65 us 224.90 us]
add 10000 time: [2.2424 ms 2.2443 ms 2.2463 ms]
Dependencies
~1MB
~24K SLoC