#bloom #bitmap

xorfilter-rs

Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters

6 releases (breaking)

0.5.1 Sep 8, 2021
0.5.0 Aug 29, 2021
0.4.0 Mar 13, 2021
0.3.0 Mar 10, 2021
0.1.0 Mar 18, 2020

#737 in Data structures

Download history 685/week @ 2024-07-19 595/week @ 2024-07-26 684/week @ 2024-08-02 569/week @ 2024-08-09 442/week @ 2024-08-16 479/week @ 2024-08-23 488/week @ 2024-08-30 603/week @ 2024-09-06 599/week @ 2024-09-13 635/week @ 2024-09-20 564/week @ 2024-09-27 672/week @ 2024-10-04 593/week @ 2024-10-11 583/week @ 2024-10-18 505/week @ 2024-10-25 476/week @ 2024-11-01

2,318 downloads per month
Used in 9 crates (3 directly)

Apache-2.0

83KB
2K SLoC

Rustdoc simple-build-test

Rust library implementing xor filters

Implementation of Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters in rust-lang, Journal of Experimental Algorithmics (to appear).

This package is a port from its golang implementation.

How to use xorfilter in my rust project ?

Add the following under project's Cargo.toml:

[dependencies]
xorfilter-rs = "0.2.0"

or

[dependencies]
xorfilter-rs = { git = "https://github.com/bnclabs/xorfilter" }
use xorfilter::Xor8;

let mut keys: Vec<u64> = vec![];
for _ in 0..num_keys {
    keys.push(rng.gen());
}

let mut filter = Xor8::new(); // new filter.
filter.populate_keys(&keys); // populate keys.
filter.build(); // build bitmap.

for key in 0..lookup {
    // there can be false positives, but no false negatives.
    filter.contains_key(key);
}

Open issues

  • Serialize / Deserialize Xor8 type.
  • Incrementally adding keys to a pre-built Xor8 instance.
  • Gather benchmark results for other implementations - Go, C, C++, Erlang, Java, Python.

Benchmarks

Following are the results for a set of 10-million u64 keys:

build 10M keys membership FPP Bits/Entry
Xor8-C 1.206 secs NA 0.389 % 9.84 bits
Xor8-rust 1.809 secs 61.716 ns 0.392 % 9.84 bits
Fuse8-C 0.508 secs NA 0.390 % 9.02 bits
Fuse8-rust 0.577 secs 42.657 ns 0.392 % 9.02 bits
Fuse16-C 0.515 secs NA 0.001 % 18.04 bits
Fuse16-rust 0.621 secs 54.657 ns 0.001 % 18.03 bits
  • Build time is measured in Seconds, for 10 million entries.
  • Membership is measured in Nanosec, for single lookup in a set of 10 million entries.
  • FPP = False Positive Probability measured in percentage

Contribution

  • Simple workflow. Fork - Modify - Pull request.
  • Before creating a PR,
    • Run make build to confirm all versions of build is passing with 0 warnings and 0 errors.
    • Run check.sh with 0 warnings, 0 errors and all test-cases passing.
    • Run perf.sh with 0 warnings, 0 errors and all test-cases passing.
    • Install and run cargo spellcheck to remove common spelling mistakes.
  • Developer certificate of origin is preferred.

No runtime deps