#bloom #bitmap #data-structures #xor-filter

xorfilter-rs

Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters

6 releases (breaking)

0.5.1 Sep 8, 2021
0.5.0 Aug 29, 2021
0.4.0 Mar 13, 2021
0.3.0 Mar 10, 2021
0.1.0 Mar 18, 2020

#457 in Algorithms

Download history 1132/week @ 2023-12-18 521/week @ 2023-12-25 713/week @ 2024-01-01 788/week @ 2024-01-08 534/week @ 2024-01-15 577/week @ 2024-01-22 539/week @ 2024-01-29 458/week @ 2024-02-05 525/week @ 2024-02-12 491/week @ 2024-02-19 513/week @ 2024-02-26 643/week @ 2024-03-04 551/week @ 2024-03-11 563/week @ 2024-03-18 663/week @ 2024-03-25 1232/week @ 2024-04-01

3,074 downloads per month
Used in 8 crates (3 directly)

Apache-2.0

83KB
2K SLoC

Rustdoc simple-build-test

Rust library implementing xor filters

Implementation of Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters in rust-lang, Journal of Experimental Algorithmics (to appear).

This package is a port from its golang implementation.

How to use xorfilter in my rust project ?

Add the following under project's Cargo.toml:

[dependencies]
xorfilter-rs = "0.2.0"

or

[dependencies]
xorfilter-rs = { git = "https://github.com/bnclabs/xorfilter" }
use xorfilter::Xor8;

let mut keys: Vec<u64> = vec![];
for _ in 0..num_keys {
    keys.push(rng.gen());
}

let mut filter = Xor8::new(); // new filter.
filter.populate_keys(&keys); // populate keys.
filter.build(); // build bitmap.

for key in 0..lookup {
    // there can be false positives, but no false negatives.
    filter.contains_key(key);
}

Open issues

  • Serialize / Deserialize Xor8 type.
  • Incrementally adding keys to a pre-built Xor8 instance.
  • Gather benchmark results for other implementations - Go, C, C++, Erlang, Java, Python.

Benchmarks

Following are the results for a set of 10-million u64 keys:

build 10M keys membership FPP Bits/Entry
Xor8-C 1.206 secs NA 0.389 % 9.84 bits
Xor8-rust 1.809 secs 61.716 ns 0.392 % 9.84 bits
Fuse8-C 0.508 secs NA 0.390 % 9.02 bits
Fuse8-rust 0.577 secs 42.657 ns 0.392 % 9.02 bits
Fuse16-C 0.515 secs NA 0.001 % 18.04 bits
Fuse16-rust 0.621 secs 54.657 ns 0.001 % 18.03 bits
  • Build time is measured in Seconds, for 10 million entries.
  • Membership is measured in Nanosec, for single lookup in a set of 10 million entries.
  • FPP = False Positive Probability measured in percentage

Contribution

  • Simple workflow. Fork - Modify - Pull request.
  • Before creating a PR,
    • Run make build to confirm all versions of build is passing with 0 warnings and 0 errors.
    • Run check.sh with 0 warnings, 0 errors and all test-cases passing.
    • Run perf.sh with 0 warnings, 0 errors and all test-cases passing.
    • Install and run cargo spellcheck to remove common spelling mistakes.
  • Developer certificate of origin is preferred.

No runtime deps