#bloom-filter #cuckoo-filter #rsqf #quotient-filter #cqf

qfilter

Efficient bloom filter like datastructure, based on the Rank Select Quotient Filter (RSQF)

7 releases

0.1.6 Aug 2, 2023
0.1.5 Jul 16, 2023
0.1.4 Mar 12, 2023
0.1.3 Feb 12, 2023
0.1.1 Dec 4, 2022

#158 in Data structures

Download history 9159/week @ 2023-12-14 4094/week @ 2023-12-21 2552/week @ 2023-12-28 7166/week @ 2024-01-04 8115/week @ 2024-01-11 12165/week @ 2024-01-18 11307/week @ 2024-01-25 5143/week @ 2024-02-01 8167/week @ 2024-02-08 10205/week @ 2024-02-15 10477/week @ 2024-02-22 9571/week @ 2024-02-29 10768/week @ 2024-03-07 12691/week @ 2024-03-14 11711/week @ 2024-03-21 5559/week @ 2024-03-28

43,402 downloads per month

MIT license

69KB
1.5K SLoC

Qfilter

Efficient bloom filter like datastructure, based on the Rank Select Quotient Filter (RSQF).

This is a small and flexible general-purpose AMQ-Filter. It not only supports approximate membership testing like a bloom filter but also deletions, merging (not implemented), resizing and serde serialization.

  • High performance
  • Supports removals
  • Extremely compact, more so than comparable filters
  • Can be created with a initial small capacity and grow as needed
  • (De)Serializable with serde
  • Portable Rust implementation

Example

let mut f = qfilter::Filter::new(1000000, 0.01);
for i in 0..1000 {
    f.insert(i).unwrap();
}
for i in 0..1000 {
    assert!(f.contains(i));
}

Hasher

The hashing algorithm used is xxhash3 which offers both high performance and stability across platforms.

Filter size

For a given capacity and error probability the RSQF may require significantly less space than the equivalent bloom filter or other AMQ-Filters.

Bits per item Error probability when full
3.125 0.362
4.125 0.201
5.125 0.106
6.125 0.0547
7.125 0.0277
8.125 0.014
9.125 0.00701
10.125 0.00351
11.125 0.00176
12.125 0.000879
13.125 0.000439
14.125 0.00022
15.125 0.00011
16.125 5.49e-05
17.125 2.75e-05
18.125 1.37e-05
19.125 6.87e-06
20.125 3.43e-06
21.125 1.72e-06
22.125 8.58e-07
23.125 4.29e-07
24.125 2.15e-07
25.125 1.07e-07
26.125 5.36e-08
27.125 2.68e-08
28.125 1.34e-08
29.125 6.71e-09
30.125 3.35e-09
31.125 1.68e-09
32.125 8.38e-10

Not implemented

  • Merging
  • Shrink to fit
  • Counting
  • Smoother resizing by chaining exponentially larger and more precise filters

Legacy x86_64 CPUs support

The implementation assumes the popcnt instruction (equivalent to integer.count_ones()) is present when compiling for x86_64 targets. This is theoretically not guaranteed as the instruction is only available on AMD/Intel CPUs released after 2007/2008. If that's not the case the Filter constructor will panic.

Support for such legacy x86_64 CPUs can be optionally enabled with the legacy_x86_64_support which incurs a ~10% performance penalty.

License

This project is licensed under the MIT license.

Dependencies

~95–380KB