#bloom-filter #false-positives #numbers

rust-bloomfilter

A simple bloom filter implementation in the Rust programming language

2 releases

1.0.0-beta1 Nov 10, 2020
0.0.4 Nov 25, 2019
0.0.3 Nov 21, 2019
0.0.2 Nov 21, 2019
0.0.1 Nov 21, 2019

#1679 in Data structures

MIT license

13KB
244 lines

rust-bloomfilter

Bloom filters are defined by 4 interdependent values:

  • n - Number of items in the filter
  • p - Probability of false positives, float between 0 and 1 or a number indicating 1-in-p
  • m - Number of bits in the filter
  • k - Number of hash functions

Guide for selecting the parameters

Given n and p, the other two values follow from these calculations (log is the natural logarithm):

m = ceil((n * log(p)) / log(1.0 / (pow(2.0, log(2.0)))));

k = round(log(2.0) * m / n);
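As a rough illustration, the two calculations above can be sketched in Rust. The function name `optimal_params` is hypothetical and not part of this crate's API. The denominator `log(1.0 / pow(2.0, log(2.0)))` is written in its simplified form, `-(ln 2)^2`.

```rust
use std::f64::consts::LN_2;

/// Hypothetical helper (not part of rust-bloomfilter): returns (m, k)
/// for `n` expected items and a target false-positive probability `p`.
fn optimal_params(n: u64, p: f64) -> (u64, u32) {
    assert!(n > 0, "n must be positive");
    assert!(p > 0.0 && p < 1.0, "p must be strictly between 0 and 1");
    let n_f = n as f64;
    // m = ceil(-n * ln(p) / (ln 2)^2), equivalent to the formula for m above.
    let m = (-(n_f * p.ln()) / (LN_2 * LN_2)).ceil();
    // k = round(ln(2) * m / n), clamped so at least one hash function is used.
    let k = (LN_2 * m / n_f).round().max(1.0);
    (m as u64, k as u32)
}
```

Calling `optimal_params(20_000, 0.01)` yields m = 191702 bits and k = 7 hash functions, which is roughly 9.6 bits per item.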

Design

I use the murmur3 hash to generate a 128-bit hash value and then split it into two 64-bit integers. The following pseudo-code shows how the bloom filter derives its bit positions from them.

let hash_128 = murmur3_hash(data);
let mut first_64 = hash_128 & (2_u128.pow(64) - 1); // lower 64 bits
let second_64 = hash_128 >> 64;                      // upper 64 bits
for i in 0..num_of_hashfuncs {
    // derive the i-th bit position from the two halves
    first_64 += i * second_64;
    let index = first_64 % number_of_bits;
    self.bitvec.set(index, true);
}
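The index derivation in the pseudo-code above could look roughly like this as standalone Rust. This is a sketch, not the crate's actual code: `bit_indices` is a hypothetical name, and the 128-bit hash is passed in as an argument so that no particular murmur3 implementation is assumed. It keeps the update rule as written in the pseudo-code and uses wrapping arithmetic so that 64-bit overflow cannot cause a panic.

```rust
/// Hypothetical helper: derives `k` bit positions in a filter of `m` bits
/// from a single 128-bit hash, following the pseudo-code's update rule.
fn bit_indices(hash_128: u128, k: u32, m: u64) -> Vec<u64> {
    assert!(m > 0, "the filter must have at least one bit");
    let mut first_64 = hash_128 as u64; // lower 64 bits (truncating cast)
    let second_64 = (hash_128 >> 64) as u64; // upper 64 bits
    (0..u64::from(k))
        .map(|i| {
            // Overflow is expected here, so wrap instead of panicking.
            first_64 = first_64.wrapping_add(i.wrapping_mul(second_64));
            first_64 % m
        })
        .collect()
}
```

Both `add` and `contains` would need to walk the same sequence of indices: `add` sets each bit, and `contains` reports a possible match only if every one of those bits is set.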

Usage

extern crate rust_bloomfilter;

use rust_bloomfilter::BloomFilter;

let mut b = BloomFilter(20000, 0.01, true);
b.add("Helloworld");
assert!(b.contains("Helloworld"));

Dependencies

~2MB
~47K SLoC