### 8 releases

0.2.1 | Aug 20, 2024 |
---|---|

0.2.0 | Mar 8, 2024 |

0.1.5 | Sep 28, 2023 |

0.1.4 | May 22, 2023 |

#**390** in Data structures

**19,258** downloads per month

Used in **6** crates
(via foyer-memory)

**Apache-2.0**

19KB

305 lines

# cmsketch

A count min sketch implementation in Rust.

Inspired by Facebook/CacheLib and Caffeine.

## Usage

`use` `cmsketch``::`CMSketchU32`;`
`const` `ERROR``:` `f64` `=` `0.``01``;`
`const` `CONFIDENCE``:` `f64` `=` `0.``95``;`
`fn` `main``(``)`` ``{`
`let` `mut` cms `=` `CMSketchU32``::`new`(``ERROR``,` `CONFIDENCE``)``;`
`for` i `in` `0``..``10` `{`
`for` `_` `in` `0``..`i `{`
cms`.``inc``(`i`)``;`
`}`
`}`
`for` i `in` `0``..``10` `{`
`assert!``(`cms`.``estimate``(`i`)` `>=` i `as` `u32``)``;`
`}`
cms`.``halve``(``)``;`
`for` i `in` `0``..``10` `{`
`assert!``(`cms`.``estimate``(`i`)` `>=` `(`i `as` `f64` `*` `0.``5``)` `as` `u32``)``;`
`}`
`}`

## Roadmap

- simd halve
- benchmark

###
`lib.rs`

:

A probabilistic counting data structure that never undercounts items before it hits counter's capacity. It is a table structure with the depth being the number of hashes and the width being the number of unique items. When a key is inserted, each row's hash function is used to generate the index for that row. Then the element's count at that index is incremented. As a result, one key being inserted will increment different indices in each row. Querying the count returns the minimum values of these elements since some hashes might collide.

Users are supposed to synchronize concurrent accesses to the data structure.

E.g. insert(1) hash1(1) = 2 -> increment row 1, index 2 hash2(1) = 5 -> increment row 2, index 5 hash3(1) = 3 -> increment row 3, index 3 etc.

# Usage

`use` `cmsketch``::`CMSketchU32`;`
`const` `ERROR``:` `f64` `=` `0.``01``;`
`const` `CONFIDENCE``:` `f64` `=` `0.``95``;`
`fn` `main``(``)`` ``{`
`let` `mut` cms `=` `CMSketchU32``::`new`(``ERROR``,` `CONFIDENCE``)``;`
`for` i `in` `0``..``10` `{`
`for` `_` `in` `0``..`i `{`
cms`.``inc``(`i`)``;`
`}`
`}`
`for` i `in` `0``..``10` `{`
`assert!``(`cms`.``estimate``(`i`)` `>=` i `as` `u32``)``;`
`}`
cms`.``halve``(``)``;`
`for` i `in` `0``..``10` `{`
`assert!``(`cms`.``estimate``(`i`)` `>=` `(`i `as` `f64` `*` `0.``5``)` `as` `u32``)``;`
`}`
`}`