#hash-table #hashing #algorithm #key #deterministic #non-cryptographic #bit

no-std zwohash

A fast, deterministic, non-cryptographic hash for use in hash tables

3 releases

0.1.2 Sep 4, 2020
0.1.1 Aug 30, 2020
0.1.0 Aug 17, 2020

#640 in Algorithms

Download history 31725/week @ 2023-11-20 15748/week @ 2023-11-27 19767/week @ 2023-12-04 22130/week @ 2023-12-11 17422/week @ 2023-12-18 12095/week @ 2023-12-25 13957/week @ 2024-01-01 39208/week @ 2024-01-08 42151/week @ 2024-01-15 25058/week @ 2024-01-22 34441/week @ 2024-01-29 29668/week @ 2024-02-05 33165/week @ 2024-02-12 46262/week @ 2024-02-19 41834/week @ 2024-02-26 57278/week @ 2024-03-04

178,594 downloads per month
Used in 3 crates

0BSD license

22KB
232 lines

ZwoHash

ci github crates.io docs.rs

ZwoHash implements a very fast hash algorithm optimized for the use in hash tables. It has low per-hash overhead, which is important when hashing small keys. It is non-cryptographic and deterministic and as such not suited for inserting untrusted user-provided input into hash tables, unless other denial of service countermeasures are taken. As such it covers the same use cases as rustc's FxHash.

Compared to FxHash, ZwoHash provides essentially the same hashing speed while aiming for more uniform outputs. When used in a hash table ZwoHash is almost always able to match the performance of FxHash while outperforming it by quite a bit for some common inputs for which FxHash's output is particularly poor.

The hash algorithm used by ZwoHash is very similar to that of FxHash, both process one usize at a time and perform the same number and kind of operations per usize. ZwoHash though, replaces the last iteration with a slightly more expensive operation that provides better output guarantees. The additional overhead (independent of the size of the hashed data) consists of performing a wide multiplication instead of a truncated multiplication and one additional subtraction. This is very little overhead, and almost doesn't register when using ZwoHash in a hash table.

ZwoHash guarantees that any input bit can affect any bit of the output. FxHash does not guarantee this, and even beyond that, ZwoHash's output is more uniform. When used in a hash table, this often reduces the number of collisions and thus the number of required probes for each access. This can result in ZwoHash outperforming FxHash in that setting.

Sometimes, given inputs for which FxHash is especially ill-suited, ZwoHash outperforms FxHash by a large margin. This includes integer keys that all are a multiple of a power of two, floating point values with a short base-2 representation, pointers returned from the allocator and other inputs that only differ in the higher bits of the last processed usize.

Usage

If the std feature (enabled by default) is used this crate also exports the type aliases HashMap and HashSet which are re-exports of std::collection with the hashing algorithm set to ZwoHash. See their respective documentation for how to use them.

This crate always exports the ZwoHasher type which implements the std/core traits Hasher and Default, see core::hash for how to use this within Rust's hashing framework.

Benchmarks

ZwoHash comes with set of criterion benchmarks that test it against FxHash. You can run them on your machine using cargo bench. This takes several minutes.

Feedback

The above claims are based on the limited benchmarking I performed so far. Should you decide to give ZwoHash a try, I would be very much interested in hearing back from you. I'm especially interested in real-world benchmarks where ZwoHash is outperformed by FxHash, but I'd also love to hear where ZwoHash improves performance. Feel free to file issues for this.

no_std

ZwoHash can be used from no_std code by disabling the default std feature of this crate.

License

This software is available under the Zero-Clause BSD license, see COPYRIGHT for full licensing information and exceptions to this.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this software by you shall be licensed as defined in COPYRIGHT.

No runtime deps