#consistent-hashing #consistent #hash-key #hash #sharding #routing #balancer

anchorhash

A consistent hashing algorithm that outperforms state-of-the-art algorithms

4 releases

0.2.2 Oct 29, 2022
0.2.1 Jun 20, 2021
0.2.0 Mar 14, 2021
0.1.0 Mar 12, 2021

#1564 in Algorithms

Download history 10/week @ 2023-12-18 4/week @ 2024-01-01 1/week @ 2024-01-15 3/week @ 2024-01-29 24/week @ 2024-02-19 24/week @ 2024-02-26 68/week @ 2024-03-04 57/week @ 2024-03-11 65/week @ 2024-03-18 8/week @ 2024-03-25 133/week @ 2024-04-01

268 downloads per month
Used in scaled_storage

Apache-2.0

42KB
592 lines

crates.io docs.rs

AnchorHash

A consistent hashing algorithm described in AnchorHash: A Scalable Consistent Hash.

Consistent hashing (CH) is a central building block in many networking applications, from datacenter load-balancing to distributed storage. Unfortunately, state-of-the-art CH solutions cannot ensure full consistency under arbitrary changes and/or cannot scale while maintaining reasonable memory footprints and update times. We present AnchorHash, a scalable and fully-consistent hashing algorithm. AnchorHash achieves high key lookup rates, a low memory footprint, and low update times. We formally establish its strong theoretical guarantees, and present advanced implementations with a memory footprint of only a few bytes per resource. Moreover, extensive evaluations indicate that it outperforms state-of-the-art algorithms, and that it can scale on a single core to 100 million resources while still achieving a key lookup rate of more than 15 million keys per second.

Key points:

  • Uniform balancing of load across all resources
  • Optimal rebalancing of keys when resources are added & removed
  • Small memory footprint
  • It's really, really fast!

AnchorHash consistently hashes keys onto resources under arbitrary working set changes. It does this with a low memory footprint, fast key lookups (10s to 100s of millions of lookups per second), optimal disruption and uniform balancing of load across resources.

Optimisations

This implementation makes use of SSE4.2 instructions by default on x86_64 platforms to quickly perform internal bucket hashing - the Fowler–Noll–Vo hash is used as a fallback. The SIMD optimised hash can be manually disabled by opting out of the simd crate feature.

This implementation also makes use of Daniel Lemire's fast range mapping algorithm presented in Fast Random Integer Generation in an Interval when compiled on 64-bit architectures. This can be manually disabled by opting out of the fastmod crate feature.

This implementation uses 16-bit integers to maximise cache locality, providing a significant speed up for small capacity instances. This limits the total number of addressable resources to 65,535.

Benchmarks

Benchmarks that cover the hash algorithms, range mapping optimisations and overall AnchorHash implementation are included in this crate - cargo bench runs them.

Dependencies

~1.1–1.8MB
~33K SLoC