
sinteflake

A 64-bit ID generator inspired by Snowflake, but generating very distinct numbers

1 unstable release

0.1.0 Aug 6, 2024


Apache-2.0


SINTEFlake

SINTEFlake is a 64-bit ID generator, inspired by Twitter's Snowflake and Sony's Sonyflake.

It generates identifiers that start with a hash or a pseudo-random number instead of a timestamp. As a result, identifiers are not roughly time-ordered, but consecutive identifiers have very different values.

Features

  • Generates 64-bit IDs with distinct values
  • Allows custom instance IDs for distributed systems
  • Provides hash-based ID generation
  • Supports both synchronous and asynchronous environments

Structure

A SINTEFlake ID is composed of:

  • 14 bits for a hash or a random number.
  • 31 bits for a timestamp with an 8-second resolution.
  • 10 bits for an instance identifier.
  • 8 bits for a sequence number.

That adds up to 63 bits, so that IDs are always positive when stored in signed 64-bit integers.
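The layout above can be sketched as a simple bit-packing routine. This is only an illustration: the helper name `pack` is hypothetical, the field order (hash in the most significant bits, sequence in the least significant) is assumed from the order of the list, and the sketch does not reproduce the timestamp bit permutation that the crate applies.

```rust
// Field widths taken from the list above.
const HASH_BITS: u32 = 14;
const TIME_BITS: u32 = 31;
const INSTANCE_BITS: u32 = 10;
const SEQ_BITS: u32 = 8;

/// Hypothetical helper (not part of the sinteflake API): packs the four
/// fields in the listed order, hash first and sequence last. Each input
/// is masked to its field width before being shifted into place.
fn pack(hash: u64, time: u64, instance: u64, seq: u64) -> i64 {
    let mask = |bits: u32| (1u64 << bits) - 1;
    let id = ((hash & mask(HASH_BITS)) << (TIME_BITS + INSTANCE_BITS + SEQ_BITS))
        | ((time & mask(TIME_BITS)) << (INSTANCE_BITS + SEQ_BITS))
        | ((instance & mask(INSTANCE_BITS)) << SEQ_BITS)
        | (seq & mask(SEQ_BITS));
    // Only 63 bits are ever set, so the sign bit stays clear.
    id as i64
}
```

Because the widths sum to 63, even the largest possible value of every field yields `i64::MAX` rather than a negative number.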

Installation

Add this to your Cargo.toml:

[dependencies]
sinteflake = "0.1"

Usage

use sinteflake::{next_id, next_id_with_hash, set_instance_id, update_time};

set_instance_id(42)?;

let id_a = next_id()?;
let id_b = next_id()?;

let id_c = next_id_with_hash(&[1, 2, 3])?;
let id_d = next_id_with_hash(&[1, 2, 3])?;

update_time()?;

Async Usage:

[dependencies]
sinteflake = { version = "0.1", features = ["async"] }
tokio = { version = "1", features = ["full"] }

use sinteflake::{next_id_async, next_id_with_hash_async, set_instance_id_async, update_time_async};

set_instance_id_async(42).await?;

let id = next_id_async().await?;
let id = next_id_with_hash_async(&[1, 2, 3]).await?;

update_time_async().await?;

Please note that the async feature is not enabled by default, and that set_instance_id_async does not set the instance ID used by the synchronous functions.

Custom Settings

You can create a custom SINTEFlake instance with your own settings:

use sinteflake::sinteflake::SINTEFlake;
use time::OffsetDateTime;

let mut instance = SINTEFlake::custom(
    42,                                                      // instance_id
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], // hash key
    123,                                                     // counter hash key
    OffsetDateTime::from_unix_timestamp(1719792000)?,        // epoch
)?;

let id_a = instance.next_id()?;
// ...

Not Time Ordered

Unlike Snowflake (and Sonyflake), SINTEFlake does not intend to be ordered roughly in time. A sequence of IDs generated by SINTEFlake will have very different values. This can be useful for working with zone maps in vertical databases, for example.

The timestamp precision is only 8 seconds. Moreover, the timestamp bits are permuted, so the numeric order of the identifiers does not follow their creation time, and using the identifier for ordering is not possible. The timestamp field will overflow about 544 years after the epoch, which should be long enough.
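The 544-year figure follows from the field width: 2^31 ticks of 8 seconds each. A minimal sketch of that arithmetic is given here, with a hypothetical `ticks_since_epoch` helper that shows what an 8-second resolution means in practice. Neither function is part of the crate; the names and the 365.25-day year are assumptions made for this example.

```rust
const TIME_BITS: u32 = 31;
const RESOLUTION_SECS: u64 = 8;
// Assumption for this sketch: a year of 365.25 days.
const SECS_PER_YEAR: f64 = 365.25 * 24.0 * 3600.0;

/// Number of years a 31-bit counter of 8-second ticks can cover.
fn timestamp_range_years() -> f64 {
    let total_secs = (1u64 << TIME_BITS) * RESOLUTION_SECS;
    total_secs as f64 / SECS_PER_YEAR
}

/// Hypothetical helper: converts a Unix time into 8-second ticks since a
/// custom epoch. Times before the epoch saturate to tick 0.
fn ticks_since_epoch(now_unix: u64, epoch_unix: u64) -> u64 {
    now_unix.saturating_sub(epoch_unix) / RESOLUTION_SECS
}
```

Two calls made within the same 8-second window map to the same tick, which is one reason the timestamp alone cannot be used to order identifiers.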

This design choice involves slightly higher memory usage and complexity compared to Snowflake, as more numbers need to be tracked for collisions. Not being roughly time-ordered is also a disadvantage in many cases.

Not Cryptographically Secure

You can't be cryptographically secure with only 64 bits. SINTEFlake identifiers are not safe on their own because they are not long enough and can easily be brute-forced.

For reference, the hashing algorithm is SipHash-2-4. The timestamp permutation table uses digits of π and e, which should be nothing-up-my-sleeve enough.
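To give a feel for how a keyed SipHash-2-4 value could be reduced to a 14-bit prefix, here is a rough sketch using the standard library's deprecated `SipHasher`, which implements SipHash 2-4. It is not the crate's code: the helper name `hash_prefix_14` is hypothetical, the split of the 16-byte key into two little-endian words is an assumption, and keeping the low 14 bits is an arbitrary choice, since this README does not say which bits SINTEFlake keeps.

```rust
#[allow(deprecated)]
use std::hash::{Hasher, SipHasher};

/// Hypothetical helper (not the crate's implementation): hashes `data`
/// with SipHash-2-4 under a 16-byte key and keeps the low 14 bits.
#[allow(deprecated)]
fn hash_prefix_14(key: [u8; 16], data: &[u8]) -> u16 {
    // Assumption: the key is read as two little-endian 64-bit words.
    let mut k0 = [0u8; 8];
    let mut k1 = [0u8; 8];
    k0.copy_from_slice(&key[..8]);
    k1.copy_from_slice(&key[8..]);
    let mut hasher = SipHasher::new_with_keys(u64::from_le_bytes(k0), u64::from_le_bytes(k1));
    hasher.write(data);
    // Mask down to 14 bits so the result fits the hash field.
    (hasher.finish() & 0x3FFF) as u16
}
```

The same key and input always give the same prefix, and with only 14 bits there are just 16384 possible prefixes, so collisions between different inputs are expected.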

Consider using UUIDs

UUIDs are great but somewhat big. Sometimes, you prefer to work with 64 bits instead of 128 bits. This can be useful for making small performance improvements or for working with systems that do not natively support 128-bit numbers. 64-bit numbers are often computed much faster than strings or byte arrays.

However, UUIDs are almost always a better choice and should be preferred.

Testing

cargo test
cargo llvm-cov # Coverage report
cargo bench # Benchmark
cargo bench --bench=bench -- --quick # Quick benchmark

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
