flaker

A flake implementation for Rust. Flaker generates 128-bit, ordered IDs on multiple machines or processes, and those IDs can still be counted on to sort lexically.

2 releases

0.1.3 May 17, 2016
0.1.2 Mar 30, 2016

Apache-2.0/MIT

What is this?

A flake is a 128-bit, k-ordered ID - it's roughly ordered by real time and stored in a way that sorts lexically. Flaker is derived from Boundary's flake implementation and the author's previous work on Rustflakes - a similar tool for .NET.

Basically - it's an ordered ID generation service.

Rather than rely on a central data storage repository for generating IDs, developers can implement an ID generator at the source of data using a flaker.

Identifiers

Identifiers are generated as 128-bit numbers:

  • 64-bit timestamp as milliseconds since the dawn of time (January 1, 1970).
  • 48-bit worker identifier - typically this would be the MAC address, but you could use whatever you want.
  • 16-bit sequence number that is incremented when more than one identifier is requested in the same millisecond and reset to 0 when the clock moves forward.
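
To make the layout above concrete, here is a minimal sketch of how the three fields could be packed into one value. The function names `compose_id` and `to_sortable_hex` are made up for illustration, and using a `u128` with big-endian field order is an assumption; flaker's own return type and internals may differ.

```rust
/// Packs the three fields into one 128-bit value, most significant first:
/// 64-bit millisecond timestamp, 48-bit worker identifier, 16-bit sequence.
/// (Hypothetical helper, not part of the flaker crate.)
fn compose_id(timestamp_ms: u64, worker: [u8; 6], sequence: u16) -> u128 {
    // Widen the 6 worker bytes into an integer, first byte most significant.
    let worker_bits = worker
        .iter()
        .fold(0u128, |acc, &b| (acc << 8) | b as u128);
    ((timestamp_ms as u128) << 64) | (worker_bits << 16) | sequence as u128
}

/// Renders an ID as 32 zero-padded hex digits, so that comparing the
/// strings gives the same order as comparing the numbers.
/// (Hypothetical helper, not part of the flaker crate.)
fn to_sortable_hex(id: u128) -> String {
    format!("{:032x}", id)
}
```

Because the timestamp sits in the most significant bits, an ID from a later millisecond always compares greater than one from an earlier millisecond, whichever worker produced it. Within the same millisecond, order falls back to the worker identifier and then the sequence number.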

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Questions

How should I use this?

Take a look at bulk_generate.rs in the examples.

In longer form:

  • Add the crate to Cargo.toml
  • Tell your code to use flaker with extern crate flaker
  • Create a new instance of flaker.
  • Call get_id() to get a brand new ID.
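
If you want a feel for what sits behind a call like get_id(), the sketch below is a stand-in generator, not flaker itself. `SketchGenerator`, its constructor, the `next_id_at` helper and the `Result<u128, &'static str>` return type are all assumptions made for illustration; check bulk_generate.rs for the crate's real constructor and signatures.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in for a flake generator; not the flaker crate's type.
struct SketchGenerator {
    worker: [u8; 6],
    last_ms: u64, // starts at 0, which a real clock reading will not equal
    sequence: u16,
}

impl SketchGenerator {
    fn new(worker: [u8; 6]) -> Self {
        SketchGenerator { worker, last_ms: 0, sequence: 0 }
    }

    /// Reads the system clock (milliseconds since the Unix epoch) and
    /// returns the next ID.
    fn get_id(&mut self) -> Result<u128, &'static str> {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map_err(|_| "system clock is before the Unix epoch")?;
        self.next_id_at(now.as_millis() as u64)
    }

    /// Same rule with the clock passed in, so it can be exercised
    /// deterministically.
    fn next_id_at(&mut self, now_ms: u64) -> Result<u128, &'static str> {
        if now_ms < self.last_ms {
            // Assumption: refuse to hand out IDs rather than risk duplicates.
            return Err("clock moved backwards");
        }
        if now_ms == self.last_ms {
            // Another request in the same millisecond: bump the sequence.
            self.sequence = self
                .sequence
                .checked_add(1)
                .ok_or("sequence exhausted for this millisecond")?;
        } else {
            // The clock moved forward: remember it and reset the sequence.
            self.last_ms = now_ms;
            self.sequence = 0;
        }
        let worker = self
            .worker
            .iter()
            .fold(0u128, |acc, &b| (acc << 8) | b as u128);
        Ok(((now_ms as u128) << 64) | (worker << 16) | self.sequence as u128)
    }
}
```

Usage follows the steps above: build one generator per process with `SketchGenerator::new(worker_bytes)`, keep it around, and call `get_id()` each time you need an identifier. Returning an error when the clock steps backwards is one possible policy; waiting for the clock to catch up is another.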

Ideally, you'd use a central service to generate IDs - preferably one per server instance.

When should I use flaker?

I mean, my database can generate IDs, right?

A centralized ID generator seems good until you have a large number of actors in your system generating IDs - think hundreds of servers. A large number of actors can overwhelm the ID generation capabilities of your central store. Or you may not care about gaps in the store and only care that you have time-ordered, unique identifiers. Depending on the implementation of the underlying backing store, it also may not be possible to have the database generate sequential identifiers for you (earlier versions of Azure SQL Database had this limitation).

What should I use for the worker identifier?

I've been known to pull the MAC address of the first active ethernet adapter. It doesn't matter what you're using so long as it's guaranteed to be unique per generator. You could pull the last 6 bytes of the CPU identifier if that suited you.

While machine identity should be relatively meaningless in a distributed system, that doesn't mean we can't use an arbitrary indicator to achieve distinction between functioning nodes in a given time range. If you're afraid of MAC address spoofing, then you should be able to work something out.

6 bytes gives you a lot of room for creativity. I suggest arbitrarily incrementing a number that you store in an S3 bucket. You could regenerate your worker identifier 281,474,976,710,656 times before you run out of unique values.
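
As a rough sketch of the counter idea, the snippet below turns a stored number into a 6-byte worker identifier and checks the arithmetic behind the figure above (2^48 distinct values). `WORKER_ID_SPACE` and `worker_from_counter` are hypothetical names, and the big-endian byte order is an assumption; how you fetch and increment the counter in your store is left out.

```rust
/// Number of distinct 48-bit worker identifiers: 2^48.
const WORKER_ID_SPACE: u64 = 1 << 48;

/// Converts a stored counter into a 6-byte worker identifier, or returns
/// None once the 48-bit space is used up.
/// (Hypothetical helper, not part of the flaker crate.)
fn worker_from_counter(counter: u64) -> Option<[u8; 6]> {
    if counter >= WORKER_ID_SPACE {
        return None;
    }
    let bytes = counter.to_be_bytes(); // 8 bytes, most significant first
    let mut worker = [0u8; 6];
    // The top two bytes are always zero here, so keep only the low six.
    worker.copy_from_slice(&bytes[2..]);
    Some(worker)
}
```

The only hard requirement is that two generators running at the same time never end up with the same six bytes, so the increment in your store has to be atomic.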

But timezones!

flaker uses UTC when generating IDs. I don't trust you to set your server clocks to UTC, so I just took that leap for you.

Thanks

Thanks to:

  • serprex for the move to replace time with std::time.
