9 releases (stable)

1.6.18033988 Apr 9, 2024
1.6.1803398 Jan 14, 2024
1.6.180339 Dec 29, 2023
1.6.1803 Sep 3, 2023
0.0.1 Jul 26, 2023

#180 in Database interfaces

23 downloads per month

Custom license

40KB
595 lines

Bottom line up front

Julids are globally unique, sortable identifiers that are backwards-compatible with ULIDs. This crate provides a Rust Julid datatype, as well as a loadable extension for SQLite for creating and querying them:

$ sqlite3
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load ./libjulid
sqlite> select hex(julid_new());
018998768ACF000060B31DB175E0C5F9
sqlite> select julid_string(julid_new());
01H6C7D9CT00009TF3EXXJHX4Y
sqlite> select julid_seconds(julid_new());
1690480066.208
sqlite> select datetime(julid_timestamp(julid_new()), 'auto');
2023-07-27 17:47:50
sqlite> select julid_counter(julid_new());
0
sqlite> select julid_string();
01HM4WJ7T90001P8SN9898FBTN

Crates.io: https://crates.io/crates/julid-rs

Docs.rs: https://docs.rs/julid-rs/latest/julid/

Blog post: https://proclamations.nebcorp-hias.com/sundries/presenting-julids/

A slightly deeper look

Julids are a drop-in replacement for ULIDs: all Julids are valid ULIDs, but not all ULIDs are valid Julids.

Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed they do:

  • they are 128 bits long
  • they are lexicographically sortable
  • they encode their creation time as the number of milliseconds since the UNIX epoch in their top 48 bits
  • their string representation is a 26-character base-32 Crockford encoding of their big-endian bytes
  • IDs created within the same millisecond are still meant to sort in their order of creation

Julids and ULIDs have different ways to implement that last piece. If you look at the layout of bits in a ULID, you see:

[Figure: ULID bit structure, a 48-bit timestamp followed by 80 bits of randomness]

According to the ULID spec, for ULIDs created in the same millisecond, the least-significant bit should be incremented for each new ID. Since that portion of the ULID is random, that means you may not be able to increment it without spilling into the timestamp portion. Likewise, it's easy to guess a new possibly-valid ULID simply by incrementing an already-known one. And finally, this means that sorting will need to read all the way to the end of the ULID for IDs created in the same millisecond.
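To make the spill-over concrete, here is a minimal sketch that assumes the 48-bit timestamp / 80-bit random split from the ULID spec. The function names are made up for illustration and do not come from any ULID library. It models "increment" as a plain add-one on the whole 128-bit value, which is the naive approach, not what a careful implementation would do:

```rust
/// Number of random bits in a ULID; the remaining 48 bits are the timestamp.
const RANDOM_BITS: u32 = 80;

/// Build a ULID-shaped 128-bit value from a millisecond timestamp and 80 random bits.
/// (Hypothetical helper for this sketch.)
fn ulid_from_parts(timestamp_ms: u64, random: u128) -> u128 {
    let random_mask = (1u128 << RANDOM_BITS) - 1;
    ((timestamp_ms as u128) << RANDOM_BITS) | (random & random_mask)
}

/// Extract the timestamp from the top 48 bits.
fn timestamp_ms(id: u128) -> u64 {
    (id >> RANDOM_BITS) as u64
}

/// Naive same-millisecond increment: add one to the whole value.
fn naive_increment(id: u128) -> u128 {
    id.wrapping_add(1)
}

fn main() {
    // Worst case: every random bit is already set.
    let all_ones = (1u128 << RANDOM_BITS) - 1;
    let id = ulid_from_parts(1_690_480_066_208, all_ones);
    let next = naive_increment(id);

    // The carry lands in the timestamp, so the "next" ID claims a later millisecond.
    println!("timestamp before: {}", timestamp_ms(id));
    println!("timestamp after:  {}", timestamp_ms(next));

    // In the ordinary case the successor is trivially predictable from a known ID.
    let known = ulid_from_parts(1_690_480_066_208, 41);
    println!("guessable successor: {:032X}", naive_increment(known));
}
```

The all-ones case is vanishingly unlikely with 80 random bits, but the second half of the example is the everyday one: anyone holding one ID can produce the next.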

To address these shortcomings, Julids (Joe's ULIDs) have the following structure:

[Figure: Julid bit structure, a 48-bit timestamp, then a 16-bit monotonic counter, then 64 bits of randomness]

As with ULIDs, the 48 most-significant bits encode the time of creation. Unlike ULIDs, the next 16 most-significant bits are not random; they're a monotonic counter for IDs created within the same millisecond. Since it's only 16 bits, it will saturate after 65,536 intra-millisecond creations, after which IDs in that same millisecond will not have an intrinsic total order (the random bits will still be different, so you shouldn't have collisions). My PC, which is no slouch, can only generate about 20,000 per millisecond, so hopefully this is not an issue! Because the random bits are always fresh, it's not possible to easily guess a valid Julid if you already know one.
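The layout and counter behavior can be sketched in a few lines of standard-library Rust. This is an illustration of the scheme as described above, not the crate's actual code: Generator, next_id, and pack are hypothetical names, the clock and random bits are passed in as arguments so the sketch needs no dependencies, and treating a clock that stands still or runs backwards as "same millisecond" is my own assumption:

```rust
/// Pack the three fields into one 128-bit value:
/// 48-bit millisecond timestamp, 16-bit counter, 64 random bits.
fn pack(timestamp_ms: u64, counter: u16, random: u64) -> u128 {
    let ts = (timestamp_ms & 0xFFFF_FFFF_FFFF) as u128; // keep only 48 bits
    (ts << 80) | ((counter as u128) << 64) | random as u128
}

/// Hypothetical generator state: timestamp and counter of the last ID issued.
struct Generator {
    last_ms: u64,
    counter: u16,
}

impl Generator {
    fn new() -> Self {
        Generator { last_ms: 0, counter: 0 }
    }

    /// Issue the next ID for the given clock reading and fresh random bits.
    fn next_id(&mut self, now_ms: u64, random: u64) -> u128 {
        if now_ms > self.last_ms {
            // New millisecond: the counter starts over.
            self.last_ms = now_ms;
            self.counter = 0;
        } else {
            // Same millisecond (assumption: also if the clock went backwards).
            // The counter sticks at 65,535 instead of wrapping around.
            self.counter = self.counter.saturating_add(1);
        }
        pack(self.last_ms, self.counter, random)
    }
}

fn main() {
    let mut generator = Generator::new();
    let a = generator.next_id(1_000, u64::MAX); // largest possible random part
    let b = generator.next_id(1_000, 0);        // same ms, smallest random part
    let c = generator.next_id(1_001, 0);        // next ms, counter back to 0

    // Creation order wins over the random bits, because the counter sits above them.
    assert!(a < b && b < c);
    println!("{:032X}\n{:032X}\n{:032X}", a, b, c);
}
```

Since the counter occupies more significant bits than the random part, comparing two IDs from the same millisecond is decided by the counter alone until it saturates.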

How to use

The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust project's Cargo.toml file (say, by running cargo add julid-rs) and used as shown in the sample commandline program (see below), or as a loadable SQLite extension, which was the primary use case for me. Both are covered in the documentation, but let's go over them here, starting with the extension.

Inside SQLite as a loadable extension

The extension, when loaded into SQLite, provides the following functions:

  • julid_new(): create a new Julid and return it as a 16-byte blob
  • julid_string(): create a new Julid and return it as a 26-character base-32 Crockford-encoded string
  • julid_seconds(julid): get the number of seconds (as a 64-bit float) since the UNIX epoch that this julid was created (convenient for passing to the builtin datetime() function)
  • julid_counter(julid): show the value of this julid's monotonic counter
  • julid_sortable(julid): return the 64-bit concatenation of the timestamp and counter
  • julid_string(julid): show the base-32 Crockford encoding of this julid; the raw bytes of Julids won't be valid UTF-8, so use this or the built-in hex() function to select a human-readable representation
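For a feel of what those functions compute, here is a rough standard-library sketch that treats the 16-byte blob as a big-endian u128. The Rust function names are mine, chosen to mirror the SQL names; this is not the extension's source, and the field positions assume the 48/16/64 layout described earlier:

```rust
/// Crockford base-32 alphabet (no I, L, O, or U).
const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Milliseconds since the UNIX epoch: the top 48 bits.
fn timestamp_ms(id: u128) -> u64 {
    (id >> 80) as u64
}

/// Seconds since the UNIX epoch as a float, analogous to julid_seconds(julid).
fn seconds(id: u128) -> f64 {
    timestamp_ms(id) as f64 / 1000.0
}

/// The 16-bit counter, analogous to julid_counter(julid).
fn counter(id: u128) -> u16 {
    (id >> 64) as u16
}

/// Timestamp and counter together (the top 64 bits), analogous to julid_sortable(julid).
fn sortable(id: u128) -> u64 {
    (id >> 64) as u64
}

/// 26-character Crockford base-32 string, analogous to julid_string(julid).
fn to_string26(id: u128) -> String {
    let mut out = [0u8; 26];
    let mut rest = id;
    // Fill from the right, five bits per character; 26 * 5 = 130 bits,
    // so the leftmost character only ever carries the top 3 bits.
    for slot in out.iter_mut().rev() {
        *slot = ALPHABET[(rest & 0x1F) as usize];
        rest >>= 5;
    }
    String::from_utf8(out.to_vec()).expect("alphabet is ASCII")
}

fn main() {
    // The blob printed by `select hex(julid_new());` at the top of this README.
    let id: u128 = 0x018998768ACF000060B31DB175E0C5F9;
    println!("seconds:  {:.3}", seconds(id));
    println!("counter:  {}", counter(id));
    println!("sortable: {}", sortable(id));
    println!("string:   {}", to_string26(id));
}
```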

Building and loading

If you want to use it as a SQLite extension:

  • clone the repo
  • build it with cargo build --features plugin (this builds the SQLite extension)
  • copy the resulting libjulid.[so|dylib|whatevs] to some place where you can...
  • load it into SQLite with .load /path/to/libjulid as shown at the top
  • party

If you, like me, wish to use Julids as primary keys, just create your table like:

create table users (
  id blob not null primary key default (julid_new()),
  ...
);

and you've got a first-class ticket straight to Julid City, baby!

For a table created like:

-- table of things to watch
create table if not exists watches (
  id blob not null primary key default (julid_new()),
  kind int not null, -- enum for movie or tv show or whatev
  title text not null,
  length int,
  release_date date,
  added_by blob not null,
  last_updated date not null default CURRENT_TIMESTAMP,
  foreign key (added_by) references users (id)
);

and then some code that inserted rows into that table like

insert into watches (kind, title, length, release_date, added_by) values (?,?,?,?,?)

where the placeholders get bound in a loop with unique values and the Julid id field is generated by the extension for each row, I get over 100,000 insertions/second when using a file-backed DB in WAL mode and NORMAL durability settings.

Safety

There is one unsafe fn in this project, sqlite_julid_init(), and it is only built for the plugin feature. The reason for it is that it's interacting with foreign code (SQLite itself) via the C interface, which is inherently unsafe. If you are not building the plugin, there is no unsafe code.

Inside a Rust program

Of course, you can also use it outside of a database; the Julid type is publicly exported. There's a simple commandline program in src/bin/gen.rs, which can be run with cargo run --bin julid-gen (or you can cargo install julid-rs to get the julid-gen program installed on your computer). By default it generates and prints one Julid. If you want to see a Julid's component pieces, grab one that it printed and pass it back with the -d flag:

$ julid-gen 4
01HV2G2ATR000CJ2WESB7CVC19
01HV2G2ATR000K1AGQPKMX5H0M
01HV2G2ATR001CM27S59BHZ25G
01HV2G2ATR001WPJ8BS7PZHE6A
$ julid-gen -d 01HV2G2ATR001WPJ8BS7PZHE6A
Created at:		2024-04-09 22:36:11.992 UTC
Monotonic counter:	3
Random:			14648252224908081354
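If you're curious where those three fields come from, the decoding can be reproduced by hand. The sketch below is not the julid-gen source; decode and crockford_value are names invented here, it assumes the 48/16/64 bit layout described earlier, and it only accepts uppercase input (a full Crockford decoder is more forgiving):

```rust
/// Map one Crockford base-32 character to its 5-bit value (uppercase only in this sketch).
fn crockford_value(c: u8) -> Option<u8> {
    b"0123456789ABCDEFGHJKMNPQRSTVWXYZ"
        .iter()
        .position(|&a| a == c)
        .map(|i| i as u8)
}

/// Decode a 26-character string into (timestamp in ms, counter, random bits).
fn decode(s: &str) -> Option<(u64, u16, u64)> {
    if s.len() != 26 {
        return None;
    }
    let mut id: u128 = 0;
    for (i, c) in s.bytes().enumerate() {
        let v = crockford_value(c)? as u128;
        // 26 * 5 = 130 bits, so the first character may only use its low 3 bits.
        if i == 0 && v > 7 {
            return None;
        }
        id = (id << 5) | v;
    }
    // Top 48 bits: timestamp. Next 16: counter. Low 64: random.
    Some(((id >> 80) as u64, (id >> 64) as u16, id as u64))
}

fn main() {
    let (ms, counter, random) =
        decode("01HV2G2ATR001WPJ8BS7PZHE6A").expect("valid Julid string");
    println!("Created at (ms since epoch):\t{}", ms);
    println!("Monotonic counter:\t\t{}", counter);
    println!("Random:\t\t\t\t{}", random);
}
```

Run against the four IDs generated above, the counter comes out as 0, 1, 2, and 3 in order, and 1712702171992 milliseconds since the epoch is the 2024-04-09 22:36:11.992 UTC shown in the output.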

The help is useful:

$ julid-gen -h
Generate, print, and parse Julids

Usage: julid-gen [OPTIONS] [NUM]

Arguments:
  [NUM]  Number of Julids to generate [default: 1]

Options:
  -d, --decode <INPUT>  Print the components of the given Julid
  -a, --answer          The answer to the meaning of Julid
  -h, --help            Print help
  -V, --version         Print version

The whole program is just 34 lines, so check it out.

The default optional Cargo features include implementations of traits for getting Julids into and out of SQLite with SQLx, and for generally serializing/deserializing with Serde, via the sqlx and serde features, respectively.

Something to note: don't enable the plugin feature in your Cargo.toml if you're using this crate inside your Rust application, especially if you're also loading it as an extension in SQLite in your application. You'll get a long and confusing runtime panic due to there being multiple entrypoints defined with the same name.

Thanks

This project wouldn't have happened without a lot of inspiration (and a little shameless stealing) from the ulid-rs crate. For the loadable extension, the sqlite-loadable-rs crate made it extremely easy to write; what I thought would take a couple days instead took a couple hours. Thank you, authors of those crates! Feel free to steal code from me any time!

Dependencies

~32–45MB
~756K SLoC