#random #no-alloc #no-std #stream-cipher

no-std chacha8rand

Reproducible, robust and (last but not least) fast pseudorandomness

1 unstable release

0.1.0 Oct 14, 2024

#1420 in Algorithms

MIT/Apache

88KB
1K SLoC

ChaCha8Rand Implementation in Rust

Reproducible, robust and (last but not least) fast pseudorandomness.

This crate implements the ChaCha8Rand specification, originally designed for Go's math/rand/v2 package. The language-independent specification and test vector help with long-term reproducibility and interoperability. Building on the ChaCha8 stream cipher ensures high statistical quality and removes entire classes of "you're holding it wrong"-style problems that lead to sub-par output. It's also carefully designed and implemented (using SIMD instructions when available) to be so fast that it shouldn't ever be a bottleneck. However, it should not be used for cryptography.

See the documentation for more details.

Dual-licensed under Apache 2.0 or MIT at your option.


lib.rs:

Quick Start

In the interest of simplicity and reproducibility, there's no global or thread-local generator. You'll always have to pick a 32-byte seed yourself, create a ChaCha8Rand instance from it, and pass it around in your program. Usually, you'll generate an unpredictable seed at startup by default, but store or log it somewhere and support running the program again with the same seed. For the first half, it's usually best to provide a full 256 bits of entropy via the getrandom crate:

use chacha8rand::ChaCha8Rand;

let mut seed = [0; 32];
getrandom::getrandom(&mut seed).expect("getrandom failure is 'highly unlikely'");
let mut rng = ChaCha8Rand::new(&seed);
// Now we can make random choices
let heads_or_tails = if rng.read_u32() & 1 == 0 { "heads" } else { "tails" };
println!("The coin came up {heads_or_tails}.");

The best place and format to store the seed will vary, but 64 hex digits is a good default because it can be copied and pasted as (technically) human-readable text. However, if you want to let humans pick a seed by hand for any reason, then asking them for exactly 64 hex digits would be a bit rude. For such cases, it's more convenient to accept a UTF-8 string and feed it into a hash function with 256-bit output, such as SHA-256 or BLAKE3.
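As a rough illustration of the 64-hex-digit format, here is a minimal sketch that uses only the standard library. The helper names seed_to_hex and seed_from_hex are hypothetical and not part of chacha8rand; the crate itself only needs the resulting 32-byte array.

```rust
/// Encode a 32-byte seed as 64 lowercase hex digits.
/// (Hypothetical helper, not part of the chacha8rand API.)
fn seed_to_hex(seed: &[u8; 32]) -> String {
    seed.iter().map(|b| format!("{b:02x}")).collect()
}

/// Parse exactly 64 hex digits (either case) back into a seed.
/// Returns `None` if the length is wrong or any character is not a hex digit.
/// (Hypothetical helper, not part of the chacha8rand API.)
fn seed_from_hex(hex: &str) -> Option<[u8; 32]> {
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut seed = [0u8; 32];
    for (i, byte) in seed.iter_mut().enumerate() {
        // Every character is an ASCII hex digit, so slicing by byte offset is safe.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(seed)
}
```

A program could log seed_to_hex(&seed) at startup and, when a seed is supplied on the command line, pass the result of seed_from_hex to ChaCha8Rand::new to reproduce an earlier run.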

In any case, once you've created a ChaCha8Rand instance with an initial seed, you can consume its output as a sequence of bytes or as a stream of 32-bit or 64-bit integers. If you need support for other types, for integers in a certain interval, or other distributions, you might want to enable the rand_core_0_6 crate feature to combine ChaCha8Rand with the rand crate. Another thing you can do (even without rand) is derive seeds for multiple sub-RNGs that are used for different purposes, without creating correlation between those different streams of randomness. The ability to do this with confidence is one reason why I decided to implement ChaCha8Rand in the first place, so there's a little helper for it:

use chacha8rand::ChaCha8Rand;

let mut seed_gen = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
// Create new instances with seeds from `seed_gen`...
let mut rng1 = ChaCha8Rand::new(&seed_gen.read_seed());
let mut rng2 = ChaCha8Rand::new(&seed_gen.read_seed());
assert_ne!(rng1.read_u64(), rng2.read_u64());
// ... and/or re-seed an existing instance in-place:
rng1.set_seed(&seed_gen.read_seed());

Note that using the output of a statistical RNG to seed other instances of the same algorithm (or a related one) is often risky or outright broken. Even generators that explicitly support it, like SplitMix, often distinguish "generate a new seed" from ordinary random output. ChaCha8Rand has no such caveats: its state space is so large, and its output is of such high quality, that there's no risk of creating overlapping output sequences or correlations between generators seeded this way. Indeed, every instance regularly replaces its current seed with some of its own output. Using the rest of the output as seeds for other instances works just as well.

Don't Use This For Cryptography

ChaCha8Rand derives its high quality from ChaCha8, which is a secure stream cipher as far as anyone knows today (although in most cases you also want ciphertext authenticity, i.e., an AEAD mode). Thus, ChaCha8Rand can mostly be used as a black-box source of high quality pseudorandomness. If there were any patterns or biases in its output, or if the output sequences for different seeds (with some known relation between them) were not statistically independent, that would most likely imply a major breakthrough in the cryptanalysis of ChaCha. However, that doesn't mean this crate is a replacement for cryptographically secure randomness from the operating system or libraries that wrap it, such as getrandom.

As Russ Cox and Filippo Valsorda wrote while introducing the algorithm, regarding accidental use of Go's math/rand to generate cryptographic keys and other secrets:

Using Go 1.20, that mistake is a serious security problem that merits a detailed investigation to understand the damage. [...] Using Go 1.22, that mistake is just a mistake. It’s still better to use crypto/rand, because the operating system kernel can do a better job keeping the random values secret from various kinds of prying eyes, the kernel is continually adding new entropy to its generator, and the kernel has had more scrutiny. But accidentally using math/rand is no longer a security catastrophe.

Keep in mind that Go has a global generator which is seeded from OS-provided entropy on startup. If you pick a seed yourself (which you always do when using this crate), the output of the generator is at best as unpredictable as that seed was. There are also other design decisions in this implementation that would be inappropriate for security-sensitive applications. For example, it doesn't handle process forking or VM image cloning, it doesn't even try to scrub generated data from its internal buffer after it's consumed, and it sacrifices so-called fast key erasure in favor of needing fewer bytes to serialize the current state.

Crate Features

The crate is no_std and "no alloc" by default. There are currently two crate features you might enable when depending on chacha8rand. You can manually add them to Cargo.toml (features = [...] key) or use a command like cargo add chacha8rand -F rand_core_0_6. The features are:

  • std: opts out of #![no_std] and enables runtime detection of target_features for higher performance on some targets. It does not (currently) affect the API surface, so ideally libraries leave this decision to the top-level binary. For forward compatibility, enabling this feature always adds a dependency on std, even on targets where std isn't needed today.
  • rand_core_0_6: implements the RngCore and SeedableRng traits from rand_core v0.6, for integration with rand v0.8. The upcoming v0.9 release of the rand crates will get another feature so that ChaCha8Rand can implement both the new and the old versions of these traits at the same time.

Neither feature is enabled by default, so you don't need default-features = false / cargo add --no-default-features. In fact, please don't, because then your code might break if a later version moves existing functionality under a new on-by-default feature.
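For reference, a dependency entry that enables one of the features listed above might look like the following Cargo.toml fragment. It is only a sketch: the version requirement is an assumption based on the 0.1.0 release, and the feature list should be adjusted to your needs.

```toml
[dependencies]
# Assumed version requirement (0.1.0 is the only release so far).
# Default features are left untouched, as recommended above.
chacha8rand = { version = "0.1", features = ["rand_core_0_6"] }
```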

There are also some features with an "unstable" prefix in their name. Anything covered by these is for internal use only (e.g., the crate's benchmarks are compiled as a separate crate) and explicitly not covered by SemVer.

Minimum Supported Rust Version (MSRV)

There is no MSRV policy at the moment, so features from new stable Rust versions may be adopted as soon as they come out (but in practice I don't expect to make frequent releases). If you need to use this crate with a specific older version, you can open an issue and we can take a look at how easy or difficult it would be to support that version.

Drawbacks

The main reasons why you might not want to use this crate are the use of unsafe for accessing SIMD intrinsics and the relatively large buffer (4x larger than the Go implementation). The latter means each RNG instance is a little over a thousand bytes large, which may be an issue if you want to have many instances and care about memory consumption and/or only consume a small amount of randomness from most of those instances.
