#hash #merkle-tree #prime-field #sha-3 #crypto

no-std winter-crypto

Cryptographic library for the Winterfell STARK prover/verifier

19 releases

0.8.3 Mar 15, 2024
0.7.0 Oct 23, 2023
0.6.4 May 26, 2023
0.6.1 Mar 29, 2023
0.2.0 Aug 24, 2021

#51 in Cryptography

Download history 1560/week @ 2024-01-05 2431/week @ 2024-01-12 2279/week @ 2024-01-19 2444/week @ 2024-01-26 2142/week @ 2024-02-02 2054/week @ 2024-02-09 1793/week @ 2024-02-16 1712/week @ 2024-02-23 2497/week @ 2024-03-01 2037/week @ 2024-03-08 1998/week @ 2024-03-15 1887/week @ 2024-03-22 1891/week @ 2024-03-29 2115/week @ 2024-04-05 1965/week @ 2024-04-12 1514/week @ 2024-04-19

7,732 downloads per month
Used in 46 crates (8 directly)

MIT license

540KB
9K SLoC

Winter crypto

This crate contains modules with cryptographic operations needed in STARK proof generation and verification.

Hash

Hash module defines a set of hash functions available for cryptographic operations. Currently, the following hash functions are supported:

  • SHA3 with 256-bit output.
  • BLAKE3 with either 256-bit or 192-bit output. The smaller output version can be used to reduce STARK proof size, however, it also limits proof security level to at most 96 bits.
  • Rescue Prime over a 64-bit field with 256-bit output and over a 62-bit field with 248-bit output. Rescue is an arithmetization-friendly hash function and can be used in the STARK protocol when recursive proof composition is desired. However, using this function is not yet supported by the Winterfell STARK prover and verifier.
  • Rescue Prime over the same 64-bit field as above, with 256-bit output, but using the novel Jive compression mode to obtain a smaller state and faster 2-to-1 compression.

Rescue hash function implementation

Rescue hash function is implemented according to the Rescue Prime specifications with the following exception:

  • We set the number of rounds to 7, which implies a 40% security margin instead of the 50% margin used in the specifications (a 50% margin rounds up to 8 rounds). The primary motivation for this is that having the number of rounds be one less than a power of two simplifies AIR design for computations involving the hash function.
  • When hashing a sequence of elements, we do not append Fp(1) followed by Fp(0) elements to the end of the sequence as padding. Instead, we initialize one of the capacity elements to the number of elements to be hashed, and pad the sequence with Fp(0) elements only. This ensures that output of the hash function is the same when we hash 8 field elements to compute a 2-to-1 hash using merge() function (e.g., for building a Merkle tree) and when we hash 8 field elements as a sequence of elements using hash_elements() function. However, this also means that our instantiation of Rescue Prime cannot be used in a stream mode as the number of elements to be hashed must be known upfront.
  • For instantiation RP64_256, we also make the following modifications:
    • We use the first 4 elements of the state (rather than the last 4 elements of the state) for capacity and the remaining 8 elements for rate. The output of the hash function comes from the first four elements of the rate portion of the state (elements 4, 5, 6, and 7). This effectively applies a fixed bit permutation before and after XLIX permutation. We assert without proof that this does not affect the security of the construction.
    • Instead of using Vandermonde matrices as a standard way of generating an MDS matrix as described in Rescue Prime paper, we use a methodology developed by Polygon Zero to find an MDS matrix with coefficients which are small powers of two in frequency domain. This allows us to dramatically reduce MDS matrix multiplication time. Using a different MDS matrix does not affect security of the hash function as any MDS matrix satisfies Rescue Prime construction (as described in section 4.2 of the paper).
  • The RPJive64_256 instantiation of Rescue Prime using Jive as compression mode implements similar modifications as Rp64_256, at the exception of its padding rule which implements the Hirose padding. In addition, because of the use of Jive, the output of the hash function is not the same when we hash 8 field elements as a sequence of elements using hash_elements() function and when we compress 8 field elements into 4 (e.g., for building a Merkle tree) using the 2-to-1 Jive compression mode.

The parameters used to instantiate the functions are:

  • For RP64_256:
    • Field: 64-bit prime field with modulus 264 - 232 + 1.
    • State width: 12 field elements.
    • Capacity size: 4 field elements.
    • Digest size: 4 field elements (can be serialized into 32 bytes).
    • Number of founds: 7.
    • S-Box degree: 7.
    • Target security level: 128-bits.
  • For RPJive64_256:
    • Field: 64-bit prime field with modulus 264 - 232 + 1.
    • State width: 8 field elements.
    • Capacity size: 4 field elements.
    • Digest size: 4 field elements (can be serialized into 32 bytes).
    • Number of founds: 7.
    • S-Box degree: 7.
    • Target security level: 128-bits.
  • For RP62_248:
    • Field: 62-bit prime field with modulus 262 - 111 * 239 + 1.
    • State width: 12 field elements.
    • Capacity size: 4 field elements.
    • Digest size: 4 field elements (can be serialized into 31 bytes).
    • Number of founds: 7.
    • S-Box degree: 3.
    • Target security level: 124-bits.

Hash function performance

One of the core operations performed during STARK proof generation is construction of Merkle trees. We care greatly about building these trees as quickly as possible, and thus, for the purposes of STARK protocol, 2-to-1 hash operation (e.g., computing a hash of two 32-byte values) is especially important. The table below contains rough benchmarks for computing a 2-to-1 hash for all currently implemented hash functions.

CPU BLAKE3_256 SHA3_256 RP64_256 RPJ64_256 RP62_248
Apple M1 Pro 76 ns 227 ns 5.1 us 3.8 us 7.1 us
AMD Ryzen 9 5950X @ 3.4 GHz 62 ns 310 ns 5.2 us 3.9 us 6.9 us
Core i9-9980KH @ 2.4 GHz 66 ns 400 ns - - 6.6 us
Core i5-7300U @ 2.6 GHz 81 ns 540 ns - - 9.5 us
Core i5-4300U @ 1.9 GHz 106 ns 675 ns - - 13.9 us

As can be seen from the table, BLAKE3 is by far the fastest hash function, while our implementations of algebraic hashes are 70x slower than BLAKE3 and 20x slower than SHA3.

Merkle

Merkle module contains an implementation of a Merkle tree which supports batch proof generation and verification. Batch proofs are based on the Octopus algorithm described here.

Crate features

This crate can be compiled with the following features:

  • std - enabled by default and relies on the Rust standard library.
  • concurrent - implies std and also enables multi-threaded execution for some of the crate functions.
  • no_std does not rely on the Rust standard library and enables compilation to WebAssembly.

To compile with no_std, disable default features via --no-default-features flag.

Concurrent execution

When compiled with concurrent feature enabled, the following operations will be executed in multiple threads:

  • MerkleTree::new() - i.e., a Merkle tree will be constructed in multiple threads.

The number of threads can be configured via RAYON_NUM_THREADS environment variable, and usually defaults to the number of logical cores on the machine.

License

This project is MIT licensed.

Dependencies

~3MB
~52K SLoC