2 releases
0.1.2 | Jan 21, 2020 |
---|---|
0.1.1 | Jul 28, 2018 |
#1611 in Data structures
172 downloads per month
15KB
239 lines
Hash Cons'ing for Rust
Sometimes, an Rc<T>
is insufficient for efficient, compact immmutable structures.
By contrast:
-
A
Merkle<T>
gives a compact serialization in the presence of sharing. -
A
Hc<T>
gives a unique representation in the presence of sharing.
Status
- The type
Merkle<_>
is implemented and tested. - The type
Hc<_>
is a minor variation; it remains as future work.
Background
Sometimes, we want a shared instance of some type T
that serializes
once, not once per reference, as is the case with the Rc
type.
Unlike a "bare" Rc<T>
, a Merkle<T>
enjoys the practical property
that, when a structure holding multiple (shared) instances of Merkle<T>
is
serialized, this serialized output holds only one occurrence of
each T
's serialized representation; the other occurrences merely
consist of the T
's unique identifier (the serialization of an
Id
, single machine word on modern machines).
Implementation summary
A Merkle<T>
has a unique ID (computed as a hash) that permits table-based indirection,
via temporary storage used by serialization and serialization logic.
By contrast, a bare Rc<T>
lacks this indirection, and thus, it lacks
a compact serialized representation for structures with abundant
sharing. Generally, abundant sharing via many shared Rc<_>
s leads to
exponential blow up in terms of serialized space and time.
Dependencies
~0.7–1.6MB
~35K SLoC