0.2.7	~~Aug 16, 2022~~
0.2.6	~~Aug 13, 2022~~
0.2.2	~~Jul 22, 2022~~
0.1.7	~~Jul 10, 2022~~
0.1.0	~~May 28, 2022~~

#26 in #accumulator

Used in 37 crates (3 directly)

Apache-2.0

1MB
18K SLoC

This module provides algorithms for accessing and updating a Merkle Accumulator structure persisted in a key-value store. Note that this doesn't write to the storage directly, rather, it reads from it via the HashReader trait and yields writes via an in memory HashMap.

Merkle Accumulator

Given an ever growing (append only) series of "leaf" hashes, we construct an evolving Merkle Tree for which proofs of inclusion/exclusion of a leaf hash at a leaf index in a snapshot of the tree (represented by root hash) can be given.

Leaf Nodes

Leaf nodes carry hash values to be stored and proved. They are only appended to the tree but never deleted or updated.

Internal Nodes

A non-leaf node carries the hash value derived from both its left and right children.

Placeholder Nodes

To make sure each Leaf node has a Merkle Proof towards the root, placeholder nodes are added so that along the route from a leaf to the root, each node has a sibling. Placeholder nodes have the hash value ACCUMULATOR_PLACEHOLDER_HASH

A placeholder node can appear as either a Leaf node or a non-Leaf node, but there is at most one placeholder leaf at any time.

Frozen Nodes & Non-frozen Nodes

As leaves are added to the tree, placeholder nodes get replaced by non-placeholder nodes, and when a node has all its descendants being non-placeholder, it becomes "Frozen" -- its hash value won't change again in the event of new leaves being added. All leaves appended (not counting the one possible placeholder leaf) are by definition Frozen.

Other nodes, which have one or more placeholder descendants are Non-Frozen. As new elements are appended to the accumulator the hash value of these nodes will change.

Leaf Count

Given a count of the number of leaves in a Merkle Accumulator it is possible to determine the shape of the accumulator -- which nodes are filled and which nodes are placeholder nodes.

Example: Logical view of a Merkle Accumulator with 5 leaves:

           Non-fzn
          /       \
         /         \
        /           \
      Fzn2         Non-fzn
     /   \           /   \
    /     \         /     \
   Fzn1    Fzn3  Non-fzn  [Placeholder]
  /  \    /  \    /    \
 L0  L1  L2  L3 L4   [Placeholder]

Position and Physical Representation

As a Merkle Accumulator tree expands to the right and upwards, we number newly frozen nodes monotonically. One way to do it is simply to use in-order index of nodes, and this is what we do for the in-memory representation. We call the stated numbers identifying nodes below simply "Position", and unless otherwise stated, this is the in-order position.

For writing to disk however, we write all the children of a node before the parent. Thus for disk write order, it is more convenient to use the post-order position as an index. And with that we can map a Merkle Accumulator into a key-value storage: key is the post-order position of a node, and the value is hash value it carries.

We store only Frozen nodes, and generate non-Frozen nodes on the fly when accessing the tree. This way, the physical representation of the tree is append-only, i.e. once written to physical storage, nodes won't be either modified or deleted.

Here is what we persist for the logical tree in the above example:

         Fzn2(6)
        /      \
       /        \
   Fzn1(2)       Fzn3(5)
  /     \       /     \
 L0(0)  L1(1)  L2(3)  L3(4)  L4(7)

When the next leaf node is persisted, the physical representation will be:

         Fzn2(6)
        /      \
       /        \
   Fzn1(2)       Fzn3(5)       Fzn4(9)
  /     \       /     \       /      \
 L0(0)  L1(1)  L2(3)  L3(4)  L4(7)   L5(8)

The numbering corresponds to the post-order traversal of the tree.

To think in key-value pairs:

|<-key->|<--value-->|
|   0   | hash_L0   |
|   1   | hash_L1   |
|   2   | hash_Fzn1 |
|  ...  |   ...     |

Dependencies

~50–68MB
~1.5M SLoC