#tree-node #node-tree #memory-layout #optimization #pointers #data

nightly bonzai

An abstraction for optimizing the memory layout and pointer aliasing of trees

4 releases

Uses old Rust 2015

0.2.1 Nov 11, 2018
0.2.0 Nov 11, 2018
0.1.1 Nov 9, 2018
0.1.0 Nov 9, 2018

#1594 in Data structures


Used in 2 crates (via manhattan-tree)

MIT license

80KB
1.5K SLoC

Bonzai

Part of Rust's impressive performance comes from Rust storing data in-place by default, which improves spatial locality, causing the CPU-cache to perform better. Trees, however, often miss out on this benefit. Trees are recursive structures, so they need to use a form of indirection. People tend to default to the Box, since it has the same ownership semantics as normal, owned data. However, it has the potential to splay data all over the heap, killing spatial locality.

One technique for resolving this issue is storing a tree's nodes in a contiguous Vec, and connecting them with indices. The most direct approach is to implement indexing logic in a way that is highly coupled with the logic of the tree itself. Unfortunately, this approach bears a striking resemblance to dealing with raw pointers, both in the bugs it can allow and the technical complexity is creates.

Bonzai is a safe, zero-cost abstraction over trees which store their nodes in a contiguous Vec. Bonzai uses unsafe code to create a safe abstraction which leverages the borrow checker to verify the validity of operations on trees, while using very few heap allocations.

Features:

  • Entirely safe interface
  • Minimal runtime cost
  • Improved pointer aliasing
  • Multiple simultaneous mutable references to different parts of tree
  • Traversing from nodes to their parents
  • Detach a subtree, reattach it somewhere else
  • Compaction of nodes and re-shrinking of memory footprint
  • Tree is Send and Sync if element is
  • Compile-time generic over branch factor
  • Pretty-printing trees through Debug trait

Unsupported at this time:

  • Multithreaded mutation
  • Use on stable release channel
  • Unbounded/dynamic branch factor
  • Two-way traversal of detached subtree

Example, performance test

The bonzai-nbst repository is a rust executable which contains implementations of a naive binary search tree (no balancing), using bonzai in one, and heap allocations in the other. The bonzai-based implementation.

I tested the two implementations with 10,000,000 random tree operations, on a windows 10 laptop. To test it in a real-world scenario with chaotic heap use, I plugged it into a fork of the magog roguelike (which I am not associated with), such that they run in the same process. While this test is very unscientific, bonzai demonstrated a clear performance improvement:

  • bonzai: 15,786 ms
  • boxes: 21,338 ms

Tree<T, C>

The Tree, and nearly all components borrowed from the tree, is generic over two types: the element type, and a sized array of ChildId. The element type, T, is stored for every node in the tree. Each node may have any combination of children, with any child index that is a valid index of the ChildId array.

For example, a binary search tree set of i32 could be represented as a Tree<i32, [ChildId; 2]>.

TreeOperation

A TreeOperation is a type which borrows mutably from the Tree, and allows for modification to that tree. While a TreeOperation exists, the access to the tree can only be single-threaded. This allows many operations to mutate the tree with only a immutable reference to the TreeOperation, directly or indirectly.

NodeOwnedGuard

Normally, a node in a tree is owned by its parent. The root node has a special status, where its parent is considered to be root. However, during the lifetime of a TreeOperation, a subtree can be detached from its parent, and it can be considered to be owned by a NodeOwnedGuard. In this case, the NodeOwnedGuard can outlive the guard for its parent. This subtree can be reattached to the main tree as the child of another node. During the lifetime of the NodeOwnedGuard, it is possible to mutate it, as if it were part of the main tree. Alternatively, a NodeOwnedGuard can be transformed into its inner element, marking the node and all its children as garbage.

When a NodeOwnedGuard is dropped, and its subtree is marked as garbage, the elements' destructors will run sometime between the dropping of the NodeOwnedGuard and the dropping of the TreeOperation.

NodeWriteGuard

A NodeWriteGuard is a type which holds mutable access to a subset of the tree (a node and all its children, recursively). The NodeWriteGuard can be simultaneously borrowed into a mutable reference to the element (&mut T) and a ChildWriteGuard. The ChildWriteGuard has the ability to alter the node's children. Additionally, a NodeWriteGuard can be turned into a NodeOwnedGuard, detaching the guarded subtree.

A NodeWriteGuard cannot outlive the parent node guard (if the node is root, it cannot outlive the tree).

ChildWriteGuard

A ChildWriteGuard represents mutable access to a node's children. It can be borrowed from both a NodeWriteGuard and a NodeOwnedGuard. The ChildWriteGuard can borrow out a NodeWriteGuard to its child, or even several different children simultaneously. Additionally, a ChildWriteGuard can put an element as a particular child (marking any previous child subtree as garbage), or even attach an entire detached subtree (a NodeOwnedGuard) as one of its children.

TreeWriteTraverser

While a NodeWriteGuard holds mutable access to a subtree (a node and its children, recursively), a TreeWriteTraverser holds mutable access to an entire tree. This allows a TreeWriteTraverser to seek the parent node, which a NodeWriteGuard cannot do. However, this also means that a TreeWriteTraverser cannot exist at the same time as any other guards.

Beyond that difference, a TreeWriteTraverser can be used similarly to a NodeWriteGuard. A TreeWriteTraverser can even be turned or mutably borrowed into a NodeWriteGuard.

A TreeOperation and some type of node guard can be conveniently turned into a TreeWriteTraverser with the traverse_from! macro.

NodeReadGuard

A NodeReadGuard is a type which holds immutable access to a subset of the tree. Because a NodeReadGuard doesn't mutably borrow anything, it does not need to split into a ChildWriteGuard and &mut T. Instead, the NodeReadGuard immutably dereferences to a T, and its children can be accessed directly with a method.

TreeReadTraverser

The TreeReadTraverser is to the NodeReadGuard as the TreeWriteTraverser is to the NodeWriteGuard. While the NodeReadGuard holds immutable access to a subset of the tree, a TreeReadTraverser holds immutable access to the entire tree. This allows the TreeReadTraverser to safely traverse to its parent node.

No runtime deps