### 2 releases

0.1.1 | Nov 16, 2018 |
---|---|

0.1.0 | Nov 13, 2018 |

#**230** in Data structures

Used in **1** crate

**MIT**license

230KB

1K
SLoC

# Manhattan Tree

The manhattan tree is a spatial tree, based on the octree, that allows logarithmic time
insertion, removal, and **querying for the element with the lowest manhattan distance
to an arbitrary point in space**.

### Octree

The design of this tree is based on the octree.

The total domain of the coordinate space is represented as a cube. This cube can be recursively sub-divided into 8 octants.

We introduce two types for tracking these octants:

- The

, containing unsigned integer coordinates,`BaseCoord`

.`[``u64``;``3``]` - The

, containing unsigned integer coordinates,`OctCoord`

, as well as scale factor,`[``u64``;``3``]`

.`u8`

With these coordinates, we can define the two variants for an

(a node of the tree):`Octant`

, which contains a`Leaf`

and some element that the tree is storing`BaseCoord`

, which contains an`Branch`

and 8 optional child`OctCoord`

s`Octant`

In the case of the branch, in each dimension, the branch's domain covers the coordinate range
from

(inclusive) to `coordinates [n] * (2 pow scale)`

`(`coordinates`[`n`]` `+` `1``)` `*` `(``2` pow scale`)`

(exclusive).From here, we can lay out a couple important invariants:

- Only leaves have elements
- Only branches have children
- An octant's domain is always a subdomain of its parent
- A branch always has at least two children
- Each of a branch's children are in a different suboctant of the branch

### Tree shortening

This repository implements the manhattan tree in 3 dimensions, but it actually can work in any dimension. Let's take the example of a 1-dimensional manhattan tree.

A balanced 1-dimensional manhattan tree with 32 elements would look like this:

`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``||``-``-``-``-``|`
`|``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``||``-``|`
`0` `1` `2` `3` `4` `5` `6` `7` `8` `9` `10` `11` `12` `13` `14` `15` `16` `17` `18` `19` `20` `21` `22` `23` `24` `25` `26` `27` `28` `29` `30` `31`

A 1-dimensional manhattan tree has a branch factor of

, and `2` pow `1` `==` `2`

,
and this tree has 5 branching layers (not including the root layer, or alternatively,
not including the leaf layer), so it could be considered maximally efficient.`log base 2 (32) == 5`

However, let's take a case where there are less elements, and we handle this *incorrectly*:

`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|` `|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``|` `|``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``|` `|``-``-``-``-``||``-``-``-``-``|`
`|``-``||``-``|` `|``-``|` `|``-``|`
`6` `7` `22` `24`

We have reduced the number of elements, but our tree is still just as tall. And remember, the time complexity of a tree is generally dependent on its height. We want to shorten the tree while maintaining invariants 1 and 3:

Only leaves have elements

An octant's domain is always a subdomain of its parent

Furthermore, this tree structure severely breaks invariant 4:

A branch always has at least two children

Fortunately, invariant 3 does not mean that an octant's domain must be a direct subdomain of its parent; it can be further down the chain. As such, we can shorten the tree by simply cutting out the branches which only have one child, thus actualizing invariant 4, as such:

`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``||``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``-``|`
`|``-``||``-``|` `|``-``|` `|``-``|`
`6` `7` `22` `24`

Now, the tree only has a height of three, and each branch has at least two children. Remember that, that although it's now harder to see, each leaf is still in a different suboctant of its parent branch.

### Bounds tracking

We include an additional piece of information in each octant, which was not previously mentioned. For each branch, and for each of the three dimensions, we encode the minimum and maximum component in that dimension stored in the entire subtree. This data must be properly maintained upon insertions or removal into the manhattan tree.

### Manhattan distance

Given these invariants, we can efficiently perform the manhattan tree's namesake operation:
**find the element with closest manhattan distance to an arbitrary focus**.

The focus is some arbitrary coordinate, just like any other key in the tree.

As such, the function for closest element by manhattan distance to a focus, in Rust-like pseudo code, is as follows:

`fn` `closest``(``node``:` Octant, `focus``:` BaseCoord, `competitor``:` `Option``<`BaseCoord`>``)`` ``->` `Option``<`Leaf`>``:`
`if` node is a leaf`:`
`return` the leaf`,` unless its mahattan dist `(`henceforth mdist`)` to focus is greater
`...`than the mdist between focus and competitor`,` `if` competitor exists
`if` node is a branch`:`
`//` here we take advantage of bounds to short circuit, which is the main performance boost
`if` competitor exists`:`
`for` each of the three dimensions`:`
`let` a `=` competitor`.``mdist``(`focus`)`
`let` b `=` the distance `in` that one dimension between focus`,` and node`'s`
`...`closest element to focus`,` according to the bounds tracking
`if` a `<` b`:`
`return` `None`
`let` best `=` `None`
`for` each child`,` ordered nearest to furthest from focus`:`
`let` child_competitor `=` `{`
`if` best exists`:` `Some``(`best`'s` coord`)`
`else` `if` competitor exists`:` `Some``(`competitor`)`
`else``:` `None`
`}`
`if` `let` `Some``(`better`)` `=` `closest``(`child`,` focus`,` child_competitor`)``:`
best `=` `Some``(`better`)`
`return` best

### Bit-level hacking

Many of these operations, which are brushed over by abstract explanations and pseudocode, require operations on coordinates, such as:

- Which suboctant of an

contains a certain`OctCoord``BaseCoord` - Which suboctant of an

is closest to a certain`OctCoord``BaseCoord` - Convert a

into a`OctCoord``BaseCoord` - What is the lowest common

for which two`OctCoord`

s are suboctants`BaseCoord`

Fortunately, because coordinates under the hood are stored as unsigned binary integers,
and because recursive binary subdivision is performed on each dimension, the binary
properties of the unsigned integers can be exploited to implement all of these operations
using bitwise manipulation. This is best expressed in actual code, which can be found in
the

module of this crate.`tree /coord`

### Performance

I ran a performance test on the tree, testing insertion, removal, and closest element querying,
with different sizes of trees. To minimize external variables, these tests were run on a

server, rented from Amazon EC2, running Ubuntu server 18.04.`t2 .medium`

These results clearly show a logarithmic time complexity. Running an additional test, going up to a tree size of 100,000, I manually fitted a natural logarithm function to the data.

The curve appears to closely follow the relation:

`time = 2220ns * ln(size) + 5200ns`

#### Dependencies

~450KB