#knn #ml #neighbor #prediction #ball-tree

ball-tree

Ball-tree implementation for K-nearest neighbors

3 unstable releases

✓ Uses Rust 2018 edition

0.2.0 Jun 18, 2019
0.1.1 Nov 12, 2018
0.1.0 Nov 11, 2018

#334 in Data structures

171 downloads per month

MIT license

15KB
234 lines

A BallTree is a space-partitioning data structure that typically finds nearest neighbors in logarithmic time, rather than the linear time of a brute-force scan.

It does this by partitioning the data into a series of nested bounding spheres ("balls" in the literature). Spheres are used because it is trivial to compute the distance between a point and a sphere (the distance to the sphere's center minus the radius). The key observation is that if a candidate neighbor is closer to the query point than a bounding sphere is, it is necessarily closer than every point inside that sphere.
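The point-to-sphere distance can be sketched in a few lines. This is only an illustration in two dimensions: the `Ball` type and the function names are hypothetical and are not taken from the crate, and clamping the result at zero for points inside the sphere is an added assumption.

```rust
/// A bounding sphere ("ball") in 2-D. Hypothetical type, for illustration only.
struct Ball {
    center: [f64; 2],
    radius: f64,
}

/// Euclidean distance between two points.
fn distance(a: [f64; 2], b: [f64; 2]) -> f64 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)).sqrt()
}

/// Lower bound on the distance from `p` to any point inside `ball`:
/// the distance to the center minus the radius. The clamp at zero
/// (an assumption of this sketch) covers points that lie inside the ball.
fn distance_to_ball(p: [f64; 2], ball: &Ball) -> f64 {
    (distance(p, ball.center) - ball.radius).max(0.0)
}
```

For a ball centered at (10, 0) with radius 4, the point (0, 0) is 10 - 4 = 6 away from the ball, so no point inside the ball can be closer than 6.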

Graphically:


   A -  
   |  ----         distance(A, B) = 4
   |      - B      distance(A, S) = 6
    |       
     |
     |    S
       --------
     /        G \ 
    /   C        \
   |           D |
   |       F     |
    \ E         /
     \_________/

In the diagram, A is closer to B than to S. Because S bounds C, D, E, F, and G, A is necessarily closer to B than to any of those points, and this can be determined without computing the exact distances to them.
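The pruning step from the diagram might look roughly like the following. It is a simplified sketch, not the crate's code: it assumes a flat list of leaf balls instead of a nested tree, and the `Ball` layout and `nearest` function are hypothetical.

```rust
/// A leaf ball and the points it bounds. Hypothetical flat layout;
/// a real ball tree nests balls inside balls.
struct Ball {
    center: [f64; 2],
    radius: f64,
    points: Vec<[f64; 2]>,
}

/// Euclidean distance between two points.
fn distance(a: [f64; 2], b: [f64; 2]) -> f64 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)).sqrt()
}

/// Returns the nearest point, its distance, and the number of balls
/// that were skipped without computing any exact distances.
fn nearest(query: [f64; 2], balls: &[Ball]) -> Option<([f64; 2], f64, usize)> {
    let mut best: Option<([f64; 2], f64)> = None;
    let mut pruned = 0;
    for ball in balls {
        // No point inside the ball can be closer than this bound.
        let lower_bound = (distance(query, ball.center) - ball.radius).max(0.0);
        if let Some((_, best_dist)) = best {
            if lower_bound >= best_dist {
                pruned += 1;
                continue;
            }
        }
        // The ball might contain a closer point, so check each one.
        for &p in &ball.points {
            let d = distance(query, p);
            if best.map_or(true, |(_, best_dist)| d < best_dist) {
                best = Some((p, d));
            }
        }
    }
    best.map(|(p, d)| (p, d, pruned))
}
```

With A at the origin, B at distance 4, and S at distance 6, the ball S is skipped as soon as B has been seen, so none of its five points is examined.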

Ball trees are most commonly used as a form of predictive model in which the points are features and each point is associated with a value or label. This implementation therefore allows the user to associate a value with each point. If this functionality is not needed, () can be used as the value.
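To illustrate the idea of a generic value per point, here is a small brute-force stand-in. `Labeled` and its methods are hypothetical names chosen for this sketch and do not reflect the crate's actual API; the sketch only shows how a generic value type lets callers pass either labels or ().

```rust
/// Hypothetical brute-force neighbor index, generic over the value type `V`.
/// This is not the crate's API.
struct Labeled<V> {
    points: Vec<[f64; 2]>,
    values: Vec<V>,
}

/// Euclidean distance between two points.
fn distance(a: [f64; 2], b: [f64; 2]) -> f64 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)).sqrt()
}

impl<V> Labeled<V> {
    /// Pairs each point with the value at the same index.
    fn new(points: Vec<[f64; 2]>, values: Vec<V>) -> Self {
        assert_eq!(points.len(), values.len(), "one value per point");
        Labeled { points, values }
    }

    /// Nearest point to `query`, with its distance and associated value.
    fn nearest(&self, query: [f64; 2]) -> Option<(&[f64; 2], f64, &V)> {
        self.points
            .iter()
            .zip(&self.values)
            .map(|(p, v)| (p, distance(*p, query), v))
            .min_by(|a, b| a.1.total_cmp(&b.1))
    }
}
```

Calling `Labeled::new(points, vec!["red", "blue"])` attaches a label to each point, while `Labeled::new(points, vec![(); n])` stores only the points, since () occupies no memory.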

This implementation returns the nearest neighbors, their distances, and their associated values. Returning the distances allows the user to perform a weighted interpolation of the neighbors' values for predictive purposes.
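One common choice for such an interpolation is inverse-distance weighting. The sketch below assumes numeric values and (distance, value) pairs as input; the function name `inverse_distance_weighted` is hypothetical, and the crate itself does not prescribe any particular weighting scheme.

```rust
/// Inverse-distance-weighted average of neighbor values.
/// `neighbors` holds (distance, value) pairs; returns `None` if it is empty.
fn inverse_distance_weighted(neighbors: &[(f64, f64)]) -> Option<f64> {
    if neighbors.is_empty() {
        return None;
    }
    // An exact match would divide by zero, so return its value directly.
    if let Some(&(_, value)) = neighbors.iter().find(|(d, _)| *d == 0.0) {
        return Some(value);
    }
    // Each neighbor is weighted by 1 / distance.
    let weight_sum: f64 = neighbors.iter().map(|(d, _)| 1.0 / d).sum();
    let weighted_sum: f64 = neighbors.iter().map(|(d, v)| v / d).sum();
    Some(weighted_sum / weight_sum)
}
```

For two neighbors at distances 1 and 2 with values 10 and 40, the weights are 1 and 0.5, so the prediction is (10 + 20) / 1.5 = 20, which lies closer to the nearer neighbor's value.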

No runtime deps