#similarity #search #knn #distance

vpsearch

Vantage Point Tree search algorithm for fast nearest neighbour search in multi-dimensional metric spaces

12 stable releases

2.0.1 Apr 24, 2020
1.3.6 Apr 7, 2020
1.3.5 May 28, 2018
1.3.4 Aug 19, 2017
0.5.0 Jun 14, 2015

#106 in Algorithms

Download history 64/week @ 2021-04-04 73/week @ 2021-04-11 113/week @ 2021-04-18 35/week @ 2021-04-25 105/week @ 2021-05-02 109/week @ 2021-05-09 108/week @ 2021-05-16 182/week @ 2021-05-23 127/week @ 2021-05-30 165/week @ 2021-06-06 102/week @ 2021-06-13 200/week @ 2021-06-20 223/week @ 2021-06-27 118/week @ 2021-07-04 287/week @ 2021-07-11 197/week @ 2021-07-18

284 downloads per month
Used in 2 crates (via undither)

BSD-2-Clause

21KB
267 lines

VP-tree nearest neighbor search

A relatively simple and readable Rust implementation of Vantage Point tree search algorithm.

The VP tree algorithm doesn't need to know coordinates of items, only distances between them. It can efficiently search multi-dimensional spaces and abstract things as long as you can define similarity between them (e.g. points, colors, and even images).

Please see the API reference or examples for details.

This algorithm does not work with squared distances. When implementing Euclidean distance, you MUST use sqrt(). Vantage Point trees require metric spaces.

#[derive(Copy, Clone)]
struct Point {
    x: f32, y: f32,
}

/// `MetricSpace` makes items comparable. It's a bit like Rust's `PartialOrd`.
impl vpsearch::MetricSpace for Point {
    type UserData = ();
    type Distance = f32;

    fn distance(&self, other: &Self, _: &Self::UserData) -> Self::Distance {
        let dx = self.x - other.x;
        let dy = self.y - other.y;

        // You MUST use sqrt here! The algorithm will give wrong results for squared distances.
        (dx*dx + dy*dy).sqrt()
    }
}

fn main() {
    let points = vec![Point{x:2.0,y:3.0}, Point{x:0.0,y:1.0}, Point{x:4.0,y:5.0}];

    let vp = vpsearch::Tree::new(&points);
    let (index, _) = vp.find_nearest(&Point{x:1.0,y:2.0});

    println!("The nearest point is at ({}, {})", points[index].x, points[index].y);
}

Implementing MetricSpace for Rust built-in types

This library includes a workaround for orphan rules. You need to add your crate's type when implementing MetricSpace:

struct MyImpl; // it can be any type, as long as it's yours
impl vpsearch::MetricSpace<MyImpl> for Vec<u8> {
    // continue as usual
}

Memory efficiency tip

Tree clones all the items and puts them in the tree. If the items are too big to clone and you'd rather keep the items elsewhere, you can!

Instead of storing the items, make the tree store indices into your items storage, and pass actual items as user_data to your MetricSpace::distance() function.

let items = /* your actual items are here */;
let indices: Vec<_> = (0...items.len() as u32).collect();
let tree = Tree::new_with_user_data_ref(&items, &indices);
let res = tree.find_nearest(&needle, &items);

Dependencies

~140KB