#ann #knn #nearest-neighbors

acap

As Close As Possible — nearest neighbor search in Rust

2 releases

✓ Uses Rust 2018 edition

0.1.1 Jun 30, 2020
0.1.0 Jun 24, 2020

#268 in Algorithms

MIT license

96KB
2.5K SLoC

acap

crates.io Documentation License Build Status

As Close As Possible — nearest neighbor search in Rust.

Example

use acap::euclid::Euclidean;
use acap::vp::VpTree;
use acap::NearestNeighbors;

let tree = VpTree::balanced(vec![
    Euclidean([3, 4]),
    Euclidean([5, 12]),
    Euclidean([8, 15]),
    Euclidean([7, 24]),
]);

let nearest = tree.nearest(&[7, 7]).unwrap();
assert_eq!(nearest.item, &Euclidean([3, 4]));
assert_eq!(nearest.distance, 5);

lib.rs:

As Close As Possible — nearest neighbor search in Rust.

Overview

The notion of distances between points is captured by the [Proximity] trait. Its distance() method returns a [Distance], from which the actual numerical distance may be retrieved with value(). These layers of abstraction allow acap to work with generically with different distance functions over different types.

There are no restrictions on the distances computed by a [Proximity]. For example, they don't have to be symmetric, subadditive, or even positive. Implementations that do have these desirable properties will additionally implement the [Metric] marker trait. This distinction allows acap to support a wide variety of useful metric and non-metric distances.

As a concrete example, consider Euclidean<[i32; 2]>. The [Euclidean] wrapper equips any type that has coordinates with the Euclidean distance function as its [Proximity] implementation:

 use acap::distance::Proximity;
 use acap::euclid::Euclidean;

 let a = Euclidean([3, 4]);
 let b = Euclidean([7, 7]);
 assert_eq!(a.distance(&b), 5);

In this case, distance() doesn't return a number directly; as an optimization, it returns a [EuclideanDistance] wrapper. This wrapper stores the squared value of the distance, to avoid computing square roots until absolutely necessary. Still, it transparently supports comparisons with numerical values:

 # use acap::distance::Proximity;
 # use acap::euclid::Euclidean;
 # let a = Euclidean([3, 4]);
 # let b = Euclidean([7, 7]);
 use acap::distance::Distance;

 let d = a.distance(&b);
 assert!(d > 4 && d < 6);
 assert_eq!(d, 5);
 assert_eq!(d.value(), 5.0f32);

For finding the nearest neighbors to a point from a set of other points, the [NearestNeighbors] trait provides a uniform interface to many different similarity search data structures. One such structure is the vantage-point tree, available in acap as VpTree:

 # use acap::euclid::Euclidean;
 use acap::vp::VpTree;
 use acap::NearestNeighbors;

 let tree = VpTree::balanced(vec![
     Euclidean([3, 4]),
     Euclidean([5, 12]),
     Euclidean([8, 15]),
     Euclidean([7, 24]),
 ]);

VpTree implements [NearestNeighbors], which has a nearest() method that returns an optional [Neighbor]. The [Neighbor] struct holds the actual neighbor it found, and the distance it was from the target:

 # use acap::euclid::Euclidean;
 # use acap::vp::VpTree;
 # use acap::NearestNeighbors;
 # let tree = VpTree::balanced(
 #     vec![Euclidean([3, 4]), Euclidean([5, 12]), Euclidean([8, 15]), Euclidean([7, 24])]
 # );
 let nearest = tree.nearest(&[7, 7]).unwrap();
 assert_eq!(nearest.item, &Euclidean([3, 4]));
 assert_eq!(nearest.distance, 5);

[NearestNeighbors] also provides the nearest_within(), k_nearest(), and k_nearest_within() methods which find up to k neighbors within a possible threshold.

It can be expensive to compute nearest neighbors exactly, especially in high dimensions. For performance reasons, [NearestNeighbors] implementations are allowed to return approximate results. Many implementations have a speed/accuracy tradeoff which can be tuned. Those implementations which always return exact results will also implement the [ExactNeighbors] marker trait. For example, a VpTree will be exact when the [Proximity] function is a [Metric].

Dependencies

~140KB