### 7 unstable releases (3 breaking)

0.4.0 | May 9, 2020 |
---|---|

0.3.0 | May 2, 2020 |

0.2.3 | Apr 28, 2020 |

0.1.0 | Mar 13, 2020 |

#**1727** in Algorithms

**MIT**license

95KB

2K
SLoC

# lsh-rs (Locality Sensitive Hashing)

Locality sensitive hashing can help retrieving Approximate Nearest Neighbors in sub-linear time.

For more information on the subject see:

- Introduction on LSH
- Section 2. describes the hash families used in this crate
- LSH and neural networks

## Implementations

**Base LSH**- Signed Random Projections
*(Cosine similarity)* - L2 distance
- MIPS
*(Dot products/ Maximum Inner Product Search)* - MinHash
*(Jaccard Similarity)*

- Signed Random Projections
**Multi Probe LSH****Step wise probing**- SRP (only bit shifts)

**Query directed probing**- L2
- MIPS

- Generic numeric types

## Getting started

`use` `lsh_rs``::`LshMem`;`
`//` 2 rows w/ dimension 3.
`let` p `=` `&``[``vec!``[``1.``,` `1.``5``,` `2.``]``,`
`vec!``[``2.``,` `1.``1``,` `-``0.``3``]``]``;`
`//` Do one time expensive preprocessing.
`let` n_projections `=` `9``;`
`let` n_hash_tables `=` `30``;`
`let` dim `=` `10``;`
`let` dim `=` `3``;`
`let` `mut` lsh `=` `LshMem``::`new`(`n_projections`,` n_hash_tables`,` dim`)`\
`.``srp``(``)`
`.``unwrap``(``)``;`
lsh`.``store_vecs``(`p`)``;`
`//` Query in sublinear time.
`let` query `=` `&``[``1.``1``,` `1.``2``,` `1.``2``]``;`
lsh`.``query_bucket``(`query`)``;`

## Documentation

- Read the Rust docs.
- Read the Python docs for the Python bindings.

## Python

At the moment, the Python bindings are only compiled for Linux x86_64 systems.

`$`` pip install floky`

`from floky import ``SRP`
import numpy `as` np
N `=` `10000`
n `=` `100`
dim `=` `10`
`#` Generate some random data points
data_points `=` np`.`random`.``randn``(`N`,` dim`)`
`#` Do a one time `(`expensive`)` fit`.`
lsh `=` `SRP``(`n_projections`=``19``,` n_hash_tables`=``10``)`
lsh`.``fit``(`data_points`)`
`#` Query approximated nearest neigbors `in` sub`-`linear time
query `=` np`.`random`.``randn``(`n`,` dim`)`
results `=` lsh`.``predict``(`query`)`

#### Dependencies

~6–10MB

~197K SLoC