2 stable releases

| Version | Released |
|---|---|
| 1.1.1 | Dec 8, 2024 |
| 1.0.0 | Dec 7, 2024 |
small-world-rs
small-world-rs is an HNSW vector index written in Rust.
Features
- Fast, accurate and easy to implement
- Choose your precision (16 or 32 bit floats)
- Choose your distance metric: cosine distance (recommended for text) or Euclidean distance (recommended for images)
- Serialize and deserialize for persistence
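To make the two distance metrics concrete, here is a minimal sketch of how cosine and Euclidean distance are commonly computed over `f32` slices. The function names are illustrative and are not taken from small-world-rs; the crate's own implementation may differ, for example in how it handles zero-length vectors or 16-bit floats.

```rust
/// Cosine distance: 1 - cos(angle between `a` and `b`).
/// Returns 1.0 if either vector has zero length (a choice made for this sketch).
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}

/// Euclidean (L2) distance: straight-line distance between `a` and `b`.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}
```

Cosine distance ignores vector magnitude and compares direction only, which is one reason it tends to suit text embeddings; Euclidean distance is sensitive to magnitude as well.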
Example
See the text-embeddings example for a simple demonstration of semantic search over a set of text embeddings with small-world-rs.
Basically, it works like this:
- Get your embeddings, be that from OpenAI, Ollama, or wherever
- Create a `World` with `World::new` or `World::new_from_dump`
- Insert your vectors into the world with `world.insert_vector`
- Perform a search with `world.search`
- Dump the world with `world.dump` to save for later
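The steps above can be sketched end to end with a toy stand-in. This README does not show the exact signatures of `World::new`, `insert_vector`, `search`, or `dump`, so the `ToyWorld` type below is hypothetical: it only borrows the method names, scans every vector instead of building an HNSW graph, and uses a made-up text dump format. Treat it as an illustration of the create, insert, search, dump, restore cycle, not as the crate's API.

```rust
/// Toy stand-in for an index. NOT the small-world-rs `World` type:
/// it stores vectors in a list and compares the query against all of them.
#[derive(Debug, Default, PartialEq)]
pub struct ToyWorld {
    vectors: Vec<(u32, Vec<f32>)>,
}

fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

impl ToyWorld {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn insert_vector(&mut self, id: u32, vector: Vec<f32>) {
        self.vectors.push((id, vector));
    }

    /// Returns the ids of the `k` nearest vectors, nearest first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<u32> {
        let mut scored: Vec<(f32, u32)> = self
            .vectors
            .iter()
            .map(|(id, v)| (squared_distance(query, v), *id))
            .collect();
        scored.sort_by(|a, b| a.0.total_cmp(&b.0));
        scored.into_iter().take(k).map(|(_, id)| id).collect()
    }

    /// Hypothetical dump format: one `id:v1,v2,...` line per vector.
    pub fn dump(&self) -> String {
        self.vectors
            .iter()
            .map(|(id, v)| {
                let values: Vec<String> = v.iter().map(|x| x.to_string()).collect();
                format!("{}:{}", id, values.join(","))
            })
            .collect::<Vec<_>>()
            .join("\n")
    }

    /// Rebuilds a world from `dump` output; returns `None` on malformed input.
    pub fn new_from_dump(dump: &str) -> Option<Self> {
        let mut world = Self::new();
        for line in dump.lines() {
            let (id, values) = line.split_once(':')?;
            let vector = values
                .split(',')
                .map(|x| x.parse::<f32>().ok())
                .collect::<Option<Vec<f32>>>()?;
            world.insert_vector(id.parse().ok()?, vector);
        }
        Some(world)
    }
}
```

A real HNSW index avoids the full scan by walking a layered neighbor graph, which is what makes the search approximate but fast on large collections.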
What config values should I use?
Key Parameters:
- `m`: Connections per layer
  - Recommended: 16-64
  - Sweet spot: 32
  - Higher values increase recall but consume more memory
- `ef_construction`: Construction-time exploration factor
  - Recommended: 100-500
  - Trade-off: Higher values = better recall but slower build time
  - Rule of thumb: 2-4× your target `ef_search`
- `ef_search`: Query-time exploration factor
  - Recommended: 50-150
  - Adjustable at search time
  - Higher values increase accuracy but slow down search
  - Tune based on recall requirements
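As a quick way to sanity-check a configuration against the recommendations above, here is a small sketch. The `HnswParams` struct and its methods are hypothetical helpers written for this example; they are not part of small-world-rs, and the crate may take these values in a different form. The starting values in `suggested` are simply picks from inside the recommended ranges.

```rust
/// Hypothetical parameter bundle. Field names follow the parameters above;
/// the struct itself is illustrative and not a small-world-rs type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HnswParams {
    pub m: usize,
    pub ef_construction: usize,
    pub ef_search: usize,
}

impl HnswParams {
    /// A starting point inside every recommended range:
    /// m = 32 (the sweet spot), ef_search = 100, ef_construction = 3 x ef_search.
    pub fn suggested() -> Self {
        Self { m: 32, ef_construction: 300, ef_search: 100 }
    }

    /// Lists each recommendation that the current values fall outside of.
    pub fn warnings(&self) -> Vec<String> {
        let mut out = Vec::new();
        if !(16..=64).contains(&self.m) {
            out.push(format!("m = {} is outside 16-64", self.m));
        }
        if !(100..=500).contains(&self.ef_construction) {
            out.push(format!(
                "ef_construction = {} is outside 100-500",
                self.ef_construction
            ));
        }
        if !(50..=150).contains(&self.ef_search) {
            out.push(format!("ef_search = {} is outside 50-150", self.ef_search));
        }
        // Rule of thumb: ef_construction should be 2-4x the target ef_search.
        let low = 2 * self.ef_search;
        let high = 4 * self.ef_search;
        if self.ef_construction < low || self.ef_construction > high {
            out.push(format!(
                "ef_construction = {} is outside 2-4x ef_search ({}-{})",
                self.ef_construction, low, high
            ));
        }
        out
    }
}
```

The ranges are guidance rather than hard limits, so the helper reports warnings instead of rejecting a configuration outright.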