7 releases
Version | Released |
---|---|
0.4.17 | Apr 10, 2024 |
0.4.15 | Apr 2, 2024 |
0.4.14 | Mar 25, 2024 |
0.4.11 | Feb 23, 2024 |
0.0.1 | Mar 18, 2023 |
#93 in Database implementations
662 downloads per month
260KB
5K SLoC
LanceDB Rust
LanceDB Rust SDK, a serverless vector database.
Read more at: https://lancedb.com/
LanceDB is an open-source database for vector search built with persistent storage, which greatly simplifies retrieval, filtering and management of embeddings.
The key features of LanceDB include:
- Production-scale vector search with no servers to manage.
- Store, query and filter vectors, metadata and multi-modal data (text, images, videos, point clouds, and more).
- Support for vector similarity search, full-text search and SQL.
- Native Rust, Python, Javascript/Typescript support.
- Zero-copy, automatic versioning, manage versions of your data without needing extra infrastructure.
- GPU support in building vector indices[^note].
- Ecosystem integrations with LangChain 🦜️🔗, LlamaIndex 🦙, Apache-Arrow, Pandas, Polars, DuckDB and more on the way.
[^note]: Only in Python SDK.
Getting Started
LanceDB runs in process. To use it in your Rust project, add it as a dependency by running the following in your project directory (this adds lancedb to your Cargo.toml):
cargo add lancedb
Quick Start
Connect to a database.
let db = lancedb::connect("data/sample-lancedb").execute().await.unwrap();
LanceDB accepts the following forms of database path:
- /path/to/database - local database on file system.
- s3://bucket/path/to/database or gs://bucket/path/to/database - database on cloud object store.
- db://dbname - Lance Cloud.
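The path forms above can be told apart by their scheme prefix. The following is a minimal sketch of that idea only. `Location` and `classify` are hypothetical names made up for illustration and are not part of the LanceDB API, which does its own URI handling inside `connect`.

```rust
/// Where a database path points. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Location {
    /// A directory on the local file system, e.g. /path/to/database
    LocalPath,
    /// A cloud object store, e.g. s3://... or gs://...
    ObjectStore,
    /// A hosted database, e.g. db://dbname
    Cloud,
}

/// Classify a path by its scheme prefix, following the list above.
/// Anything without a recognised scheme is treated as a local path.
fn classify(uri: &str) -> Location {
    if uri.starts_with("db://") {
        Location::Cloud
    } else if uri.starts_with("s3://") || uri.starts_with("gs://") {
        Location::ObjectStore
    } else {
        Location::LocalPath
    }
}
```

This sketch only covers the schemes listed in this section. Other schemes would fall through to `LocalPath` here, which is a simplification.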
You can also use ConnectOptions to configure the connection to the database.
use object_store::aws::AwsCredential;
let db = lancedb::connect("data/sample-lancedb")
    .aws_creds(AwsCredential {
        key_id: "some_key".to_string(),
        secret_key: "some_secret".to_string(),
        token: None,
    })
    .execute()
    .await
    .unwrap();
LanceDB uses arrow-rs to define schemas, data types and the arrays themselves.
It treats FixedSizeList<Float16/Float32> columns as vector columns.
For more details, please refer to LanceDB documentation.
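To make the fixed-size-list idea concrete: in Arrow, a fixed-size list column keeps all element values in one contiguous child array, so row `i` of a `dim`-wide vector column occupies positions `i * dim` to `(i + 1) * dim`. The sketch below mimics that layout with the standard library only. `VectorColumn` is a hypothetical stand-in for illustration, not an arrow-rs or LanceDB type, and it ignores null handling.

```rust
/// Hypothetical stand-in for a fixed-size-list column of f32 values:
/// one flat buffer plus the fixed width of every row.
struct VectorColumn {
    values: Vec<f32>,
    dim: usize,
}

impl VectorColumn {
    /// Build a column from rows, rejecting a zero width or any row
    /// whose length differs from `dim`.
    fn from_rows(rows: &[Vec<f32>], dim: usize) -> Option<Self> {
        if dim == 0 || rows.iter().any(|r| r.len() != dim) {
            return None;
        }
        Some(Self {
            values: rows.iter().flatten().copied().collect(),
            dim,
        })
    }

    /// Number of vectors stored in the column.
    fn len(&self) -> usize {
        self.values.len() / self.dim
    }

    /// Borrow row `i` as a slice of the flat buffer, or None if out of range.
    fn row(&self, i: usize) -> Option<&[f32]> {
        let start = i.checked_mul(self.dim)?;
        self.values.get(start..start + self.dim)
    }
}
```

Because every row has the same width, a row lookup is plain index arithmetic with no per-row offsets, which is one reason a fixed-size list is a natural fit for embeddings.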
Create a table
To create a Table, you need to provide an arrow_schema::Schema and an arrow_array::RecordBatch stream.
use std::sync::Arc;
use arrow_array::types::Float32Type;
use arrow_array::{FixedSizeListArray, Int32Array, RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
// Schema: an integer id plus a 128-dimensional Float32 vector column.
let schema = Arc::new(Schema::new(vec![
    Field::new("id", DataType::Int32, false),
    Field::new(
        "vector",
        DataType::FixedSizeList(Arc::new(Field::new("item", DataType::Float32, true)), 128),
        true,
    ),
]));
// Create a RecordBatch stream.
let batches = RecordBatchIterator::new(
    vec![RecordBatch::try_new(
        schema.clone(),
        vec![
            // 256 rows with ids 0..256.
            Arc::new(Int32Array::from_iter_values(0..256)),
            // 256 vectors, each holding 128 copies of 1.0.
            Arc::new(
                FixedSizeListArray::from_iter_primitive::<Float32Type, _, _>(
                    (0..256).map(|_| Some(vec![Some(1.0); 128])),
                    128,
                ),
            ),
        ],
    )
    .unwrap()]
    .into_iter()
    .map(Ok),
    schema.clone(),
);
// Keep the returned table handle so it can be used in the next steps.
let tbl = db.create_table("my_table", Box::new(batches))
    .execute()
    .await
    .unwrap();
Create vector index (IVF_PQ)
use lancedb::index::Index;
tbl.create_index(&["vector"], Index::Auto)
    .execute()
    .await
    .unwrap();
Open table and run search
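// try_collect comes from the futures crate's TryStreamExt trait.
use futures::TryStreamExt;
// Open the table created earlier.
let table = db.open_table("my_table").execute().await.unwrap();
let results = table
    .query()
    .nearest_to(&[1.0; 128])
    .unwrap()
    .execute()
    .await
    .unwrap()
    .try_collect::<Vec<_>>()
    .await
    .unwrap();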
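Conceptually, a nearest-neighbour query ranks stored vectors by their distance to the query vector and returns the closest ones. The sketch below shows that idea as an exhaustive scan using squared L2 (Euclidean) distance, with the standard library only. It assumes L2 as the metric, and the function names `l2_squared` and `nearest` are hypothetical. It is not how LanceDB executes a search, which can use a vector index such as IVF_PQ to avoid scanning every row.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exhaustive nearest-neighbour search: score every row against the
/// query, sort by ascending distance, and keep the first `k` results.
/// Returns (row index, squared distance) pairs.
fn nearest(rows: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = rows
        .iter()
        .enumerate()
        .map(|(i, row)| (i, l2_squared(row, query)))
        .collect();
    // total_cmp gives a total order over f32, so NaN cannot cause a panic.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

An exhaustive scan like this costs time proportional to the number of rows times the vector dimension for each query. An approximate index trades some recall for speed on large tables.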
Dependencies
~73MB
~1.5M SLoC