pgvector-rust
pgvector support for Rust
Supports Rust-Postgres, SQLx, and Diesel
Getting Started
Follow the instructions for your database library (Rust-Postgres, SQLx, or Diesel, each covered in its own section below).
Or check out some examples:
- Embeddings with OpenAI
- Binary embeddings with Cohere
- Sentence embeddings with Candle
- Hybrid search with Candle (Reciprocal Rank Fusion)
- Recommendations with Disco
- Bulk loading with COPY
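The hybrid search example in the list above names Reciprocal Rank Fusion. As a rough sketch of the idea only (not the code from that example), the function below merges several ranked lists of ids by summing 1 / (k + rank) for each id. The name reciprocal_rank_fusion, the i64 id type, and the choice of k are assumptions made for illustration; 60 is a commonly used value for k.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of item ids with Reciprocal Rank Fusion.
/// Each list is ordered best-first. Returns (id, score) pairs, best-first.
/// Hypothetical helper for illustration; not part of pgvector.
fn reciprocal_rank_fusion(rankings: &[Vec<i64>], k: f64) -> Vec<(i64, f64)> {
    let mut scores: HashMap<i64, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the top result contributes 1 / (k + 1).
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (idx as f64 + 1.0));
        }
    }
    let mut fused: Vec<(i64, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

In a hybrid setup, one list would typically come from a vector query and the other from a keyword query; ids that rank well in both rise to the top.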
Rust-Postgres
Add this line to your application’s Cargo.toml under [dependencies]:
pgvector = { version = "0.4", features = ["postgres"] }
Enable the extension
client.execute("CREATE EXTENSION IF NOT EXISTS vector", &[])?;
Create a table
client.execute("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))", &[])?;
Create a vector from a Vec<f32>
use pgvector::Vector;
let embedding = Vector::from(vec![1.0, 2.0, 3.0]);
Insert a vector
client.execute("INSERT INTO items (embedding) VALUES ($1)", &[&embedding])?;
Get the nearest neighbor
let row = client.query_one(
    "SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 1",
    &[&embedding],
)?;
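The <-> operator in the query above orders rows by Euclidean (L2) distance. As a rough illustration of what that ordering computes, here is a brute-force sketch in plain Rust. The names l2_distance and nearest are hypothetical helpers, not part of pgvector; in practice the database does this work, using an index if one exists.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
/// Hypothetical helper for illustration; not part of pgvector.
fn l2_distance(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = f64::from(*x) - f64::from(*y);
            d * d
        })
        .sum::<f64>()
        .sqrt()
}

/// Index of the stored vector closest to `query`, or None if `items` is empty.
/// This mirrors `ORDER BY embedding <-> $1 LIMIT 1` with a full scan.
fn nearest(items: &[Vec<f32>], query: &[f32]) -> Option<usize> {
    items
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| l2_distance(a, query).total_cmp(&l2_distance(b, query)))
        .map(|(i, _)| i)
}
```

With stored vectors [1, 2, 3] and [4, 5, 6] and a query of [1, 2, 4], nearest returns Some(0), since the first vector is at distance 1 from the query.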
Retrieve a vector
let row = client.query_one("SELECT embedding FROM items LIMIT 1", &[])?;
let embedding: Vector = row.get(0);
Use Option if the value could be NULL
let embedding: Option<Vector> = row.get(0);
SQLx
Add this line to your application’s Cargo.toml under [dependencies]:
pgvector = { version = "0.4", features = ["sqlx"] }
For SQLx < 0.8, use version = "0.3" and the readme for that version.
Enable the extension
sqlx::query("CREATE EXTENSION IF NOT EXISTS vector")
    .execute(&pool)
    .await?;
Create a table
sqlx::query("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")
    .execute(&pool)
    .await?;
Create a vector from a Vec<f32>
use pgvector::Vector;
let embedding = Vector::from(vec![1.0, 2.0, 3.0]);
Insert a vector
sqlx::query("INSERT INTO items (embedding) VALUES ($1)")
    .bind(embedding)
    .execute(&pool)
    .await?;
Get the nearest neighbors
let rows = sqlx::query("SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 1")
    .bind(embedding)
    .fetch_all(&pool)
    .await?;
Retrieve a vector
let row = sqlx::query("SELECT embedding FROM items LIMIT 1").fetch_one(&pool).await?;
let embedding: Vector = row.try_get("embedding")?;
Diesel
Add this line to your application’s Cargo.toml under [dependencies]:
pgvector = { version = "0.4", features = ["diesel"] }
And update your application’s diesel.toml under [print_schema]:
import_types = ["diesel::sql_types::*", "pgvector::sql_types::*"]
generate_missing_sql_type_definitions = false
Create a migration
diesel migration generate create_vector_extension
with up.sql:
CREATE EXTENSION vector
and down.sql:
DROP EXTENSION vector
Run the migration
diesel migration run
You can now use the vector type in future migrations
CREATE TABLE items (
    id SERIAL PRIMARY KEY,
    embedding VECTOR(3)
)
For models, use:
use pgvector::Vector;
#[derive(Queryable)]
#[diesel(table_name = items)]
pub struct Item {
    pub id: i32,
    pub embedding: Option<Vector>,
}
#[derive(Insertable)]
#[diesel(table_name = items)]
pub struct NewItem {
    pub embedding: Option<Vector>,
}
Create a vector from a Vec<f32>
let embedding = Vector::from(vec![1.0, 2.0, 3.0]);
Insert a vector
let new_item = NewItem {
    embedding: Some(embedding)
};
diesel::insert_into(items::table)
    .values(&new_item)
    .get_result::<Item>(&mut conn)?;
Get the nearest neighbors
use pgvector::VectorExpressionMethods;
let neighbors = items::table
    .order(items::embedding.l2_distance(embedding))
    .limit(5)
    .load::<Item>(&mut conn)?;
Also supports max_inner_product, cosine_distance, l1_distance, hamming_distance, and jaccard_distance
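To make the differences between these measures concrete, here is a plain-Rust sketch of what each one computes. These functions are illustrative stand-ins that borrow the method names above; they are not the crate's implementation, which only builds SQL expressions. Two details are worth noting: pgvector's inner product operator returns the negative inner product so that smaller still means closer, and Hamming and Jaccard distance apply to bit vectors, modeled here as slices of bool (an assumption made for this sketch).

```rust
/// Dot product accumulated in f64. Hypothetical helper for illustration.
fn dot(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| f64::from(*x) * f64::from(*y)).sum()
}

/// Negative inner product, so that ascending order puts the best match first.
fn max_inner_product(a: &[f32], b: &[f32]) -> f64 {
    -dot(a, b)
}

/// One minus cosine similarity. Yields NaN if either vector is all zeros.
fn cosine_distance(a: &[f32], b: &[f32]) -> f64 {
    1.0 - dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

/// Sum of absolute coordinate differences (taxicab distance).
fn l1_distance(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| (f64::from(*x) - f64::from(*y)).abs()).sum()
}

/// Number of positions where two bit vectors differ.
fn hamming_distance(a: &[bool], b: &[bool]) -> usize {
    assert_eq!(a.len(), b.len(), "bit vectors must have the same length");
    a.iter().zip(b).filter(|(x, y)| x != y).count()
}

/// One minus (bits set in both) / (bits set in either).
/// Returns 1.0 when no bits are set at all, a choice made in this sketch
/// to avoid dividing by zero.
fn jaccard_distance(a: &[bool], b: &[bool]) -> f64 {
    assert_eq!(a.len(), b.len(), "bit vectors must have the same length");
    let both = a.iter().zip(b).filter(|(x, y)| **x && **y).count();
    let either = a.iter().zip(b).filter(|(x, y)| **x || **y).count();
    if either == 0 {
        1.0
    } else {
        1.0 - both as f64 / either as f64
    }
}
```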
Get the distances
let distances = items::table
    .select(items::embedding.l2_distance(embedding))
    .load::<Option<f64>>(&mut conn)?;
Add an approximate index in a migration
CREATE INDEX my_index ON items USING hnsw (embedding vector_l2_ops)
-- or
CREATE INDEX my_index ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)
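The lists = 100 setting in the IVFFlat statement above is only a starting value. The pgvector documentation suggests rows / 1000 for tables of up to 1M rows and sqrt(rows) for larger ones. The sketch below applies that rule of thumb and builds the matching statement; suggested_lists and ivfflat_index_sql are hypothetical helper names, and clamping the result to at least 1 is an assumption added here for very small tables.

```rust
/// Starting value for the IVFFlat `lists` parameter, following the
/// rows / 1000 (up to 1M rows) and sqrt(rows) (above 1M rows) rule of thumb.
/// Hypothetical helper for illustration; not part of pgvector.
fn suggested_lists(rows: u64) -> u64 {
    let lists = if rows <= 1_000_000 {
        rows / 1000
    } else {
        (rows as f64).sqrt() as u64
    };
    // Assumption: never go below one list, even for tiny tables.
    lists.max(1)
}

/// Build the CREATE INDEX statement for the `items` table used in this readme.
fn ivfflat_index_sql(rows: u64) -> String {
    format!(
        "CREATE INDEX my_index ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = {})",
        suggested_lists(rows)
    )
}
```

An IVFFlat index is best created after the table already holds data, so the row count is normally known when the migration is written.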
Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
Serialization
Use the serde feature to enable serialization
Half Vectors
Use the halfvec feature to enable half vectors
Reference
Convert a vector to a Vec<f32>
let f32_vec: Vec<f32> = vec.into();
Get a slice
let slice = vec.as_slice();
History
View the changelog
Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/pgvector/pgvector-rust.git
cd pgvector-rust
createdb pgvector_rust_test
cargo test --all-features