
yanked pgvec

[DEPRECATED] serializable pgvector support for Rust

0.2.4 Oct 17, 2023
0.2.3 Oct 14, 2023
0.2.2 Oct 14, 2023


MIT/Apache


⚠️ Deprecated ⚠️

This repo is no longer maintained because the serde feature has been added to pgvector-rust.

pgvec

Serializable pgvector support for Rust

Supports Rust-Postgres, SQLx, and Diesel


Getting Started

Follow the instructions for your database library: Rust-Postgres, SQLx, or Diesel.

Or check out some examples:

For information on serializing the vector type, see the Serialization section.

Rust-Postgres

Add this line to your application’s Cargo.toml under [dependencies]:

pgvec = { version = "0.2", features = ["postgres"] }

Enable the extension

client.execute("CREATE EXTENSION IF NOT EXISTS vector", &[])?;

Create a table

client.execute("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))", &[])?;

Create a vector from a Vec<f32>

use pgvec::Vector;

let embedding = Vector::from(vec![1.0, 2.0, 3.0]);

Insert a vector

client.execute("INSERT INTO items (embedding) VALUES ($1)", &[&embedding])?;

Get the nearest neighbor

let row = client.query_one(
    "SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 1",
    &[&embedding],
)?;
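The `<->` operator in the query above orders rows by Euclidean (L2) distance. As a rough illustration of what that ordering measures, here is a minimal sketch using only the standard library. `l2_distance` is a hypothetical helper written for this example, not part of pgvec, and returning `None` on a length mismatch is an assumption of the sketch; the database does the real computation.

```rust
/// Euclidean (L2) distance between two vectors of equal length.
/// Hypothetical helper for illustration only; not part of pgvec.
/// Returns `None` when the lengths differ (an assumption of this sketch).
fn l2_distance(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    // Sum the squared component-wise differences, accumulating in f64.
    let sum_sq: f64 = a
        .iter()
        .zip(b)
        .map(|(x, y)| {
            let d = f64::from(*x) - f64::from(*y);
            d * d
        })
        .sum();
    Some(sum_sq.sqrt())
}
```

For instance, `l2_distance(&[1.0, 2.0, 3.0], &[1.0, 2.0, 5.0])` evaluates to `Some(2.0)`, since only the last component differs, by 2.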

Retrieve a vector

let row = client.query_one("SELECT embedding FROM items LIMIT 1", &[])?;
let embedding: Vector = row.get(0);

Use Option if the value could be NULL

let embedding: Option<Vector> = row.get(0);

SQLx

Add this line to your application’s Cargo.toml under [dependencies]:

pgvec = { version = "0.2", features = ["sqlx"] }

Enable the extension

sqlx::query("CREATE EXTENSION IF NOT EXISTS vector")
    .execute(&pool)
    .await?;

Create a table

sqlx::query("CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))")
    .execute(&pool)
    .await?;

Create a vector from a Vec<f32>

use pgvec::Vector;

let embedding = Vector::from(vec![1.0, 2.0, 3.0]);

Insert a vector

sqlx::query("INSERT INTO items (embedding) VALUES ($1)")
    .bind(embedding)
    .execute(&pool)
    .await?;

Get the nearest neighbors

let rows = sqlx::query("SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 1")
    .bind(embedding)
    .fetch_all(&pool)
    .await?;

Retrieve a vector

let row = sqlx::query("SELECT embedding FROM items LIMIT 1").fetch_one(&pool).await?;
let embedding: Vector = row.try_get("embedding")?;

Diesel

Add this line to your application’s Cargo.toml under [dependencies]:

pgvec = { version = "0.2", features = ["diesel"] }

And add this line to your application’s diesel.toml under [print_schema]:

import_types = ["diesel::sql_types::*", "pgvec::sql_types::*"]

Create a migration

diesel migration generate create_vector_extension

with up.sql:

CREATE EXTENSION vector

and down.sql:

DROP EXTENSION vector

Run the migration

diesel migration run

You can now use the vector type in future migrations

CREATE TABLE items (
  id SERIAL PRIMARY KEY,
  embedding VECTOR(3)
)

For models, use:

use pgvec::Vector;

#[derive(Queryable)]
#[diesel(table_name = items)]
pub struct Item {
    pub id: i32,
    pub embedding: Option<Vector>,
}

#[derive(Insertable)]
#[diesel(table_name = items)]
pub struct NewItem {
    pub embedding: Option<Vector>,
}

Create a vector from a Vec<f32>

let embedding = Vector::from(vec![1.0, 2.0, 3.0]);

Insert a vector

let new_item = NewItem {
    embedding: Some(embedding)
};

diesel::insert_into(items::table)
    .values(&new_item)
    .get_result::<Item>(&mut conn)?;

Get the nearest neighbors

use pgvec::VectorExpressionMethods;

let neighbors = items::table
    .order(items::embedding.l2_distance(embedding))
    .limit(5)
    .load::<Item>(&mut conn)?;

Also supports max_inner_product and cosine_distance
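To make the two other metrics concrete, here is a small sketch of what they measure, using only the standard library. It assumes that `max_inner_product` corresponds to pgvector's `<#>` operator, which returns the negative inner product so that ascending order puts the largest inner product first, and that `cosine_distance` is one minus cosine similarity. The functions below are hypothetical stand-ins that share the Diesel method names for readability; they are not the crate's implementation, and they assume equal-length, non-zero inputs.

```rust
/// Dot product of two vectors, accumulated in f64.
/// Assumes both slices have the same length; `zip` silently stops at the shorter one.
fn inner_product(a: &[f32], b: &[f32]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| f64::from(*x) * f64::from(*y))
        .sum()
}

/// Negative inner product: smaller values mean a larger inner product.
/// Hypothetical stand-in for the value that ordering by `<#>` sorts on.
fn max_inner_product(a: &[f32], b: &[f32]) -> f64 {
    -inner_product(a, b)
}

/// Cosine distance, defined as 1 - cosine similarity.
/// Yields NaN if either vector has zero length (norm of 0).
fn cosine_distance(a: &[f32], b: &[f32]) -> f64 {
    let norm_a = inner_product(a, a).sqrt();
    let norm_b = inner_product(b, b).sqrt();
    1.0 - inner_product(a, b) / (norm_a * norm_b)
}
```

With these definitions, `max_inner_product(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0])` is `-32.0`, and two orthogonal vectors such as `[1.0, 0.0]` and `[0.0, 1.0]` have a cosine distance of `1.0`.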

Get the distances

let distances = items::table
    .select(items::embedding.l2_distance(embedding))
    .load::<Option<f64>>(&mut conn)?;

Add an approximate index in a migration

CREATE INDEX my_index ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)
-- or
CREATE INDEX my_index ON items USING hnsw (embedding vector_l2_ops)

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

Serialization

pgvec provides serde::Serialize and serde::Deserialize implementations for Vector so that you can use it with any serde-compatible format.

To enable this feature, add this line to your application’s Cargo.toml under [dependencies]:

pgvec = { version = "0.2", features = ["serde"] }

You can then use Vector as a serializable field in your structs

#[derive(serde::Serialize, serde::Deserialize)]
struct Item {
    id: i32,
    embedding: Vector,
}

Reference

Convert a vector to a Vec<f32>

let f32_vec: Vec<f32> = vec.into();
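Both `Vector::from(vec![...])` and `.into()` rely on Rust's standard `From`/`Into` conversion traits. The sketch below shows that pattern with a hypothetical stand-in newtype; it is not the crate's source, and the real `pgvec::Vector` also implements the database and serde traits described above.

```rust
/// Hypothetical stand-in for `pgvec::Vector`, for illustration only.
/// A newtype wrapper around the raw f32 components.
#[derive(Debug, Clone, PartialEq)]
struct Vector(Vec<f32>);

/// Enables `Vector::from(vec![...])`.
impl From<Vec<f32>> for Vector {
    fn from(v: Vec<f32>) -> Self {
        Vector(v)
    }
}

/// Enables `let f32_vec: Vec<f32> = vector.into();`.
/// The inner Vec is moved out, so no components are copied.
impl From<Vector> for Vec<f32> {
    fn from(v: Vector) -> Self {
        v.0
    }
}
```

Because `Into` is implemented automatically for every `From` impl, the second block is what makes the `.into()` call work when the target type is annotated as `Vec<f32>`.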

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

git clone https://github.com/appcypher/pgvec.git
cd pgvec
createdb pgvector_rust_test
cargo test --all-features

Attribution

pgvec is a fork of pgvector-rust with serde support.
