#embedding #cuda #vector #huggingface #search

candle_embed

A simple CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face

3 releases

0.1.2 Apr 30, 2024
0.1.1 Apr 17, 2024
0.1.0 Apr 17, 2024


MIT license


CandleEmbed

CandleEmbed is a Rust library for creating embeddings using BERT-based models. It provides a convenient way to load pre-trained models, embed single or multiple texts, and customize the embedding process. It's essentially the same code as the Candle example for embeddings, but with a nice wrapper. This exists because I wanted to play with Candle, and fastembed.rs doesn't support custom models.

Features

  • Enums for the most popular embedding models, or specify a custom model from Hugging Face

  • Support for CUDA devices (requires feature flag)

  • Can load and unload as required for better memory management

Installation

Add the following to your Cargo.toml file:

[dependencies]
candle_embed = "0.1.2"

If you want to use CUDA devices, enable the cuda feature flag:

[dependencies]
candle_embed = { version = "0.1.2", features = ["cuda"] }

Or you can just clone the repo — it's literally a single file.

Usage - Basics


use candle_embed::{CandleEmbedBuilder, WithModel};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a builder with default settings
    //
    let builder = CandleEmbedBuilder::new();
    

    // Build the embedder
    //
    let mut candle_embed = builder.build()?;
    

    // Embed a single text
    //
    let text = "This is a test sentence.";
    let embeddings = candle_embed.embed_one(text)?;
    
    // Embed a batch of texts
    //
    let texts = vec![
        "This is the first sentence.",
        "This is the second sentence.",
        "This is the third sentence.",
    ];
    let batch_embeddings = candle_embed.embed_batch(texts)?;
    
    // Unload the model and tokenizer, dropping them from memory
    //
    candle_embed.unload();
    
    Ok(())
}
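The embeddings come back as plain vectors of floats, so you can compare them with standard vector math. As a sketch (assuming, per the usage above, that `embed_one` returns a `Vec<f32>`; `cosine_similarity` below is a hypothetical helper, not part of the crate):

```rust
/// Cosine similarity between two embedding vectors.
/// If embeddings were built with `normalize_embeddings(true)`,
/// this reduces to a plain dot product.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Toy vectors standing in for real embeddings.
    let a = [1.0_f32, 0.0, 1.0];
    let b = [1.0_f32, 0.0, 1.0];
    let c = [0.0_f32, 1.0, 0.0];
    println!("{:.3}", cosine_similarity(&a, &b)); // identical -> 1.000
    println!("{:.3}", cosine_similarity(&a, &c)); // orthogonal -> 0.000
}
```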

Usage - Custom

    // ---

    let builder = CandleEmbedBuilder::new();
   
    // Embedding settings
    //
    let builder = builder
        .normalize_embeddings(true)
        .approximate_gelu(true);

    // Set model from preset
    //
    builder
        .set_model_from_presets(WithModel::UaeLargeV1);

    // Or use a custom model and revision
    //
    builder
        .custom_embedding_model("avsolatorio/GIST-small-Embedding-v0")
        .custom_model_revision("d6c4190");

    // Will use the first available CUDA device (Default)
    //
    builder.with_device_any_cuda(ordinal: usize);

    // Use a specific CUDA device, failing if it is unavailable
    //
    builder.with_device_specific_cuda(ordinal: usize);

    // Use CPU (CUDA options fail over to this)
    //
    builder.with_device_cpu();

    // Build the embedder
    //
    let mut candle_embed = builder.build()?;
    
    // This loads the model and tokenizer into memory.
    // It is run automatically the first time `embed` is called,
    // so you shouldn't normally need to call it yourself
    //
    candle_embed.load();

    // Get the dimensions from the model currently loaded
    //
    let dimensions = candle_embed.dimensions;

    // ---
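For intuition on the `normalize_embeddings(true)` setting above: L2 normalization rescales each embedding to unit length, which is what makes dot products equivalent to cosine similarity. A minimal, self-contained sketch of the operation (this helper is illustrative, not the crate's internal function):

```rust
/// L2-normalize a vector in place so its Euclidean length is 1.0.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    let mut v = vec![3.0_f32, 4.0]; // length 5.0
    l2_normalize(&mut v);
    println!("{:?}", v); // [0.6, 0.8]
}
```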

Feature Flags

cuda: Enables CUDA support for using GPU devices.

License

This project is licensed under the MIT License.

Contributing

Part of my motivation for publishing this is so someone can point out if I'm doing something wrong!
