#llama #inference #llm #api-bindings

curtana

Simplified zero-cost wrapper over llama.cpp, powered by llama-cpp-2

4 releases

Uses new Rust 2024

0.1.2 Sep 19, 2025
0.1.1 Jul 11, 2025
0.1.0 Jul 10, 2025
0.0.2 May 27, 2025

#1387 in Text processing

142 downloads per month

MIT license

22KB
343 lines

An accessible, low-overhead wrapper over llama.cpp, powered by llama-cpp-2, supporting most .gguf-formatted chat and embedding models.
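
To try the examples below, add curtana to a Cargo project. Assuming the latest release listed above, the manifest entry would look like this:

[dependencies]
curtana = "0.1.2"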

Examples

These examples assume the following models are downloaded into the working directory:

- Llama-3.2-3B-Instruct-Q6_K.gguf
- nomic-embed-text-v1.5.f16.gguf

Chat (via Llama 3.2 Instruct)

// Create a new local model registry and load
// a chat model into it with a system prompt
// of "You are a cupcake."
let registry = ModelRegistry::new().unwrap();
let mut model = registry
    .load_chat_model("Llama-3.2-3B-Instruct-Q6_K.gguf", "You are a cupcake.")
    .unwrap();

// Run ("infer") the model with the prompt
// "What are you?", capturing its output
// as UTF-8 encoded bytes.
let mut output = vec![];
model.infer("What are you?", &mut output).unwrap();
let output = String::from_utf8_lossy(&output);

// Hopefully, the model thinks it's a cupcake due
// to the system prompt.
assert!(output.to_lowercase().contains("cupcake"));

Embedding (via Nomic Embed Text v1.5)

// Create a new local model registry and load
// an embedding model into it.
let registry = ModelRegistry::new().unwrap();
let mut model = registry
    .load_text_embedding_model("nomic-embed-text-v1.5.f16.gguf")
    .unwrap();

// Embed some fanciful document titles with the model.
let embeddings = model
    .embed(&[
        "search_document: might and magic in fantasy realms",
        "search_document: swords and sorcery for fantasy authors",
        "search_document: practical engineering for scientists",
    ])
    .unwrap();
assert_eq!(3, embeddings.len());

// Embed a search query with the model.
let query_embeddings = model.embed(&["search_query: fantasy"]).unwrap();
assert_eq!(1, query_embeddings.len());

// Calculate the cosine distance between the query embedding
// and each document embedding; smaller means more similar.
let distance_a = cosine_distance(&query_embeddings[0], &embeddings[0]);
let distance_b = cosine_distance(&query_embeddings[0], &embeddings[1]);
let distance_c = cosine_distance(&query_embeddings[0], &embeddings[2]);

// The fantasy documents should be closer to the query
// than the engineering document.
assert!(distance_a < distance_c);
assert!(distance_b < distance_c);
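
The example above calls a cosine_distance helper, and the snippet does not show where it comes from. If the crate does not provide one, a minimal sketch might look like this (the name and signature are assumptions for illustration):

// Cosine distance between two embedding vectors: 1.0 minus the
// cosine similarity, so smaller values indicate more similar vectors.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}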

License

Copyright © 2025 With Caer, LLC.

Licensed under the MIT license. Refer to the license file for more info.

Dependencies

~13MB
~247K SLoC