ollama-native 🐑


ollama-native is a minimalist Ollama Rust SDK that provides the most basic functionality for interacting with Ollama.

Goals 🎯

  • ✅ Provide access to the core Ollama API functions for interacting with models.
  • ❌ The project does not include any business-specific functionality like chat with history.

[!TIP] Features like chat with history can be implemented at the business layer of your application (chat-with-history-example); a minimal sketch follows. Alternatively, you may choose another Ollama SDK that provides these higher-level features.
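As an illustration of the business-layer approach, the sketch below keeps the transcript in application code and replays it on each turn. It deliberately uses the crate's generate call rather than the chat endpoint so as not to guess at the chat request types, and it assumes the completion response exposes the generated text in a response field (as the streaming items shown later do) and an error type compatible with Box<dyn std::error::Error>; see the linked example for the real approach.

use ollama_native::Ollama;

// Application-owned chat history: the SDK stays stateless, the app
// stores the transcript and resends it inside each prompt.
struct ChatSession {
    ollama: Ollama,
    transcript: String,
}

impl ChatSession {
    fn new(url: &str) -> Self {
        Self {
            ollama: Ollama::new(url),
            transcript: String::new(),
        }
    }

    async fn say(&mut self, user_message: &str) -> Result<String, Box<dyn std::error::Error>> {
        // Append the user turn, then ask the model to continue the dialogue.
        self.transcript.push_str(&format!("User: {user_message}\nAssistant: "));
        let response = self
            .ollama
            .generate("llama3.1:8b", &self.transcript)
            .await?;
        // Assumption: the generated text lives in a `response` field.
        self.transcript.push_str(&format!("{}\n", response.response));
        Ok(response.response)
    }
}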

APIs 📝

  • Generate a completion
  • Generate a chat completion
  • Create a Model
  • List Local Models
  • Show Model Information
  • Delete a Model
  • Pull a Model
  • Push a Model
  • Generate Embeddings
  • List Running Models
  • Version
  • Check if a Blob Exists
  • Push a Blob

Features 🧬

  • Minimal Functionality: Offers the core functionalities of Ollama without extra features or complexity.
  • Rusty APIs: Utilizes chainable methods, making the API simple, concise, and idiomatic to Rust.

API Design

Completion

A conventional builder-style API would look like this:

let options = OptionsBuilder::new()
    .stop("stop")
    .num_predict(42)
    .seed(42)
    .build();

let request = GenerateRequestBuilder::new()
    .model("llama3.1:8b")
    .prompt("Tell me a joke")
    .options(options)
    .build();

let response = ollama.generate(request).await?;

ollama-native uses chainable methods instead:

let response = ollama
    .generate("llama3.1:8b", "Tell me a joke")
    .stop("stop")
    .num_predict(42)
    .seed(42)
    .await?;
Streaming Response

Builder style:

let options = OptionsBuilder::new()
    .stop("stop")
    .num_predict(42)
    .seed(42)
    .build();

let request = GenerateStreamRequestBuilder::new()
    .model("llama3.1:8b")
    .prompt("Tell me a joke")
    .options(options)
    .build();

let stream = ollama.generate_stream(request).await?;

The chainable equivalent:

let stream = ollama
    .generate("llama3.1:8b", "Tell me a joke")
    .stop("stop")
    .num_predict(42)
    .seed(42)
    .stream() // Specify streaming response.
    .await?;

Usage 🔦

Add dependencies

default features (generate, chat, version)

cargo add ollama-native

stream features

cargo add ollama-native --features stream

model features (create models, pull models...)

cargo add ollama-native --features model
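The same dependencies can be declared directly in Cargo.toml; a sketch combining both optional features (pick the version you actually want):

[dependencies]
ollama-native = { version = "1.0", features = ["stream", "model"] }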

Generate a completion

use ollama_native::Ollama;

let ollama = Ollama::new("http://localhost:11434");

let response = ollama
    .generate("llama3.1:8b", "Tell me a joke about sharks")
    .seed(5)
    .temperature(3.2)
    .await?;
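The snippets in this section assume an async context. A complete program might look like the following sketch; the #[tokio::main] runtime, the Box<dyn std::error::Error> error type, and the response field on the reply are assumptions for illustration:

use ollama_native::Ollama;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ollama = Ollama::new("http://localhost:11434");

    let response = ollama
        .generate("llama3.1:8b", "Tell me a joke about sharks")
        .seed(5)
        .await?;

    // Assumption: the generated text lives in a `response` field,
    // mirroring the streaming items below.
    println!("{}", response.response);
    Ok(())
}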

Generate request (Streaming)

use ollama_native::{IntoStream, Ollama};
use tokio::io::AsyncWriteExt;
use tokio_stream::StreamExt;

let ollama = Ollama::new("http://localhost:11434");

let mut stream = ollama
    .generate("llama3.1:8b", "Tell me a joke about sharks")
    .stream()
    .await?;

let mut out = tokio::io::stdout();
while let Some(Ok(item)) = stream.next().await {
    out.write_all(item.response.as_bytes()).await?;
    out.flush().await?;
}
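Note that while let Some(Ok(item)) quietly stops at the first stream error. A variant like this (same assumptions as above) propagates errors instead:

while let Some(item) = stream.next().await {
    let item = item?; // surface mid-stream errors instead of swallowing them
    out.write_all(item.response.as_bytes()).await?;
    out.flush().await?;
}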

Structured Output

[!TIP] See structured outputs example for more details.

JSON Mode

// JSON mode
let response = ollama
    .generate(
        "llama3.1:8b",
        "Ollama is 22 years old and is busy saving the world.",
    )
    .format("json") // Use "json" to get the response in JSON format.
    .await?;
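Since JSON mode returns the completion as a JSON string, it can be parsed with serde_json (an additional dependency; the response field name is an assumption, as above):

// Parse the model's JSON text into a dynamic value.
let value: serde_json::Value = serde_json::from_str(&response.response)?;
println!("{value:#}"); // `{:#}` pretty-prints a serde_json::Value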

Specified JSON Format

// Specified JSON format.
let output_format = r#"
{
    "type": "object",
    "properties": {
        "age": {
            "type": "integer"
        },
        "available": {
            "type": "boolean"
        }
    },
    "required": [
        "age",
        "available"
    ]
}"#;

let response = ollama
    .generate(
        "llama3.1:8b",
        "Ollama is 22 years old and is busy saving the world.",
    )
    .format(output_format)
    .await?;
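Because the schema fixes the shape of the output, the reply can be deserialized straight into a matching struct; a sketch assuming serde and serde_json as extra dependencies and the response field as above:

use serde::Deserialize;

#[derive(Deserialize)]
struct Status {
    age: i64,        // matches the "age" integer in the schema
    available: bool, // matches the "available" boolean
}

let status: Status = serde_json::from_str(&response.response)?;
println!("age = {}, available = {}", status.age, status.available);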

Examples 📖

License 📄

This project is licensed under the MIT license.

Acknowledgments

Thanks to mongodb for providing such an elegant design pattern.

Isabel Atkinson: “Rustify Your API: A Journey from Specification to Implementation” | RustConf 2024
