19 releases

0.3.1	Apr 28, 2025
0.3.0	Mar 30, 2025
0.2.6	Feb 26, 2025
0.2.2	Dec 21, 2024
0.1.2	Nov 21, 2023

Custom license and maybe GPL-3.0+

Ollama-rs

A simple and easy-to-use library for interacting with the Ollama API.

This library was created following the Ollama API documentation.

Installation
Initialization
Usage

Installation

Add ollama-rs to your Cargo.toml:

[dependencies]
ollama-rs = "0.3.1"

To track the latest unreleased changes, depend on the master branch instead by adding the following to your Cargo.toml file:

[dependencies]
ollama-rs = { git = "https://github.com/pepperoni21/ollama-rs.git", branch = "master" }

Note that the master branch may not be stable and may contain breaking changes.

Initialization

Initialize Ollama

use ollama_rs::Ollama;

// By default, it will connect to localhost:11434
let ollama = Ollama::default();

// For custom values:
let ollama = Ollama::new("http://localhost".to_string(), 11434);
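
To make the host and port arguments concrete, here is a minimal std-only sketch of how a base address and an endpoint path could be combined into a request URL. `endpoint_url` is a hypothetical helper written for illustration and is not part of ollama-rs; `/api/generate` is the completion path from the Ollama API documentation.

```rust
/// Hypothetical helper (not part of ollama-rs): joins a host, a port, and an
/// API path into a full URL, tolerating a trailing slash on the host and a
/// leading slash on the path.
fn endpoint_url(host: &str, port: u16, path: &str) -> String {
    let host = host.trim_end_matches('/');
    let path = path.trim_start_matches('/');
    format!("{host}:{port}/{path}")
}
```

With the default values, `endpoint_url("http://localhost", 11434, "/api/generate")` yields `http://localhost:11434/api/generate`.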

Usage

The Chatbot example shows how to build a simple chatbot with the library in fewer than 50 lines of code. Other examples are available as well.

The snippets below use minimal error handling for brevity; handle errors properly in your own code.
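
As a rough sketch of what fuller error handling can look like, the following std-only example propagates failures with `?` and handles both outcomes with an explicit `match` instead of `if let Ok(..)`. `fake_generate` and `DemoError` are stand-ins invented for illustration; they play the role of a fallible call such as `ollama.generate(...)` and are not the library's API or error type.

```rust
use std::error::Error;
use std::fmt;

/// Illustrative error type; ollama-rs defines its own error type.
#[derive(Debug)]
struct DemoError(String);

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "request failed: {}", self.0)
    }
}

impl Error for DemoError {}

/// Stand-in for a fallible request: fails on an empty prompt.
fn fake_generate(prompt: &str) -> Result<String, DemoError> {
    if prompt.trim().is_empty() {
        return Err(DemoError("empty prompt".to_string()));
    }
    Ok(format!("echo: {prompt}"))
}

/// Propagates the error to the caller with `?` instead of discarding it.
fn answer(prompt: &str) -> Result<String, Box<dyn Error + Send + Sync>> {
    let response = fake_generate(prompt)?;
    Ok(response)
}

/// Handles both outcomes explicitly so failures are reported, not ignored.
fn report(prompt: &str) -> String {
    match answer(prompt) {
        Ok(text) => text,
        Err(err) => format!("error: {err}"),
    }
}
```

The same shape applies to the snippets in this section: return a `Result` from the calling function and decide at one place how a failure is reported.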

Completion Generation

use ollama_rs::generation::completion::GenerationRequest;

let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();

let res = ollama.generate(GenerationRequest::new(model, prompt)).await;

if let Ok(res) = res {
    println!("{}", res.response);
}

OUTPUTS: The sky appears blue because of a phenomenon called Rayleigh scattering...

Completion Generation (Streaming)

Requires the stream feature (enable it with features = ["stream"] on the ollama-rs dependency in Cargo.toml).

use ollama_rs::generation::completion::GenerationRequest;
use tokio::io::{self, AsyncWriteExt};
use tokio_stream::StreamExt;

let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();

let mut stream = ollama.generate_stream(GenerationRequest::new(model, prompt)).await.unwrap();

let mut stdout = io::stdout();
while let Some(res) = stream.next().await {
    let responses = res.unwrap();
    for resp in responses {
        stdout.write_all(resp.response.as_bytes()).await.unwrap();
        stdout.flush().await.unwrap();
    }
}

Produces the same output as above, printed incrementally as it is streamed.
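
If the full text is needed after streaming, the fragments can be accumulated as they arrive. The sketch below is std-only and uses plain `String` values as stand-ins: each item mirrors one stream item from the example above (a batch of partial responses, or an error). `collect_stream` is a hypothetical helper, not a library function.

```rust
/// Concatenates streamed fragments into the full response text.
/// Each item is either a batch of partial responses or an error;
/// collection stops at the first error.
fn collect_stream<I>(chunks: I) -> Result<String, String>
where
    I: IntoIterator<Item = Result<Vec<String>, String>>,
{
    let mut full = String::new();
    for chunk in chunks {
        // `?` returns early if this stream item is an error.
        for fragment in chunk? {
            full.push_str(&fragment);
        }
    }
    Ok(full)
}
```

In the real streaming loop, the same accumulation can be done by pushing `resp.response` onto a `String` alongside the write to stdout.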

Completion Generation (With Options)

use ollama_rs::generation::completion::GenerationRequest;
use ollama_rs::models::ModelOptions;

let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();

let options = ModelOptions::default()
    .temperature(0.2)
    .repeat_penalty(1.5)
    .top_k(25)
    .top_p(0.25);

let res = ollama.generate(GenerationRequest::new(model, prompt).options(options)).await;

if let Ok(res) = res {
    println!("{}", res.response);
}

OUTPUTS: 1. Sun emits white sunlight: The sun consists primarily ...
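
For intuition about two of the options above, here is a conceptual std-only sketch of what top-k and top-p (nucleus) filtering usually mean: only the most probable candidate tokens remain eligible for sampling. This illustrates the general technique under simplified assumptions; it is not how Ollama or ollama-rs implement it, and `filter_candidates` is a hypothetical name.

```rust
/// Conceptual sketch: returns the indices of the tokens that survive
/// top-k followed by top-p filtering, most probable first.
fn filter_candidates(probs: &[f64], top_k: usize, top_p: f64) -> Vec<usize> {
    // Rank token indices by descending probability.
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    // top-k: keep only the k most probable tokens.
    order.truncate(top_k);

    // top-p: keep the smallest prefix whose cumulative probability reaches p.
    let mut cumulative = 0.0;
    let mut kept = Vec::new();
    for idx in order {
        kept.push(idx);
        cumulative += probs[idx];
        if cumulative >= top_p {
            break;
        }
    }
    kept
}
```

Smaller values of `top_k` and `top_p` shrink the candidate set, which tends to make output more focused and less varied.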

Chat Mode

Every message sent and received is stored in the history you pass to the call.

Example with history:

use ollama_rs::generation::chat::{ChatMessage, ChatMessageRequest};
use ollama_rs::history::ChatHistory;

let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();
// `Vec<ChatMessage>` implements `ChatHistory`,
// but you could also implement it yourself on a custom type
let mut history = vec![];

let res = ollama
    .send_chat_messages_with_history(
        &mut history, // <- messages will be saved here
        ChatMessageRequest::new(
            model,
            vec![ChatMessage::user(prompt)], // <- You should provide only one message
        ),
    )
    .await;

if let Ok(res) = res {
    println!("{}", res.message.content);
}

See the chat with history examples for both the default and the streaming variants.
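
The comment in the example above mentions implementing the history on a custom type. The sketch below shows one possible shape for that idea: a history capped at a fixed number of messages. To stay compilable with the standard library alone, `Message` and `History` are local stand-ins and only an assumption about the general shape; the real `ChatMessage` type and `ChatHistory` trait live in ollama-rs, and their exact signatures should be checked in the crate documentation.

```rust
/// Local stand-in for the crate's message type (assumption for illustration).
#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Local stand-in for a history trait (assumption; see the ollama-rs docs
/// for the real `ChatHistory` definition).
trait History {
    fn push(&mut self, message: Message);
    fn messages(&self) -> &[Message];
}

/// A history that keeps only the most recent `capacity` messages.
struct BoundedHistory {
    capacity: usize,
    items: Vec<Message>,
}

impl BoundedHistory {
    fn new(capacity: usize) -> Self {
        Self { capacity, items: Vec::new() }
    }
}

impl History for BoundedHistory {
    fn push(&mut self, message: Message) {
        self.items.push(message);
        // Drop the oldest messages once the cap is exceeded.
        if self.items.len() > self.capacity {
            let excess = self.items.len() - self.capacity;
            self.items.drain(..excess);
        }
    }

    fn messages(&self) -> &[Message] {
        &self.items
    }
}
```

A cap like this keeps the conversation sent to the model from growing without bound, at the cost of forgetting older turns.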

List Local Models

let res = ollama.list_local_models().await.unwrap();

Returns a vector of LocalModel structs.

Show Model Information

let res = ollama.show_model_info("llama2:latest".to_string()).await.unwrap();

Returns a ModelInfo struct.

Create a Model

use ollama_rs::models::create::CreateModelRequest;

let res = ollama.create_model(CreateModelRequest::path("model".into(), "/tmp/Modelfile.example".into())).await.unwrap();

Returns a CreateModelStatus struct representing the final status of the model creation.

Create a Model (Streaming)

Requires the stream feature.

use ollama_rs::models::create::CreateModelRequest;
use tokio_stream::StreamExt;

let mut res = ollama.create_model_stream(CreateModelRequest::path("model".into(), "/tmp/Modelfile.example".into())).await.unwrap();

while let Some(res) = res.next().await {
    let res = res.unwrap();
    // Handle the status
}

Returns a CreateModelStatusStream that will stream every status update of the model creation.

Copy a Model

let _ = ollama.copy_model("mario".into(), "mario_copy".into()).await.unwrap();

Delete a Model

let _ = ollama.delete_model("mario_copy".into()).await.unwrap();

Generate Embeddings

use ollama_rs::generation::embeddings::request::GenerateEmbeddingsRequest;

let request = GenerateEmbeddingsRequest::new("llama2:latest".to_string(), "Why is the sky blue?".into());
let res = ollama.generate_embeddings(request).await.unwrap();

Generate Embeddings (Batch)

use ollama_rs::generation::embeddings::request::GenerateEmbeddingsRequest;

let request = GenerateEmbeddingsRequest::new("llama2:latest".to_string(), vec!["Why is the sky blue?", "Why is the sky red?"].into());
let res = ollama.generate_embeddings(request).await.unwrap();

Returns a GenerateEmbeddingsResponse struct containing the embeddings (one vector of floats per input).
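
A common use of embeddings is comparing two texts by the cosine similarity of their vectors. The following std-only sketch assumes each embedding is available as a slice of `f32`; `cosine_similarity` is a hypothetical helper and not part of ollama-rs.

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns `None` if the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.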

Make a Function Call

use ollama_rs::coordinator::Coordinator;
use ollama_rs::generation::chat::{ChatMessage, ChatMessageRequest};
use ollama_rs::generation::tools::implementations::{DDGSearcher, Scraper, Calculator};
use ollama_rs::models::ModelOptions;

// The history is moved into the coordinator, so it does not need to be `mut`
let history = vec![];

let mut coordinator = Coordinator::new(ollama, "qwen2.5:32b".to_string(), history)
    .options(ModelOptions::default().num_ctx(16384))
    .add_tool(DDGSearcher::new())
    .add_tool(Scraper {})
    .add_tool(Calculator {});

let resp = coordinator
    .chat(vec![ChatMessage::user("What is the current oil price?")])
    .await.unwrap();

println!("{}", resp.message.content);

The coordinator uses the given tools (such as a web search) to find an answer, feeds the result back into the LLM, and returns a ChatMessageResponse containing the final answer to the question.

Create a Custom Tool

The function macro simplifies the creation of custom tools. Below is an example of a tool that retrieves the current weather for a specified city:

/// Retrieve the weather for a specified city.
///
/// * city - The city for which to get the weather.
#[ollama_rs::function]
async fn get_weather(city: String) -> Result<String, Box<dyn std::error::Error + Sync + Send>> {
    let url = format!("https://wttr.in/{city}?format=%C+%t");
    let response = reqwest::get(&url).await?.text().await?;
    Ok(response)
}

To create a custom tool, define a function that returns a Result<String, Box<dyn std::error::Error + Sync + Send>> and annotate it with the function macro. This function will be automatically converted into a tool that can be used with the Coordinator, just like any other tool.

Ensure that the doc comment above the function clearly describes the tool's purpose and its parameters. This information will be provided to the LLM to help it understand how to use the tool.
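
As a second, smaller illustration of the required return type and doc comment layout, here is a sketch of a tool body that needs no network access. `add_numbers` is a hypothetical tool; to keep the sketch compilable with the standard library alone it is written as a plain synchronous function, and the `async` keyword and the `#[ollama_rs::function]` attribute are left out.

```rust
use std::error::Error;

/// Add two integers given as decimal strings.
///
/// * a - The first integer.
/// * b - The second integer.
// With ollama-rs this function would be `async` and carry the
// `#[ollama_rs::function]` attribute; both are omitted in this sketch.
fn add_numbers(a: String, b: String) -> Result<String, Box<dyn Error + Sync + Send>> {
    // `?` boxes the parse error, so invalid input is reported to the caller.
    let a: i64 = a.trim().parse()?;
    let b: i64 = b.trim().parse()?;
    // Report overflow as an error instead of panicking or wrapping.
    let sum = a.checked_add(b).ok_or("integer overflow")?;
    Ok(sum.to_string())
}
```

Returning an error for bad input, rather than panicking, lets the caller decide how to surface the failure.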

For a more detailed example, see the function call example.
