# ollama-native 🐑
ollama-native is a minimalist Rust SDK that provides the most basic functionality for interacting with Ollama.
## Goals 🎯
- ✅ Provide access to the core Ollama API functions for interacting with models.
- ❌ The project does not include any business-specific functionality like chat with history.
> [!TIP]
> For users who need features like chat with history, these can be implemented at the business layer of your application (see the chat-with-history-example, and the sketch below). Alternatively, you may choose another Ollama SDK that provides these higher-level features.
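As a rough illustration of the business-layer approach (a sketch, not the linked example itself), one option is to keep the transcript yourself and replay it in each prompt. The helper below is hypothetical; it assumes `generate` borrows the client and that the completion text is exposed as `response.response`, as in the streaming example further down:

```rust
use ollama_native::Ollama;

// Hypothetical helper: keep the transcript in a Vec and replay it in each
// prompt so the model sees the conversation so far.
async fn chat_with_history(
    ollama: &Ollama,
    history: &mut Vec<String>,
    user_msg: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    history.push(format!("User: {user_msg}"));
    let prompt = history.join("\n");

    let response = ollama
        .generate("llama3.1:8b")
        .prompt(&prompt)
        .await?;

    // Assumes the completion text lives in `response.response`.
    history.push(format!("Assistant: {}", response.response));
    Ok(response.response)
}
```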
## Usage 🔦
### Add dependencies

```sh
cargo add ollama-native
```
### Generate a Completion

```rust
use ollama_native::Ollama;

let ollama = Ollama::new("http://localhost:11434");

let response = ollama
    .generate("llama3.1:8b")
    .prompt("Tell me a joke about sharks")
    .seed(5)
    .temperature(3.2)
    .await?;
```
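Assuming the completion text is carried in the `response` field (as it is for the streamed chunks below), printing it might look like:

```rust
println!("{}", response.response);
```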
### Generate Request (Streaming)

Add the `stream` feature:

```sh
cargo add ollama-native --features stream
```
```rust
use ollama_native::{Ollama, action::IntoStream};
use tokio::io::AsyncWriteExt;
use tokio_stream::StreamExt;

let ollama = Ollama::new("http://localhost:11434");

let mut stream = ollama
    .generate("llama3.1:8b")
    .prompt("Tell me a joke about sharks")
    .stream()
    .await?;

let mut out = tokio::io::stdout();
while let Some(Ok(item)) = stream.next().await {
    // write_all ensures every byte of the chunk is written.
    out.write_all(item.response.as_bytes()).await?;
    out.flush().await?;
}
```
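Note that `while let Some(Ok(item))` ends the loop silently on the first stream error. A variant that propagates errors instead might look like this:

```rust
while let Some(item) = stream.next().await {
    let item = item?; // surface transport/decode errors instead of swallowing them
    out.write_all(item.response.as_bytes()).await?;
    out.flush().await?;
}
```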
### Structured Output
> [!TIP]
> See the structured outputs example for more details.
#### JSON Mode

```rust
// JSON mode
let response = ollama
    .generate("llama3.1:8b")
    .prompt("Ollama is 22 years old and is busy saving the world.")
    .json() // Get the response in JSON format.
    .await?;
```
#### Specified JSON Format

```rust
let format = r#"
{
    "type": "object",
    "properties": {
        "age": {
            "type": "integer"
        },
        "available": {
            "type": "boolean"
        }
    },
    "required": [
        "age",
        "available"
    ]
}"#;

let response = ollama
    .generate("llama3.1:8b")
    .prompt("Ollama is 22 years old and is busy saving the world.")
    .format(format)
    .await?;
```
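With a schema-constrained response it is natural to parse the result into a typed struct. A minimal sketch using `serde`/`serde_json` (separate dependencies, not part of this crate), again assuming the generated text is exposed as `response.response`:

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct OllamaInfo {
    age: u32,
    available: bool,
}

// Parse the schema-constrained completion into a typed struct.
let info: OllamaInfo = serde_json::from_str(&response.response)?;
println!("age: {}, available: {}", info.age, info.available);
```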
## API Design 🧬
- Minimal Functionality: Offers the core functionalities of Ollama without extra features or complexity.
- Rusty Style: Utilizes chainable methods, making the API simple, concise, and idiomatic to Rust.
- Fluent Response: Responses are automatically converted to the appropriate data structure based on the methods you call.
- Unified APIs: Uses a consistent API for both streaming and non-streaming requests, as sketched below.
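For illustration, the same builder chain from the examples above serves both modes; only the final call differs:

```rust
// Non-streaming: await the builder directly for one complete response.
let full = ollama
    .generate("llama3.1:8b")
    .prompt("Tell me a joke about sharks")
    .await?;

// Streaming: the identical chain, with `.stream()` switching the result
// to a stream of chunks (requires the `stream` feature).
let mut chunks = ollama
    .generate("llama3.1:8b")
    .prompt("Tell me a joke about sharks")
    .stream()
    .await?;
```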
## APIs 📝
- Generate a completion
- Generate a chat completion
- Create a Model
- List Local Models
- Show Model Information
- Delete a Model
- Pull a Model
- Push a Model
- Generate Embeddings
- List Running Models
- Version
- Check if a Blob Exists
- Push a Blob
## Examples 📖
- Generate Completions
- Generate Chat Completions (Streaming)
- Generate Chat Completions with Images
- Generate Embeddings
- Structured Outputs (JSON)
- Chat with History
- Load a Model into Memory
## License ⚖️
This project is licensed under the MIT license.
## Acknowledgments 🎉

Thanks to MongoDB for providing such an elegant design pattern:

Isabel Atkinson: “Rustify Your API: A Journey from Specification to Implementation” | RustConf 2024