15 unstable releases (3 breaking)
Uses new Rust 2024
| new 0.4.0 | Mar 16, 2026 |
|---|---|
| 0.3.2 | Feb 21, 2026 |
| 0.2.1 | Jan 22, 2026 |
| 0.1.9 | Jan 3, 2026 |
| 0.1.1 | Nov 30, 2025 |
#387 in Asynchronous
563 downloads per month
Used in 10 crates
(7 directly)
1MB
12K
SLoC
adk-model
LLM model integrations for Rust Agent Development Kit (ADK-Rust) with Gemini, OpenAI, xAI, Anthropic, DeepSeek, Groq, Ollama, Fireworks AI, Together AI, Mistral AI, Perplexity, Cerebras, SambaNova, Amazon Bedrock, and Azure AI Inference.
Overview
adk-model provides LLM integrations for the Rust Agent Development Kit (ADK-Rust). Supports all major providers:
- Gemini - Google's Gemini models (3 Pro, 3 Flash, 2.5 Pro, 2.5 Flash, etc.)
- OpenAI - GPT-5.1, GPT-5, GPT-5 Mini, GPT-4o (legacy)
- xAI - Grok models through the OpenAI-compatible API
- Anthropic - Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, Claude 4
- DeepSeek - DeepSeek R1, DeepSeek V3.1, DeepSeek-Chat with thinking mode
- Groq - Ultra-fast inference (LLaMA 3.3, Mixtral, Gemma)
- Ollama - Local LLMs (LLaMA, Mistral, Qwen, Gemma, etc.)
- Fireworks AI - Fast open-model inference (Llama, Mixtral, etc.)
- Together AI - Hosted open models (Llama, CodeLlama, etc.)
- Mistral AI - Mistral cloud models (Mistral Small, Large, etc.)
- Perplexity - Search-augmented LLM (Sonar, etc.)
- Cerebras - Ultra-fast inference (Llama 3.3, etc.)
- SambaNova - Fast inference (Llama 3.3, etc.)
- Amazon Bedrock - AWS-hosted models via IAM auth (Claude, Llama, Mistral, etc.)
- Azure AI Inference - Azure-hosted models (Cohere, Llama, Mistral, etc.)
- Streaming - Real-time response streaming for all providers
- Multimodal - Text, images, audio, video, and PDF input
The crate implements the Llm trait from adk-core, allowing models to be used interchangeably.
Installation
[dependencies]
adk-model = "0.4"
Or use the meta-crate:
[dependencies]
adk-rust = { version = "0.4", features = ["models"] }
Quick Start
Gemini (Google)
use adk_model::GeminiModel;
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("GOOGLE_API_KEY")?;
let model = GeminiModel::new(&api_key, "gemini-3.1-pro-preview")?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
OpenAI
use adk_model::openai::{OpenAIClient, OpenAIConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("OPENAI_API_KEY")?;
let model = OpenAIClient::new(OpenAIConfig::new(api_key, "gpt-5-mini"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
OpenAI Reasoning Effort
For reasoning models (o1, o3, etc.), control how much reasoning effort the model applies:
use adk_model::openai::{OpenAIClient, OpenAIConfig, ReasoningEffort};
let config = OpenAIConfig::new(api_key, "o3-mini")
.with_reasoning_effort(ReasoningEffort::High);
let model = OpenAIClient::new(config)?;
Available levels: Low, Medium, High.
Anthropic (Claude)
use adk_model::anthropic::{AnthropicClient, AnthropicConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("ANTHROPIC_API_KEY")?;
let model = AnthropicClient::new(AnthropicConfig::new(api_key, "claude-sonnet-4-5-20250929"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Anthropic Advanced Features
use adk_model::anthropic::{AnthropicClient, AnthropicConfig};
// Extended thinking with token budget
let config = AnthropicConfig::new(api_key, "claude-sonnet-4-5-20250929")
.with_thinking(8192)
.with_prompt_caching(true)
.with_beta_feature("prompt-caching-2024-07-31");
let client = AnthropicClient::new(config)?;
// Token counting
let count = client.count_tokens(&request).await?;
// Model discovery
let models = client.list_models().await?;
let info = client.get_model("claude-sonnet-4-5-20250929").await?;
// Rate limit inspection
let rate_info = client.latest_rate_limit_info().await;
DeepSeek
use adk_model::deepseek::{DeepSeekClient, DeepSeekConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("DEEPSEEK_API_KEY")?;
// Standard chat model
let model = DeepSeekClient::chat(api_key)?;
// Or use the reasoner model with chain-of-thought
// let model = DeepSeekClient::reasoner(api_key)?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Groq (Ultra-Fast)
use adk_model::groq::{GroqClient, GroqConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("GROQ_API_KEY")?;
let model = GroqClient::new(GroqConfig::llama70b(api_key))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Ollama (Local)
use adk_model::ollama::{OllamaModel, OllamaConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Requires: ollama serve && ollama pull llama3.2
let model = OllamaModel::new(OllamaConfig::new("llama3.2"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Fireworks AI
use adk_model::fireworks::{FireworksClient, FireworksConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("FIREWORKS_API_KEY")?;
let model = FireworksClient::new(FireworksConfig::new(
api_key, "accounts/fireworks/models/llama-v3p1-8b-instruct",
))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Together AI
use adk_model::together::{TogetherClient, TogetherConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("TOGETHER_API_KEY")?;
let model = TogetherClient::new(TogetherConfig::new(
api_key, "meta-llama/Llama-3.3-70B-Instruct-Turbo",
))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Mistral AI
use adk_model::mistral::{MistralClient, MistralConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("MISTRAL_API_KEY")?;
let model = MistralClient::new(MistralConfig::new(api_key, "mistral-small-latest"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Perplexity
use adk_model::perplexity::{PerplexityClient, PerplexityConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("PERPLEXITY_API_KEY")?;
let model = PerplexityClient::new(PerplexityConfig::new(api_key, "sonar"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Cerebras
use adk_model::cerebras::{CerebrasClient, CerebrasConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("CEREBRAS_API_KEY")?;
let model = CerebrasClient::new(CerebrasConfig::new(api_key, "llama-3.3-70b"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
SambaNova
use adk_model::sambanova::{SambaNovaClient, SambaNovaConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("SAMBANOVA_API_KEY")?;
let model = SambaNovaClient::new(SambaNovaConfig::new(api_key, "Meta-Llama-3.3-70B-Instruct"))?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Amazon Bedrock
use adk_model::bedrock::{BedrockClient, BedrockConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Uses AWS IAM credentials from the environment (no API key needed)
let config = BedrockConfig::new("us-east-1", "anthropic.claude-sonnet-4-20250514-v1:0");
let model = BedrockClient::new(config).await?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Azure AI Inference
use adk_model::azure_ai::{AzureAIClient, AzureAIConfig};
use adk_agent::LlmAgentBuilder;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let api_key = std::env::var("AZURE_AI_API_KEY")?;
let config = AzureAIConfig::new(
"https://my-endpoint.eastus.inference.ai.azure.com",
api_key,
"meta-llama-3.1-8b-instruct",
);
let model = AzureAIClient::new(config)?;
let agent = LlmAgentBuilder::new("assistant")
.model(Arc::new(model))
.build()?;
Ok(())
}
Supported Models
Google Gemini
| Model | Description |
|---|---|
gemini-3.1-pro |
Most intelligent AI model, enhancing reasoning and multimodal capabilities. (1M context) |
gemini-3-pro |
Intelligent model for complex agentic workflows (1M context) |
gemini-3-flash |
Fast and efficient for most tasks (1M context) |
gemini-2.5-pro |
Advanced reasoning and multimodal understanding |
gemini-2.5-flash |
Balanced speed and capability (recommended) |
gemini-2.5-flash-lite |
Ultra-fast for high-volume tasks |
gemini-2.0-flash |
Previous generation (retiring March 2026) |
See Gemini models documentation for the full list.
OpenAI
| Model | Description |
|---|---|
gpt-5.1 |
Latest iteration with improved performance (256K context) |
gpt-5 |
State-of-the-art unified model with adaptive thinking |
gpt-5-mini |
Efficient version for most tasks (128K context) |
gpt-4o |
Multimodal model (deprecated August 2025) |
gpt-4o-mini |
Fast and affordable (deprecated August 2025) |
See OpenAI models documentation for the full list.
Anthropic Claude
| Model | Description |
|---|---|
claude-opus-4-5-20251101 |
Most capable model for complex autonomous tasks (200K context) |
claude-sonnet-4-5-20250929 |
Balanced intelligence and cost for production (1M context) |
claude-haiku-4-5-20251001 |
Ultra-efficient for high-volume workloads |
claude-opus-4-20250514 |
Hybrid model with extended thinking |
claude-sonnet-4-20250514 |
Balanced model with extended thinking |
See Anthropic models documentation for the full list.
DeepSeek
| Model | Description |
|---|---|
deepseek-r1-0528 |
Latest reasoning model with enhanced thinking depth (128K context) |
deepseek-r1 |
Advanced reasoning comparable to o1 |
deepseek-v3.1 |
Latest 671B MoE model for general tasks |
deepseek-chat |
671B MoE model, excellent for code (V3) |
deepseek-vl2 |
Vision-language model (32K context) |
Features:
- Thinking Mode - Chain-of-thought reasoning with
<thinking>tags - Context Caching - Automatic KV cache for repeated prefixes (10x cost reduction)
- Tool Calling - Full function calling support
See DeepSeek API documentation for the full list.
Groq
| Model | Description |
|---|---|
llama-4-scout |
Llama 4 Scout (17Bx16E) - Fast via Groq LPU (128K context) |
llama-3.2-90b-text-preview |
Large text model |
llama-3.2-11b-text-preview |
Balanced text model |
llama-3.1-70b-versatile |
Versatile large model |
llama-3.1-8b-instant |
Ultra-fast instruction model |
mixtral-8x7b-32768 |
MoE model with 32K context |
Features:
- Ultra-Fast - LPU-based inference (fastest in the industry)
- Tool Calling - Full function calling support
- Large Context - Up to 128K tokens
See Groq documentation for the full list.
Ollama (Local)
| Model | Description |
|---|---|
llama3.3:70b |
Llama 3.3 70B - Latest for local deployment (128K context) |
llama3.2:3b |
Efficient small model |
llama3.1:8b |
Popular balanced model |
deepseek-r1:14b |
Distilled reasoning model |
deepseek-r1:32b |
Larger distilled reasoning model |
qwen3:14b |
Strong multilingual and coding |
qwen2.5:7b |
Efficient multilingual model (recommended for tool calling) |
mistral:7b |
Fast and capable |
mistral-nemo:12b |
Enhanced Mistral variant (128K context) |
gemma3:9b |
Google's efficient open model |
devstral:24b |
Optimized for coding tasks |
codellama:13b |
Code-focused Llama variant |
Features:
- Local Inference - No API key required
- Privacy - Data stays on your machine
- Tool Calling - Full function calling support (uses non-streaming for reliability)
- MCP Integration - Connect to MCP servers for external tools
See Ollama library for all available models.
New Providers
| Provider | Feature Flag | Default Model | API Key Env Var |
|---|---|---|---|
| Fireworks AI | fireworks |
accounts/fireworks/models/llama-v3p1-8b-instruct |
FIREWORKS_API_KEY |
| Together AI | together |
meta-llama/Llama-3.3-70B-Instruct-Turbo |
TOGETHER_API_KEY |
| Mistral AI | mistral |
mistral-small-latest |
MISTRAL_API_KEY |
| Perplexity | perplexity |
sonar |
PERPLEXITY_API_KEY |
| Cerebras | cerebras |
llama-3.3-70b |
CEREBRAS_API_KEY |
| SambaNova | sambanova |
Meta-Llama-3.3-70B-Instruct |
SAMBANOVA_API_KEY |
| Amazon Bedrock | bedrock |
anthropic.claude-sonnet-4-20250514-v1:0 |
AWS IAM credentials |
| Azure AI Inference | azure-ai |
(endpoint-specific) | AZURE_AI_API_KEY |
Features
- Streaming - Real-time response streaming for all providers
- Tool Calling - Function calling support across all providers
- Async - Full async/await support with backpressure
- Retry - Automatic retry with exponential backoff
- Generation Config - Temperature, top_p, top_k, max_tokens
Environment Variables
# Google Gemini
GOOGLE_API_KEY=your-google-api-key
# OpenAI
OPENAI_API_KEY=your-openai-api-key
# xAI
XAI_API_KEY=your-xai-api-key
# Anthropic
ANTHROPIC_API_KEY=your-anthropic-api-key
# DeepSeek
DEEPSEEK_API_KEY=your-deepseek-api-key
# Groq
GROQ_API_KEY=your-groq-api-key
# Fireworks AI
FIREWORKS_API_KEY=your-fireworks-api-key
# Together AI
TOGETHER_API_KEY=your-together-api-key
# Mistral AI
MISTRAL_API_KEY=your-mistral-api-key
# Perplexity
PERPLEXITY_API_KEY=your-perplexity-api-key
# Cerebras
CEREBRAS_API_KEY=your-cerebras-api-key
# SambaNova
SAMBANOVA_API_KEY=your-sambanova-api-key
# Azure AI Inference
AZURE_AI_API_KEY=your-azure-ai-api-key
# Amazon Bedrock (uses AWS IAM credentials)
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1
# Ollama (no key needed, just start the server)
# ollama serve
Feature Flags
Enable specific providers with feature flags:
[dependencies]
# All providers (default)
adk-model = { version = "0.4", features = ["all-providers"] }
# Individual providers
adk-model = { version = "0.4", features = ["gemini"] }
adk-model = { version = "0.4", features = ["openai"] }
adk-model = { version = "0.4", features = ["xai"] }
adk-model = { version = "0.4", features = ["anthropic"] }
adk-model = { version = "0.4", features = ["deepseek"] }
adk-model = { version = "0.4", features = ["groq"] }
adk-model = { version = "0.4", features = ["ollama"] }
adk-model = { version = "0.4", features = ["fireworks"] }
adk-model = { version = "0.4", features = ["together"] }
adk-model = { version = "0.4", features = ["mistral"] }
adk-model = { version = "0.4", features = ["perplexity"] }
adk-model = { version = "0.4", features = ["cerebras"] }
adk-model = { version = "0.4", features = ["sambanova"] }
adk-model = { version = "0.4", features = ["bedrock"] }
adk-model = { version = "0.4", features = ["azure-ai"] }
Related Crates
- adk-rust - Meta-crate with all components
- adk-core - Core
Llmtrait - adk-agent - Agent implementations
License
Apache-2.0
Part of ADK-Rust
This crate is part of the ADK-Rust framework for building AI agents in Rust.
Dependencies
~17–57MB
~1M SLoC