#structured-output #pydantic #instructor #llm #openai

rstructor

Rust equivalent of Python's Instructor + Pydantic: Extract structured, validated data from LLMs (OpenAI, Anthropic, Grok, Gemini) using type-safe Rust structs and enums

22 releases

Uses new Rust 2024

0.2.9 Feb 13, 2026
0.2.8 Feb 6, 2026
0.2.7 Dec 31, 2025
0.1.26 Nov 8, 2025
0.1.10 Mar 27, 2025

#43 in Machine learning


206 downloads per month
Used in 3 crates

MIT license

305KB
5.5K SLoC

rstructor: Structured LLM Outputs for Rust


Extract structured, validated data from LLMs using native Rust types. Define your schema as structs/enums, and rstructor handles JSON Schema generation, API communication, parsing, and validation.

The Rust equivalent of Instructor for Python.

Features

  • Type-safe schemas — Define models as Rust structs/enums with derive macros
  • Multi-provider — OpenAI, Anthropic, Grok (xAI), and Gemini with unified API
  • Auto-validation — Type checking plus custom business rules with automatic retry
  • Complex types — Nested objects, arrays, optionals, enums with associated data
  • Extended thinking — Native support for reasoning models (GPT-5.2, Claude 4.5, Gemini 3)

Installation

[dependencies]
rstructor = "0.2"
serde = { version = "1.0", features = ["derive"] }
tokio = { version = "1.0", features = ["rt-multi-thread", "macros"] }

Quick Start

use rstructor::{Instructor, LLMClient, OpenAIClient};
use serde::{Deserialize, Serialize};

#[derive(Instructor, Serialize, Deserialize, Debug)]
struct Movie {
    #[llm(description = "Title of the movie")]
    title: String,
    #[llm(description = "Director of the movie")]
    director: String,
    #[llm(description = "Year released", example = 2010)]
    year: u16,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAIClient::from_env()?
        .temperature(0.0);

    let movie: Movie = client.materialize("Tell me about Inception").await?;
    println!("{}: {} ({})", movie.title, movie.director, movie.year);
    Ok(())
}

Providers

use rstructor::{OpenAIClient, AnthropicClient, GrokClient, GeminiClient, LLMClient};

// OpenAI (reads OPENAI_API_KEY)
let client = OpenAIClient::from_env()?.model("gpt-5.2");

// Anthropic (reads ANTHROPIC_API_KEY)
let client = AnthropicClient::from_env()?.model("claude-opus-4-6");

// Grok/xAI (reads XAI_API_KEY)
let client = GrokClient::from_env()?.model("grok-4-1-fast-non-reasoning");

// Gemini (reads GEMINI_API_KEY)
let client = GeminiClient::from_env()?.model("gemini-3-flash-preview");

// Custom endpoint (local LLMs, proxies)
let client = OpenAIClient::new("key")?
    .base_url("http://localhost:1234/v1")
    .model("llama-3.1-70b");

Validation

Add custom validation with automatic retry on failure:

use rstructor::{Instructor, RStructorError, Result};

#[derive(Instructor, Serialize, Deserialize)]
#[llm(validate = "validate_movie")]
struct Movie {
    title: String,
    year: u16,
    rating: f32,
}

fn validate_movie(movie: &Movie) -> Result<()> {
    if movie.year < 1888 || movie.year > 2030 {
        return Err(RStructorError::ValidationError(
            format!("Invalid year: {}", movie.year)
        ));
    }
    if movie.rating < 0.0 || movie.rating > 10.0 {
        return Err(RStructorError::ValidationError(
            format!("Rating must be 0-10, got {}", movie.rating)
        ));
    }
    Ok(())
}

// Retries are enabled by default (3 attempts with error feedback)
// To increase retries:
let client = OpenAIClient::from_env()?.max_retries(5);

// To disable retries:
let client = OpenAIClient::from_env()?.no_retries();
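Putting the pieces together, a failed check feeds the error message back to the model on the next attempt. A minimal end-to-end sketch, assuming the validated Movie struct and the OpenAIClient setup shown above:

```rust
// Sketch: extraction with validation-driven retry. Assumes the Movie
// struct with #[llm(validate = "validate_movie")] and an OpenAI client
// configured as above; the prompt is illustrative.
let client = OpenAIClient::from_env()?.max_retries(5);
let movie: Movie = client.materialize("Tell me about Casablanca").await?;
// If materialize returns Ok, validate_movie has already passed:
assert!((1888..=2030).contains(&movie.year));
assert!((0.0..=10.0).contains(&movie.rating));
```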

Complex Types

Nested Structures

#[derive(Instructor, Serialize, Deserialize)]
struct Ingredient {
    name: String,
    amount: f32,
    unit: String,
}

#[derive(Instructor, Serialize, Deserialize)]
struct Recipe {
    name: String,
    ingredients: Vec<Ingredient>,
    prep_time_minutes: u16,
}
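Nested models are extracted in a single call, just like flat structs. A usage sketch, assuming an OpenAIClient configured as in the Quick Start and the Recipe/Ingredient types above (the prompt is illustrative):

```rust
// Sketch: nested extraction. The schema sent to the model includes the
// full Ingredient object definition inside the Recipe array field.
let client = OpenAIClient::from_env()?;
let recipe: Recipe = client
    .materialize("Give me a simple pancake recipe")
    .await?;
for ing in &recipe.ingredients {
    println!("{} {} {}", ing.amount, ing.unit, ing.name);
}
```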

Enums with Data

#[derive(Instructor, Serialize, Deserialize)]
enum PaymentMethod {
    #[llm(description = "Credit card payment")]
    Card { number: String, expiry: String },
    #[llm(description = "PayPal account")]
    PayPal(String),
    #[llm(description = "Cash on delivery")]
    CashOnDelivery,
}

Serde Rename Support

rstructor respects #[serde(rename)] and #[serde(rename_all)] attributes:

#[derive(Instructor, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct UserProfile {
    first_name: String,      // becomes "firstName" in schema
    last_name: String,       // becomes "lastName" in schema
    email_address: String,   // becomes "emailAddress" in schema
}

#[derive(Instructor, Serialize, Deserialize)]
struct CommitMessage {
    #[serde(rename = "type")]  // use "type" as JSON key
    commit_type: String,
    description: String,
}

#[derive(Instructor, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
enum CommitType {
    Fix,       // becomes "fix"
    Feat,      // becomes "feat"
    Refactor,  // becomes "refactor"
}

Supported case conversions: lowercase, UPPERCASE, camelCase, PascalCase, snake_case, SCREAMING_SNAKE_CASE, kebab-case, SCREAMING-KEBAB-CASE.

Custom Types (Dates, UUIDs)

use chrono::{DateTime, Utc};
use rstructor::schema::CustomTypeSchema;

impl CustomTypeSchema for DateTime<Utc> {
    fn schema_type() -> &'static str { "string" }
    fn schema_format() -> Option<&'static str> { Some("date-time") }
}

#[derive(Instructor, Serialize, Deserialize)]
struct Event {
    name: String,
    start_time: DateTime<Utc>,
}

Multimodal (Image Input)

Analyze images with structured extraction across all major providers using materialize_with_media:

use rstructor::{Instructor, LLMClient, OpenAIClient, MediaFile};

#[derive(Instructor, Serialize, Deserialize, Debug)]
struct ImageAnalysis {
    subject: String,
    summary: String,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Download the image bytes (any image source works)
    let image_bytes = reqwest::get("https://example.com/image.png")
        .await?.bytes().await?;

    // Inline media is base64-encoded automatically
    let media = MediaFile::from_bytes(&image_bytes, "image/png");

    // Works with OpenAI, Anthropic, Grok, and Gemini clients
    let client = OpenAIClient::from_env()?;
    let analysis: ImageAnalysis = client
        .materialize_with_media("Describe this image", &[media])
        .await?;
    println!("{:?}", analysis);
    Ok(())
}

MediaFile::new(uri, mime_type) is also available for URL/URI-based media input.
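A URI-based sketch of the same flow, assuming MediaFile::new accepts the same MIME types as from_bytes and reusing the ImageAnalysis type and client from the example above:

```rust
// Sketch: URL-based media input via MediaFile::new instead of
// downloading and inlining the bytes yourself. The URL is illustrative.
let media = MediaFile::new("https://example.com/image.png", "image/png");
let analysis: ImageAnalysis = client
    .materialize_with_media("Describe this image", &[media])
    .await?;
```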

Provider examples:

  • cargo run --example openai_multimodal_example --features openai
  • cargo run --example anthropic_multimodal_example --features anthropic
  • cargo run --example grok_multimodal_example --features grok
  • cargo run --example gemini_multimodal_example --features gemini

Extended Thinking

Configure reasoning depth for supported models:

use rstructor::ThinkingLevel;

// GPT-5.2, Claude 4.5 (Sonnet/Opus), Gemini 3
let client = OpenAIClient::from_env()?
    .model("gpt-5.2")
    .thinking_level(ThinkingLevel::High);

// Levels: Off, Minimal, Low, Medium, High

Token Usage

let result = client.materialize_with_metadata::<Movie>("...").await?;
println!("Movie: {}", result.data.title);
if let Some(usage) = result.usage {
    println!("Tokens: {} in, {} out", usage.input_tokens, usage.output_tokens);
}

Error Handling

use rstructor::{ApiErrorKind, RStructorError};

match client.materialize::<Movie>("...").await {
    Ok(movie) => println!("{:?}", movie),
    Err(e) if e.is_retryable() => {
        println!("Transient error: {}", e);
        if let Some(delay) = e.retry_delay() {
            tokio::time::sleep(delay).await;
        }
    }
    Err(e) => match e.api_error_kind() {
        Some(ApiErrorKind::RateLimited { retry_after }) => { /* ... */ }
        Some(ApiErrorKind::AuthenticationFailed) => { /* ... */ }
        _ => eprintln!("Error: {}", e),
    }
}

Feature Flags

[dependencies]
rstructor = { version = "0.2", features = ["openai", "anthropic", "grok", "gemini"] }

  • openai, anthropic, grok, gemini — Provider backends
  • derive — Derive macro (default)
  • logging — Tracing integration

Examples

See examples/ for complete working examples:

export OPENAI_API_KEY=your_key
cargo run --example structured_movie_info
cargo run --example nested_objects_example
cargo run --example enum_with_data_example
cargo run --example serde_rename_example
cargo run --example gemini_multimodal_example

For Python Developers

If you're coming from Python and searching for:

  • "pydantic rust" or "rust pydantic" — rstructor provides similar schema validation and type safety
  • "instructor rust" or "rust instructor" — same structured LLM output extraction pattern
  • "structured output rust" or "llm structured output" — exactly what rstructor does
  • "type-safe llm rust" — ensures type safety from LLM responses to Rust structs

License

MIT — see LICENSE

Dependencies

~6–23MB
~228K SLoC