29 releases (5 breaking)
| Version | Released |
|---|---|
| 0.6.5 (new) | Feb 15, 2026 |
| 0.6.2 | Jan 30, 2026 |
| 0.2.0 | Dec 17, 2025 |
| 0.1.1 | Nov 28, 2025 |
#659 in Command line utilities
5.5MB, 126K SLoC
batuta
Orchestration framework for the Sovereign AI Stack — privacy-preserving ML infrastructure in pure Rust
Overview
Batuta coordinates the Sovereign AI Stack, a comprehensive pure-Rust ecosystem for organizations requiring complete control over their ML infrastructure. The stack enables privacy-preserving inference, model management, and data processing without external cloud dependencies.
Key Capabilities
- Privacy Tiers: Sovereign (local-only), Private (VPC), Standard (cloud-enabled)
- Model Security: Ed25519 signatures, ChaCha20-Poly1305 encryption, BLAKE3 content addressing
- API Compatibility: OpenAI-compatible endpoints for drop-in replacement
- Observability: Prometheus metrics, distributed tracing, A/B testing
- Cost Control: Circuit breakers with configurable daily budgets
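The cost-control capability can be pictured as a breaker that refuses requests once accumulated spend would exceed a daily budget. The following is a minimal sketch under stated assumptions: `CostBreaker`, `try_spend`, `is_open`, and `reset_day` are hypothetical names chosen for illustration, not Batuta's API, and costs are tracked in integer cents to avoid floating-point drift.

```rust
/// Hypothetical daily-budget circuit breaker (illustration only, not Batuta's API).
#[derive(Debug)]
struct CostBreaker {
    daily_budget_cents: u64,
    spent_cents: u64,
}

impl CostBreaker {
    fn new(daily_budget_cents: u64) -> Self {
        Self { daily_budget_cents, spent_cents: 0 }
    }

    /// Records the cost and returns true if it fits in the remaining budget.
    /// Otherwise the running total is left unchanged and false is returned.
    fn try_spend(&mut self, cost_cents: u64) -> bool {
        match self.spent_cents.checked_add(cost_cents) {
            Some(total) if total <= self.daily_budget_cents => {
                self.spent_cents = total;
                true
            }
            _ => false,
        }
    }

    /// True once the whole daily budget has been consumed.
    fn is_open(&self) -> bool {
        self.spent_cents >= self.daily_budget_cents
    }

    /// Called at the start of a new day to close the breaker again.
    fn reset_day(&mut self) {
        self.spent_cents = 0;
    }
}

fn main() {
    let mut breaker = CostBreaker::new(500); // budget of 500 cents per day
    println!("first request accepted: {}", breaker.try_spend(300));
    println!("second request accepted: {}", breaker.try_spend(300));
    println!("breaker open: {}", breaker.is_open());
    breaker.reset_day();
    println!("breaker open after reset: {}", breaker.is_open());
}
```

Rejecting a request without recording it keeps the total an accurate reflection of what was actually spent, so a smaller request can still succeed later the same day.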
Installation
cargo install batuta
Or add to your Cargo.toml:
[dependencies]
batuta = "0.4"
Quick Start
# Analyze project structure and dependencies
batuta analyze --languages --dependencies --tdg
# Query the Sovereign AI Stack
batuta oracle "How do I serve a Llama model locally?"
# Model registry operations
batuta pacha pull llama3-8b-q4
batuta pacha sign model.gguf --identity alice@example.com
batuta pacha verify model.gguf
# Encrypt models for distribution
batuta pacha encrypt model.gguf --password-env MODEL_KEY
batuta pacha decrypt model.gguf.enc --password-env MODEL_KEY
Usage
Project Analysis
# Full project analysis with TDG scoring
batuta analyze --languages --dependencies --tdg .
# Language detection only
batuta analyze --languages .
# Output formats: text (default), json, markdown
batuta analyze --format json .
Oracle Queries
# Natural language queries about the Sovereign AI Stack
batuta oracle "How do I train a random forest model?"
# RAG-based documentation search (requires indexing first)
batuta oracle --rag-index # Index stack documentation
batuta oracle --rag "tokenization" # Search indexed docs
# Interactive oracle mode
batuta oracle --interactive
Stack Management
# Check stack component versions
batuta stack versions
# Quality matrix for all components
batuta stack quality
# Dependency health check
batuta stack check
Demo
Live Demo: paiml.github.io/batuta | API Docs
Example Output (batuta analyze --tdg):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Technical Debt Gradient Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Project: my-project
Language: Rust (confidence: 98%)
Metrics:
Cyclomatic Complexity: 4.2 avg (good)
Test Coverage: 87% (A-)
Documentation: 92% (A)
Dependency Health: 95% (A+)
TDG Score: 91.5/100 (A)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Stack Components
Batuta orchestrates a layered architecture of pure-Rust components:
┌─────────────────────────────────────────────────────────────┐
│                        batuta v0.4.8                        │
│                    (Orchestration Layer)                    │
├─────────────────────────────────────────────────────────────┤
│      realizar v0.5       │            pacha v0.2            │
│    (Inference Engine)    │         (Model Registry)         │
├──────────────────────────┴──────────────────────────────────┤
│   aprender v0.24    │  entrenar v0.5   │   alimentar v0.2   │
│   (ML Algorithms)   │    (Training)    │   (Data Loading)   │
├─────────────────────────────────────────────────────────────┤
│    trueno v0.11     │  repartir v2.0   │    renacer v0.9    │
│ (SIMD/GPU Compute)  │  (Distributed)   │ (Syscall Tracing)  │
└─────────────────────────────────────────────────────────────┘
Core Components
| Component | Version | Description |
|---|---|---|
| trueno | 0.11 | SIMD/GPU compute primitives (AVX2/AVX-512/NEON, wgpu) |
| aprender | 0.24 | ML algorithms: regression, trees, clustering, NAS |
| entrenar | 0.5 | Training: autograd, LoRA/QLoRA, quantization |
| realizar | 0.5 | Inference engine for GGUF/SafeTensors models |
| pacha | 0.2 | Model registry with signatures, encryption, lineage |
| repartir | 2.0 | Distributed compute (CPU/GPU/Remote executors) |
| renacer | 0.9 | Syscall tracing with semantic validation |
| batuta | 0.4 | Stack orchestration, drift detection, CLI |
Extended Ecosystem
| Component | Version | Description |
|---|---|---|
| trueno-db | 0.3 | GPU-accelerated analytics database |
| trueno-graph | 0.1 | Graph database for code analysis |
| trueno-rag | 0.1 | RAG pipeline (chunking, BM25+vector, RRF) |
| trueno-viz | 0.1 | Terminal/PNG visualization |
| alimentar | 0.2 | Zero-copy Parquet/Arrow data loading |
| whisper-apr | 0.1 | Pure Rust Whisper ASR (WASM-first) |
| jugar | 0.1 | Game engine (ECS, physics, AI, WASM) |
| simular | 0.3 | Simulation engine (Monte Carlo, physics) |
| bashrs | 6.53 | Shell-to-Rust transpiler and linter |
| presentar | 0.3 | Terminal presentation framework |
| pmat | 2.213 | Project quality analysis toolkit |
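The trueno-rag entry mentions RRF (Reciprocal Rank Fusion) for combining BM25 and vector results. As a rough illustration of the technique itself, and not of trueno-rag's API, the sketch below fuses two ranked lists by summing `1 / (k + rank)` per document. The function name `rrf`, the document ids, and the choice `k = 60.0` (a commonly used default) are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: every ranked list contributes 1 / (k + rank)
/// to a document's score, with rank starting at 1.
fn rrf(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by document id for stable output.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Hypothetical result lists from a lexical and a vector retriever.
    let bm25 = vec!["doc_a", "doc_b", "doc_c"];
    let vector = vec!["doc_b", "doc_d", "doc_a"];
    for (doc, score) in rrf(&[bm25, vector], 60.0) {
        println!("{doc}: {score:.5}");
    }
}
```

Because RRF uses only ranks, the two retrievers do not need comparable score scales, which is the usual reason for choosing it in hybrid search.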
Commands
batuta analyze
Analyze project structure, languages, and dependencies:
batuta analyze --languages --dependencies --tdg
# Output:
# Primary language: Python
# Dependencies: pip (42 packages), ML frameworks detected
# TDG Score: 73.2/100 (B)
# Recommended: Use Aprender for ML, Realizar for inference
batuta oracle
Query the stack for component recommendations:
# Natural language queries
batuta oracle "Train random forest on 1M samples"
# List all components
batuta oracle --list
# Component details
batuta oracle --show realizar
# Interactive mode
batuta oracle --interactive
batuta pacha
Model registry operations:
# Pull models from registry
batuta pacha pull llama3-8b-q4
# Generate signing keys
batuta pacha keygen --identity alice@example.com
# Sign models for distribution
batuta pacha sign model.gguf --identity alice@example.com
# Verify model signatures
batuta pacha verify model.gguf
# Encrypt models at rest
batuta pacha encrypt model.gguf --password-env MODEL_KEY
# Decrypt for inference
batuta pacha decrypt model.gguf.enc --password-env MODEL_KEY
batuta content
Generate structured content with quality constraints:
# Available content types
batuta content types
# Generate book chapter prompt
batuta content emit --type bch --title "Error Handling" --audience "developers"
# Validate content quality
batuta content validate --type bch chapter.md
batuta stack
Manage the Sovereign AI Stack ecosystem:
# Check stack component versions
batuta stack versions
# Detect version drift across published crates
batuta stack drift
# Generate fix commands for drift issues
batuta stack drift --fix --workspace ~/src
# Check which crates need publishing
batuta stack publish-status
# Quality gate for CI/pre-commit
batuta stack gate
Automatic Drift Detection: Batuta blocks all commands if published stack crates
are using outdated versions of other stack crates. Use --unsafe-skip-drift-check
to bypass in emergencies.
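To make the idea of version drift concrete, the sketch below compares the version of a stack crate that another crate pins against the latest published version. It is an illustration under assumptions, not Batuta's implementation: `parse_version` and `find_drift` are hypothetical helpers, the sample data is invented, and "drift" is simplified to mean that the pinned major.minor is older than the latest major.minor.

```rust
use std::collections::BTreeMap;

/// Parses "major.minor" or "major.minor.patch" into a comparable tuple.
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = parts.next().unwrap_or(Ok(0)).ok()?;
    let patch = parts.next().unwrap_or(Ok(0)).ok()?;
    Some((major, minor, patch))
}

/// Reports every (crate, dependency, pinned version) entry whose pinned
/// major.minor is older than the latest published major.minor.
/// This rule is a simplifying assumption for the example.
fn find_drift(pinned: &[(&str, &str, &str)], latest: &BTreeMap<&str, &str>) -> Vec<String> {
    let mut report = Vec::new();
    for &(krate, dep, pinned_v) in pinned {
        // Skip dependencies we have no published version for.
        let Some(&latest_v) = latest.get(dep) else { continue };
        // Skip entries whose versions cannot be parsed.
        let (Some(p), Some(l)) = (parse_version(pinned_v), parse_version(latest_v)) else {
            continue;
        };
        if (p.0, p.1) < (l.0, l.1) {
            report.push(format!("{krate}: {dep} {pinned_v} -> {latest_v}"));
        }
    }
    report
}

fn main() {
    // Invented sample data; real values would come from the crate registry.
    let latest = BTreeMap::from([("trueno", "0.11"), ("aprender", "0.24")]);
    let pinned = [("realizar", "trueno", "0.10"), ("entrenar", "aprender", "0.24.1")];
    for line in find_drift(&pinned, &latest) {
        println!("drift: {line}");
    }
}
```

Comparing parsed tuples rather than strings matters here: as text, "0.10" sorts before "0.9", which would hide or invent drift.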
Privacy Tiers
The stack enforces data sovereignty through configurable privacy tiers:
| Tier | Behavior | Use Case |
|---|---|---|
| Sovereign | Blocks ALL external API calls | Healthcare, Government |
| Private | VPC/dedicated endpoints only | Financial services |
| Standard | Public APIs allowed | General deployment |
use batuta::serve::{BackendSelector, PrivacyTier};
let selector = BackendSelector::new()
    .with_privacy(PrivacyTier::Sovereign);
// Returns only local backends: Realizar, Ollama, LlamaCpp
let backends = selector.recommend();
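The tier table can be read as a filter over where a backend runs. The standalone sketch below models that reading without depending on the batuta crate. `Tier`, `Locality`, `allowed`, and `recommend` are hypothetical stand-ins, the `vpc-endpoint` and `public-api` backends are invented, and the sketch assumes that the Private tier also permits local backends, which the table does not state explicitly.

```rust
/// Hypothetical stand-ins for illustration; not the types exported by batuta::serve.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Sovereign,
    Private,
    Standard,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Locality {
    Local,
    Vpc,
    PublicCloud,
}

/// Encodes the tier table: Sovereign is local-only, Private excludes
/// public endpoints (assumed to still allow local), Standard allows all.
fn allowed(tier: Tier, locality: Locality) -> bool {
    match tier {
        Tier::Sovereign => locality == Locality::Local,
        Tier::Private => locality != Locality::PublicCloud,
        Tier::Standard => true,
    }
}

/// Keeps only the backends the tier permits, preserving input order.
fn recommend<'a>(tier: Tier, backends: &[(&'a str, Locality)]) -> Vec<&'a str> {
    backends
        .iter()
        .filter(|(_, locality)| allowed(tier, *locality))
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    let backends = [
        ("Realizar", Locality::Local),
        ("Ollama", Locality::Local),
        ("vpc-endpoint", Locality::Vpc),       // invented example backend
        ("public-api", Locality::PublicCloud), // invented example backend
    ];
    for tier in [Tier::Sovereign, Tier::Private, Tier::Standard] {
        println!("{:?}: {:?}", tier, recommend(tier, &backends));
    }
}
```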
Model Security
Digital Signatures (Ed25519)
Verify model integrity before loading:
use pacha::signing::{SigningKey, sign_model, verify_model};
let signing_key = SigningKey::generate();
let signature = sign_model(&model_data, &signing_key)?;
// Verification fails if the model has been tampered with
verify_model(&model_data, &signature)?;
Encryption at Rest (ChaCha20-Poly1305)
Protect models during distribution:
use pacha::crypto::{encrypt_model, decrypt_model};
let encrypted = encrypt_model(&model_data, "password")?;
let decrypted = decrypt_model(&encrypted, "password")?;
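The CLI examples pass the passphrase through an environment variable (`--password-env MODEL_KEY`) rather than a literal. A library caller can do the same instead of hardcoding a string. The sketch below shows one way to resolve and validate such a value using only the standard library. `passphrase_from` is a hypothetical helper, not part of pacha, and the lookup is injected as a closure so the logic can be exercised without touching the process environment.

```rust
use std::env;

/// Resolves a passphrase through `lookup`, rejecting missing or empty values.
/// Hypothetical helper for illustration; not part of the pacha crate.
fn passphrase_from<F>(var: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup(var) {
        Some(p) if !p.is_empty() => Ok(p),
        Some(_) => Err(format!("environment variable {var} is empty")),
        None => Err(format!("environment variable {var} is not set")),
    }
}

fn main() {
    // In real use the lookup is the process environment.
    match passphrase_from("MODEL_KEY", |name| env::var(name).ok()) {
        // Never print the passphrase itself.
        Ok(_) => println!("passphrase loaded from MODEL_KEY"),
        Err(e) => eprintln!("{e}"),
    }
}
```

The resolved string could then be passed where the example above uses the literal "password".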
Documentation
- The Batuta Book — Comprehensive guide
- Sovereign AI Stack Book — Complete stack tutorial with 22 chapters
- API Documentation — Rust API reference
- Specifications — Technical specifications
Design Principles
Batuta applies Toyota Production System principles:
| Principle | Application |
|---|---|
| Jidoka | Automatic failover with context preservation |
| Poka-Yoke | Privacy tiers prevent data leakage |
| Heijunka | Spillover routing for load leveling |
| Muda | Cost circuit breakers prevent waste |
| Kaizen | Continuous metrics and optimization |
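As an illustration of the Heijunka row, spillover routing can be sketched as "use the preferred backend until it is full, then overflow to the next one". The code below is a toy model under assumptions: `Backend`, `route`, the backend names, and the capacity numbers are all hypothetical, and Batuta's actual routing logic is not described in this document.

```rust
/// Toy backend with a fixed number of concurrent request slots.
/// Hypothetical type for illustration only.
struct Backend {
    name: &'static str,
    capacity: usize,
    in_flight: usize,
}

/// Assigns a request to the first backend (in preference order) that
/// still has spare capacity. Returns None when every backend is full.
fn route(backends: &mut [Backend]) -> Option<&'static str> {
    for b in backends.iter_mut() {
        if b.in_flight < b.capacity {
            b.in_flight += 1;
            return Some(b.name);
        }
    }
    None
}

fn main() {
    // Preference order: local first, then a spillover target.
    let mut backends = [
        Backend { name: "local", capacity: 2, in_flight: 0 },
        Backend { name: "spillover", capacity: 1, in_flight: 0 },
    ];
    for request in 1..=4 {
        match route(&mut backends) {
            Some(name) => println!("request {request} -> {name}"),
            None => println!("request {request} rejected: all backends full"),
        }
    }
}
```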
Development
# Clone repository
git clone https://github.com/paiml/batuta.git
cd batuta
# Build
cargo build --release
# Run tests
cargo test
# Build documentation
mdbook build book
Contributing
Contributions are welcome! Please follow these guidelines:
- Fork the repository and create your branch from main
- Run tests before submitting: cargo test --all-features
- Run lints: cargo clippy --all-targets --all-features -- -D warnings
- Format code: cargo fmt --all
- Update documentation for any API changes
- Submit a pull request with a clear description
See our CI workflow for the full test suite.
License
MIT License — see LICENSE for details.
Links
- crates.io/crates/batuta
- GitHub Repository
- Documentation Book
- Sovereign AI Stack Specification
- 🤖 Coursera Hugging Face AI Development Specialization - Build Production AI systems with Hugging Face in Pure Rust
Batuta — Orchestrating sovereign AI infrastructure.
Dependencies
~2–86MB, ~1.5M SLoC