Latest release: 0.1.1 (Sep 30, 2025) · Rust 2024 edition
Tllama
Lightweight Local LLM Inference Engine
Tllama is a Rust-based, open-source LLM inference engine designed for efficient local execution. It provides a command-line interface and an OpenAI-compatible API for seamless model interaction.
Key Features
- Smart model detection
- Full OpenAI API compatibility
- Blazing-fast startup (<0.5s)
- Ultra-compact binary (<20MB)
Installation
Script install
curl -sSL https://raw.githubusercontent.com/moyanj/tllama/main/install.sh | bash
Cargo install
cargo install tllama
Pre-built binaries
Download from Releases
Usage Guide
Discover models
tllama discover [--all]
Text generation
tllama infer <model_path> "<prompt>" [options]

Options:
  --n-len <tokens>            Output length (default: 128)
  --temperature <value>       Randomness (0-1)
  --top-k <value>             Top-k sampling
  --repeat-penalty <value>    Repetition penalty
Example:
tllama infer ./llama3-8b.gguf "The future of AI is" \
--temperature 0.7 \
--n-len 256
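The sampling flags above can be illustrated with a short sketch. This is a generic simplification of temperature, top-k, and repeat-penalty sampling in Python, not Tllama's actual implementation; the toy logits and the function name are invented for the example:

```python
# Illustrative only: how temperature, top-k, and repeat-penalty shape
# next-token sampling in a typical LLM decoder. Not Tllama's code.
import math
import random

def sample_next(logits, temperature=0.7, top_k=3, repeat_penalty=1.1, recent=()):
    # Down-weight tokens that already appeared (simplified repeat penalty).
    adjusted = {
        tok: (score / repeat_penalty if tok in recent else score)
        for tok, score in logits.items()
    }
    # Keep only the top-k highest-scoring candidates.
    top = sorted(adjusted.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    weights = [math.exp(score / temperature) for _, score in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices([tok for tok, _ in top], weights=probs)[0]

toy_logits = {"the": 3.0, "cat": 2.0, "dog": 1.5, "fish": 0.5}
print(sample_next(toy_logits, temperature=0.7, top_k=2, recent=("the",)))
```

With `top_k=1` this degenerates to greedy decoding; raising `--temperature` spreads probability mass across the surviving candidates.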
Interactive chat
tllama chat <model_path>
Start API server
tllama serve [options]

Options:
  --host <address>    Bind address (default: 0.0.0.0)
  --port <port>       Port (default: 8080)
Chat API Example:
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3-8b",
"messages": [
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Explain Rust's memory safety"}
],
"temperature": 0.7,
"max_tokens": 200
}'
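The same request can be made from Python with only the standard library. A minimal sketch, assuming a Tllama server is running locally on port 8080 as started above; the helper names are invented for the example:

```python
# Call the OpenAI-compatible chat endpoint with Python's stdlib (no SDK).
import json
import urllib.request

def build_chat_request(model, messages, temperature=0.7, max_tokens=200):
    # Assemble an OpenAI-style chat completion payload.
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def chat(url, payload):
    # POST the JSON payload and decode the JSON response.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    "llama3-8b",
    [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Explain Rust's memory safety"},
    ],
)
print(json.dumps(payload, indent=2))

# With the server running:
#   reply = chat("http://localhost:8080/v1/chat/completions", payload)
#   print(reply["choices"][0]["message"]["content"])
```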
Development Roadmap
- Core CLI implementation
- GGUF quantized model support
- Model auto-download & caching
- Web UI integration
- Comprehensive test suite
Contributing
PRs welcome! See CONTRIBUTING.md for guidelines.
License
MIT License
Design Philosophy
- Terminal-first: optimized for CLI workflows, with roughly 10x faster startup than Ollama
- Minimal footprint: a single compact binary (<20MB) with zero external dependencies
- Seamless integration: compatible with OpenAI SDKs and LangChain
Contact
- GitHub: moyanj/tllama
- Issues: Report bugs
- Feature requests: Open discussion issue
Star us on GitHub to show your support!