3 releases
| Version | Date |
|---|---|
| 0.1.6 | Aug 23, 2025 |
| 0.1.5 | Aug 23, 2025 |
| 0.1.4 | Aug 23, 2025 |
| 0.1.3 | |
| 0.1.2 | |
Ultrafast Gateway
A high-performance AI gateway built in Rust that provides a unified interface to 10+ LLM providers with advanced routing, caching, and monitoring capabilities.
Features
Core Capabilities
- Multi-Provider Support: OpenAI, Anthropic, Google, Groq, Mistral, Cohere, Perplexity, Ollama, and more
- Intelligent Routing: Automatic provider selection and load balancing
- Advanced Caching: Built-in response caching with TTL and invalidation
- Circuit Breakers: Automatic failover and recovery mechanisms
- Rate Limiting: Per-user and per-provider rate limiting
- Real-time Monitoring: Live metrics, analytics, and health checks
Performance Features
- High Throughput: Built with Rust for maximum performance
- Async Processing: Non-blocking I/O with Tokio runtime
- Connection Pooling: Efficient HTTP client management
- Response Streaming: Real-time streaming support
- Memory Optimization: Minimal memory footprint
Enterprise Features
- API Key Management: Virtual API keys with rate limiting
- JWT Authentication: Stateless token-based authentication
- Request Validation: Comprehensive input sanitization
- Content Filtering: Plugin-based content moderation
- Audit Logging: Complete request/response logging
Installation
From Crates.io
cargo add ultrafast-gateway
From Source
git clone https://github.com/techgopal/ultrafast-ai-gateway.git
cd ultrafast-ai-gateway/ultrafast-gateway
cargo build --release
Quick Start
Basic Usage
use ultrafast_gateway::{Gateway, GatewayConfig, ProviderConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create gateway configuration
    let config = GatewayConfig::default()
        .with_provider(ProviderConfig::openai("your-openai-key"))
        .with_provider(ProviderConfig::anthropic("your-anthropic-key"))
        .with_cache_enabled(true)
        .with_rate_limiting(true);

    // Initialize gateway
    let gateway = Gateway::new(config).await?;

    // Start the server
    gateway.serve("127.0.0.1:3000").await?;
    Ok(())
}
Advanced Configuration
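use std::time::Duration;

// EvictionPolicy is assumed to be exported from the crate root like the other config types.
use ultrafast_gateway::{
    CacheConfig, CircuitBreakerConfig, EvictionPolicy, Gateway, GatewayConfig, ProviderConfig,
    RateLimitConfig,
};

let config = GatewayConfig::default()
    // Per-provider circuit breaker: open after 5 failures, retry after 60 s
    .with_provider(
        ProviderConfig::openai("sk-...").with_circuit_breaker(CircuitBreakerConfig {
            failure_threshold: 5,
            recovery_timeout: Duration::from_secs(60),
            request_timeout: Duration::from_secs(30),
        }),
    )
    // Cache up to 10,000 responses for one hour, evicting least recently used
    .with_cache(CacheConfig {
        ttl: Duration::from_secs(3600),
        max_size: 10000,
        eviction_policy: EvictionPolicy::LRU,
    })
    // 100 requests per minute per user, with bursts of up to 20
    .with_rate_limiting(RateLimitConfig {
        requests_per_minute: 100,
        burst_size: 20,
        per_user: true,
    })
    .with_authentication(true)
    .with_monitoring(true);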
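To make the circuit breaker settings concrete, here is a minimal standard-library sketch of how `failure_threshold` and `recovery_timeout` typically interact. The `CircuitBreaker` and `State` types below are hypothetical stand-ins written for illustration, not the crate's own implementation, and the clock is passed in explicitly so the behaviour is easy to follow.

```rust
use std::time::{Duration, Instant};

/// Closed passes traffic, Open rejects it, HalfOpen lets trial requests through.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed,
    Open { since: Instant },
    HalfOpen,
}

/// Hypothetical illustration type; not the crate's CircuitBreakerConfig.
struct CircuitBreaker {
    failure_threshold: u32,
    recovery_timeout: Duration,
    failures: u32,
    state: State,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, recovery_timeout: Duration) -> Self {
        Self { failure_threshold, recovery_timeout, failures: 0, state: State::Closed }
    }

    /// Returns true if a request may be sent to the provider at time `now`.
    fn allow(&mut self, now: Instant) -> bool {
        if let State::Open { since } = self.state {
            if now.duration_since(since) >= self.recovery_timeout {
                // The recovery timeout has elapsed: probe the provider again.
                self.state = State::HalfOpen;
            } else {
                return false;
            }
        }
        true
    }

    /// A successful call closes the breaker and clears the failure count.
    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    /// A failed call opens the breaker once the threshold is reached,
    /// or immediately if the failure happened during a half-open probe.
    fn record_failure(&mut self, now: Instant) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.failure_threshold {
            self.state = State::Open { since: now };
        }
    }
}
```

With a threshold of 5 and a 60 second timeout, the fifth consecutive failure would stop traffic to that provider for a minute, after which trial requests decide whether it closes again.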
Configuration
Configuration File (config.toml)
[server]
host = "0.0.0.0"
port = 3000
workers = 4
[providers.openai]
api_key = "your-openai-key"
base_url = "https://api.openai.com/v1"
timeout = 30
max_retries = 3
[providers.anthropic]
api_key = "your-anthropic-key"
base_url = "https://api.anthropic.com"
timeout = 30
max_retries = 3
[cache]
enabled = true
ttl = 3600
max_size = 10000
eviction_policy = "lru"
[rate_limiting]
enabled = true
requests_per_minute = 100
burst_size = 20
per_user = true
[authentication]
enabled = true
jwt_secret = "your-jwt-secret"
api_key_header = "X-API-Key"
[monitoring]
enabled = true
metrics_port = 9090
health_check_interval = 30
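The `requests_per_minute` and `burst_size` settings in `[rate_limiting]` read like the parameters of a token bucket. The sketch below shows that technique using only the standard library; it is an assumption about the general approach, not a description of the gateway's internals, and `TokenBucket` is a hypothetical name.

```rust
use std::time::Instant;

/// Hypothetical token bucket: `capacity` plays the role of burst_size,
/// and the refill rate is derived from requests_per_minute.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn new(requests_per_minute: u32, burst_size: u32, now: Instant) -> Self {
        Self {
            capacity: burst_size as f64,
            tokens: burst_size as f64, // start full so an initial burst is allowed
            refill_per_sec: requests_per_minute as f64 / 60.0,
            last_refill: now,
        }
    }

    /// Refill according to elapsed time, then try to spend one token.
    fn try_acquire(&mut self, now: Instant) -> bool {
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

For `per_user = true`, one plausible design keeps one bucket per user, for example in a `HashMap<String, TokenBucket>` keyed by API key.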
Environment Variables
export ULTRAFAST_GATEWAY_HOST=0.0.0.0
export ULTRAFAST_GATEWAY_PORT=3000
export ULTRAFAST_GATEWAY_OPENAI_API_KEY=sk-...
export ULTRAFAST_GATEWAY_ANTHROPIC_API_KEY=sk-ant-...
export ULTRAFAST_GATEWAY_JWT_SECRET=your-secret
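As an illustration of how such variables could be resolved at startup, the following sketch reads `ULTRAFAST_GATEWAY_HOST` and `ULTRAFAST_GATEWAY_PORT` with fallbacks. The `ServerSettings` struct, the function names, and the default values (taken from the sample config.toml) are assumptions made for this example; the crate's actual precedence rules are not documented here.

```rust
use std::env;

/// Hypothetical settings struct used only in this sketch.
#[derive(Debug, PartialEq)]
struct ServerSettings {
    host: String,
    port: u16,
}

/// Resolve settings through a lookup function so the logic can be exercised
/// without touching the real process environment.
fn settings_from<F>(lookup: F) -> Result<ServerSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Assumed defaults, mirroring the [server] section of the sample config.
    let host = lookup("ULTRAFAST_GATEWAY_HOST").unwrap_or_else(|| "0.0.0.0".to_string());
    let port = match lookup("ULTRAFAST_GATEWAY_PORT") {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid port {raw:?}: {e}"))?,
        None => 3000,
    };
    Ok(ServerSettings { host, port })
}

/// Convenience wrapper that reads the real environment.
fn settings_from_env() -> Result<ServerSettings, String> {
    settings_from(|key| env::var(key).ok())
}
```

Rejecting an unparsable port with an error, rather than silently falling back to the default, makes a mistyped variable visible at startup.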
API Endpoints
Chat Completions
# OpenAI-compatible endpoint
POST /v1/chat/completions
Content-Type: application/json
Authorization: Bearer your-api-key

{
  "model": "gpt-4",
  "messages": [
    {"role": "user", "content": "Hello, world!"}
  ],
  "max_tokens": 100
}
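For readers who want to see the request on the wire, here is a dependency-free sketch that assembles the HTTP/1.1 request text for the chat completions endpoint. The helper names `json_escape` and `chat_request` are hypothetical, and a real client would normally use an HTTP library and a JSON serializer instead of hand-built strings.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the raw request for POST /v1/chat/completions with one user message.
fn chat_request(host: &str, api_key: &str, model: &str, content: &str, max_tokens: u32) -> String {
    let body = format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}],\"max_tokens\":{}}}",
        json_escape(model),
        json_escape(content),
        max_tokens
    );
    format!(
        "POST /v1/chat/completions HTTP/1.1\r\nHost: {host}\r\nContent-Type: application/json\r\nAuthorization: Bearer {api_key}\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}
```

The resulting string could be written to a `std::net::TcpStream` connected to the gateway address, such as the `127.0.0.1:3000` used in the Quick Start.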
Text Completions
# OpenAI-compatible endpoint
POST /v1/completions
Content-Type: application/json
Authorization: Bearer your-api-key

{
  "model": "text-davinci-003",
  "prompt": "Hello, world!",
  "max_tokens": 100
}
Embeddings
POST /v1/embeddings
Content-Type: application/json
Authorization: Bearer your-api-key

{
  "model": "text-embedding-ada-002",
  "input": "Hello, world!"
}
Models List
GET /v1/models
Authorization: Bearer your-api-key
Health Check
GET /health
Metrics
GET /metrics
Plugins
Content Filtering
use ultrafast_gateway::plugins::ContentFilteringPlugin;

let plugin = ContentFilteringPlugin::new()
    .with_filters(vec![
        "hate_speech".to_string(),
        "violence".to_string(),
        "sexual_content".to_string(),
    ])
    .with_moderation_api("https://api.moderation.com");

gateway.add_plugin(plugin);
Cost Tracking
use ultrafast_gateway::plugins::CostTrackingPlugin;

let plugin = CostTrackingPlugin::new()
    .with_cost_limits(vec![
        ("daily", 100.0),
        ("monthly", 1000.0),
    ])
    .with_alert_threshold(0.8);

gateway.add_plugin(plugin);
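One plausible reading of `with_alert_threshold(0.8)` is that an alert fires once spending reaches 80% of a limit. The sketch below encodes that interpretation; `BudgetStatus` and `budget_status` are hypothetical names, and the plugin's real semantics may differ.

```rust
/// Hypothetical outcome of comparing spend against a cost limit.
#[derive(Debug, PartialEq)]
enum BudgetStatus {
    WithinBudget,
    Alert,
    Exceeded,
}

/// Classify current spend against a limit and an alert threshold,
/// where the threshold is a fraction of the limit (for example 0.8).
fn budget_status(spent: f64, limit: f64, alert_threshold: f64) -> BudgetStatus {
    if spent >= limit {
        BudgetStatus::Exceeded
    } else if spent >= limit * alert_threshold {
        BudgetStatus::Alert
    } else {
        BudgetStatus::WithinBudget
    }
}
```

Under this reading, with the limits above a daily spend of 85.0 would raise an alert and 120.0 would count as exceeded.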
Logging
// LogFormat and LogOutput are assumed to live alongside LoggingPlugin.
use ultrafast_gateway::plugins::{LogFormat, LogOutput, LoggingPlugin};

let plugin = LoggingPlugin::new()
    .with_level(log::Level::Info)
    .with_format(LogFormat::JSON)
    .with_output(LogOutput::File("gateway.log".into()));

gateway.add_plugin(plugin);
Monitoring & Analytics
Real-time Metrics
- Request Count: Total requests per provider
- Response Time: Average, P95, P99 response times
- Error Rates: Success/failure rates per provider
- Cache Hit Rate: Cache effectiveness metrics
- Rate Limiting: Throttled request counts
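To clarify what the average, P95, and P99 figures mean, here is a small sketch that computes them from a batch of latency samples with the nearest-rank method. This is one common definition, chosen for illustration; the gateway may compute its percentiles differently, for example with streaming histograms.

```rust
/// Nearest-rank percentile: the smallest sample such that at least `p` percent
/// of all samples are less than or equal to it.
/// Returns None for empty input or a `p` outside 0..=100.
fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Rank is 1-based; clamp to 1 so that p = 0 maps to the minimum.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

/// Arithmetic mean of the samples, or None if there are none.
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        None
    } else {
        Some(samples.iter().sum::<f64>() / samples.len() as f64)
    }
}
```

Tail percentiles matter because a healthy average can hide a small share of very slow requests.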
Dashboard
Access the built-in dashboard at /dashboard for:
- Real-time metrics visualization
- Provider health status
- Cache performance analytics
- Rate limiting statistics
- Error rate monitoring
Prometheus Integration
# prometheus.yml
scrape_configs:
  - job_name: 'ultrafast-gateway'
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: '/metrics'
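Prometheus scrapes a plain-text exposition format from the metrics path. The sketch below renders a per-provider counter in that format; the metric name `gateway_requests_total` and the `provider` label are made up for illustration and are not confirmed names exported by the gateway.

```rust
use std::collections::BTreeMap;

/// Render one counter family in the Prometheus text exposition format.
/// Provider names are assumed not to contain quotes, backslashes, or newlines,
/// which would otherwise need escaping inside the label value.
fn render_counter(name: &str, help: &str, per_provider: &BTreeMap<String, u64>) -> String {
    let mut out = format!("# HELP {name} {help}\n# TYPE {name} counter\n");
    // BTreeMap iterates in key order, so the output is stable between scrapes.
    for (provider, value) in per_provider {
        out.push_str(&format!("{name}{{provider=\"{provider}\"}} {value}\n"));
    }
    out
}
```

Each sample line has the shape `name{label="value"} number`, preceded by optional `# HELP` and `# TYPE` comments for the metric family.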
Performance Tuning
Optimization Tips
use std::time::Duration;

// Enable connection pooling
let config = GatewayConfig::default()
    .with_connection_pool_size(100)
    .with_keep_alive_timeout(Duration::from_secs(60));

// Optimize cache settings
let cache_config = CacheConfig {
    ttl: Duration::from_secs(3600),
    max_size: 50000,
    eviction_policy: EvictionPolicy::LRU,
    compression: true,
};

// Configure circuit breakers
let circuit_breaker = CircuitBreakerConfig {
    failure_threshold: 3,
    recovery_timeout: Duration::from_secs(30),
    request_timeout: Duration::from_secs(10),
    half_open_max_calls: 5,
};
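To show how `ttl`, `max_size`, and LRU eviction fit together, here is a compact standard-library sketch of a cache with those three properties. `TtlLruCache` is a hypothetical illustration rather than the crate's cache: it assumes `max_size` is at least 1, finds the eviction victim with a linear scan, and leaves out compression.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry {
    value: String,
    inserted: Instant,
    last_used: u64, // logical clock value of the most recent access
}

/// Hypothetical cache combining a time-to-live with least-recently-used eviction.
struct TtlLruCache {
    ttl: Duration,
    max_size: usize,
    tick: u64,
    entries: HashMap<String, Entry>,
}

impl TtlLruCache {
    fn new(ttl: Duration, max_size: usize) -> Self {
        Self { ttl, max_size, tick: 0, entries: HashMap::new() }
    }

    /// Return the cached value if present and not older than `ttl`.
    fn get(&mut self, key: &str, now: Instant) -> Option<String> {
        self.tick += 1;
        let expired = match self.entries.get(key) {
            Some(entry) => now.duration_since(entry.inserted) >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(key);
            return None;
        }
        let tick = self.tick;
        let entry = self.entries.get_mut(key)?;
        entry.last_used = tick; // mark as most recently used
        Some(entry.value.clone())
    }

    /// Insert a value, evicting the least recently used entry when full.
    fn put(&mut self, key: String, value: String, now: Instant) {
        self.tick += 1;
        if !self.entries.contains_key(&key) && self.entries.len() >= self.max_size {
            let victim = self
                .entries
                .iter()
                .min_by_key(|(_, entry)| entry.last_used)
                .map(|(k, _)| k.clone());
            if let Some(k) = victim {
                self.entries.remove(&k);
            }
        }
        self.entries.insert(key, Entry { value, inserted: now, last_used: self.tick });
    }
}
```

A larger `max_size` raises the hit rate at the cost of memory, while a shorter `ttl` bounds how stale a cached response can be.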
Benchmark Results
- Throughput: 10,000+ requests/second
- Latency: P99 < 50ms
- Memory: < 100MB baseline
- CPU: Efficient async processing
Docker Deployment
Quick Start
docker run -p 3000:3000 \
  -v /path/to/config.toml:/app/config.toml \
  ghcr.io/techgopal/ultrafast-ai-gateway:latest
Docker Compose
version: '3.8'
services:
  ultrafast-gateway:
    image: ghcr.io/techgopal/ultrafast-ai-gateway:latest
    ports:
      - "3000:3000"
      - "9090:9090"
    volumes:
      - ./config.toml:/app/config.toml
      - ./logs:/app/logs
    environment:
      - RUST_LOG=info
      - RUST_BACKTRACE=1
    restart: unless-stopped
Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ultrafast-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ultrafast-gateway
  template:
    metadata:
      labels:
        app: ultrafast-gateway
    spec:
      containers:
        - name: gateway
          image: ghcr.io/techgopal/ultrafast-ai-gateway:latest
          ports:
            - containerPort: 3000
            - containerPort: 9090
          env:
            - name: RUST_LOG
              value: "info"
          volumeMounts:
            - name: config
              mountPath: /app/config.toml
              subPath: config.toml
      volumes:
        - name: config
          configMap:
            name: gateway-config
Security
Authentication Methods
- API Keys: Virtual API key management
- JWT Tokens: Stateless authentication
- OAuth 2.0: Third-party authentication
- Rate Limiting: Per-user and per-provider limits
Security Features
- Request Validation: Input sanitization
- Content Filtering: Moderation and filtering
- HTTPS Only: Secure communication
- CORS Configuration: Cross-origin resource sharing
- Audit Logging: Complete request tracking
Testing
Unit Tests
cargo test
Integration Tests
cargo test --test integration
Load Testing
cargo test --test load_testing
Benchmark Tests
cargo bench
Examples
Basic Gateway
// examples/basic_gateway.rs
use ultrafast_gateway::{Gateway, GatewayConfig, ProviderConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = GatewayConfig::default()
        .with_provider(ProviderConfig::openai("your-key"))
        .with_cache_enabled(true);

    let gateway = Gateway::new(config).await?;
    gateway.serve("127.0.0.1:3000").await?;
    Ok(())
}
Advanced Routing
// examples/advanced_routing.rs
use ultrafast_gateway::{
    Gateway, GatewayConfig, ProviderConfig,
    RoutingStrategy, LoadBalancingStrategy
};

let config = GatewayConfig::default()
    .with_provider(ProviderConfig::openai("key1"))
    .with_provider(ProviderConfig::anthropic("key2"))
    .with_routing_strategy(RoutingStrategy::LoadBalanced(
        LoadBalancingStrategy::RoundRobin
    ));
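Round-robin selection itself is simple enough to sketch with the standard library. The `RoundRobin` type below is a hypothetical illustration of the strategy named above, not the crate's `LoadBalancingStrategy::RoundRobin` implementation; it uses an atomic counter so that it can be shared across tasks without a lock.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin selector over provider names.
struct RoundRobin {
    providers: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    fn new(providers: Vec<String>) -> Self {
        Self { providers, next: AtomicUsize::new(0) }
    }

    /// Pick the next provider in rotation, or None if no providers are configured.
    fn pick(&self) -> Option<&str> {
        if self.providers.is_empty() {
            return None;
        }
        // fetch_add returns the previous value, so the first pick is index 0.
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.providers.len();
        Some(self.providers[i].as_str())
    }
}
```

With two providers, successive requests would alternate between them, which spreads load evenly but ignores differences in latency or cost.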
Custom Middleware
// examples/custom_middleware.rs
use async_trait::async_trait;
use ultrafast_gateway::{
    Gateway, GatewayConfig, Middleware, Request, Response
};

struct CustomMiddleware;

#[async_trait]
impl Middleware for CustomMiddleware {
    async fn process(&self, request: Request) -> Result<Response, Box<dyn std::error::Error>> {
        // Custom processing logic
        Ok(request.into())
    }
}

// `config` is a GatewayConfig built as in the Basic Gateway example
let gateway = Gateway::new(config)
    .with_middleware(CustomMiddleware)
    .await?;
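The general idea of a middleware chain can be shown without the crate or an async runtime. The sketch below is a simplified, synchronous model with hypothetical `Request`, `Middleware`, and `run_chain` definitions that shadow the crate's names on purpose; unlike the trait above, each step here returns the (possibly modified) request so that steps can be composed.

```rust
/// Simplified request used only in this sketch.
#[derive(Debug)]
struct Request {
    path: String,
    headers: Vec<(String, String)>,
}

/// Each middleware either passes the request on or rejects it with a message.
trait Middleware {
    fn process(&self, request: Request) -> Result<Request, String>;
}

/// Rejects requests that carry no X-API-Key header.
struct RequireApiKey;

impl Middleware for RequireApiKey {
    fn process(&self, request: Request) -> Result<Request, String> {
        let has_key = request
            .headers
            .iter()
            .any(|(name, _)| name.eq_ignore_ascii_case("x-api-key"));
        if has_key {
            Ok(request)
        } else {
            Err(format!("missing X-API-Key header for {}", request.path))
        }
    }
}

/// Adds a marker header so later stages can tell the request passed through.
struct TagRequest;

impl Middleware for TagRequest {
    fn process(&self, mut request: Request) -> Result<Request, String> {
        request.headers.push(("X-Gateway".to_string(), "ultrafast".to_string()));
        Ok(request)
    }
}

/// Run the middlewares in order, stopping at the first rejection.
fn run_chain(chain: &[Box<dyn Middleware>], mut request: Request) -> Result<Request, String> {
    for middleware in chain {
        request = middleware.process(request)?;
    }
    Ok(request)
}
```

Ordering matters in such a chain: placing the authentication check first means rejected requests never reach the later, potentially more expensive steps.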
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
git clone https://github.com/techgopal/ultrafast-ai-gateway.git
cd ultrafast-ai-gateway
cargo build
cargo test
Code Style
- Follow Rust formatting guidelines
- Run cargo fmt before committing
- Ensure all tests pass with cargo test
- Add tests for new features
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
Documentation
Community
Commercial Support
For enterprise support and consulting, contact:
- Email: techgopal2@gmail.com
- GitHub: @techgopal
Acknowledgments
- Rust Community: For the amazing language and ecosystem
- Tokio Team: For the async runtime
- OpenAI, Anthropic, Google: For their AI APIs
- Contributors: All who have helped improve this project
Roadmap
v0.2.0 (Q2 2024)
- GraphQL API support
- WebSocket streaming
- Advanced analytics dashboard
- Plugin marketplace
v0.3.0 (Q3 2024)
- Multi-region deployment
- Advanced caching strategies
- Machine learning routing
- Enterprise SSO integration
v1.0.0 (Q4 2024)
- Production-ready stability
- Comprehensive documentation
- Performance benchmarks
- Enterprise features
Built with ❤️ in Rust by the Ultrafast Gateway Team