19 releases
| Version | Released |
|---|---|
| 0.3.16 (new) | Mar 1, 2026 |
| 0.3.15 | Feb 22, 2026 |
| 0.3.13 | Jan 25, 2026 |
| 0.3.7 | Dec 16, 2025 |
| 0.1.0 | Nov 19, 2025 |
#528 in Database interfaces
4,762 downloads per month
Used in 27 crates (12 directly)
370KB, 5.5K SLoC
GPU-First Embedded Analytics with SIMD Fallback
GPU-first embedded analytics database with graceful degradation: GPU → SIMD → Scalar
Features
- Cost-based dispatch: the GPU is used only when estimated compute time exceeds 5x the transfer time
- Morsel-based paging: Out-of-core execution (128MB chunks)
- JIT WGSL compiler: Kernel fusion for single-pass execution
- GPU kernels: SUM, MIN, MAX, COUNT, AVG, fused filter+sum
- SIMD fallback: Trueno integration (AVX-512/AVX2/SSE2)
- SQL interface: SELECT, WHERE, aggregations, ORDER BY, LIMIT
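The dispatch rule and the GPU → SIMD → Scalar fallback order described above can be sketched as a small decision function. This is only an illustration: `Backend`, `CostEstimate`, and `choose_backend` are hypothetical names, and how the crate actually estimates compute and transfer time is not described in this README.

```rust
/// Execution tiers, in the order the README lists them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Gpu,
    Simd,
    Scalar,
}

/// Hypothetical cost inputs; the crate's real estimator is not shown here.
struct CostEstimate {
    /// Estimated compute time, in microseconds.
    compute_us: f64,
    /// Estimated host-to-device transfer time, in microseconds.
    transfer_us: f64,
}

/// Pick the GPU only when it is available and compute > 5x transfer;
/// otherwise fall back to SIMD, and finally to scalar code.
fn choose_backend(cost: &CostEstimate, gpu_available: bool, simd_available: bool) -> Backend {
    const GPU_THRESHOLD: f64 = 5.0; // the "5x" from the feature list
    if gpu_available && cost.compute_us > GPU_THRESHOLD * cost.transfer_us {
        Backend::Gpu
    } else if simd_available {
        Backend::Simd
    } else {
        Backend::Scalar
    }
}

fn main() {
    let heavy = CostEstimate { compute_us: 6000.0, transfer_us: 1000.0 };
    let light = CostEstimate { compute_us: 2000.0, transfer_us: 1000.0 };
    println!("{:?}", choose_backend(&heavy, true, true));
    println!("{:?}", choose_backend(&light, true, true));
    println!("{:?}", choose_backend(&heavy, false, false));
}
```

With these made-up numbers, only the `heavy` workload clears the 5x threshold, so the `light` one stays on the CPU even though a GPU is present.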
Installation
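[dependencies]
trueno-db = "0.3"

# Optional: GPU acceleration. Use this line instead of the one above;
# a [dependencies] table cannot list the same crate twice.
# trueno-db = { version = "0.3", features = ["gpu"] }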
Quick Start
use trueno_db::query::{QueryEngine, QueryExecutor};
use trueno_db::storage::StorageEngine;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load a Parquet file into columnar storage
    let storage = StorageEngine::load_parquet("data/events.parquet")?;
    let engine = QueryEngine::new();
    let executor = QueryExecutor::new();

    // Parse the SQL text into a query plan
    let plan = engine.parse(
        "SELECT COUNT(*), SUM(value), AVG(value) FROM events WHERE value > 100"
    )?;

    // Execute the plan against the loaded data
    let result = executor.execute(&plan, &storage)?;
    Ok(())
}
Performance
SIMD Aggregation (1M rows, AMD Threadripper 7960X):
| Operation | SIMD | Scalar | Speedup |
|---|---|---|---|
| SUM | 228µs | 634µs | 2.78x |
| MIN | 228µs | 1,048µs | 4.60x |
| MAX | 228µs | 257µs | 1.13x |
| AVG | 228µs | 634µs | 2.78x |
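The Speedup column is the scalar time divided by the SIMD time. A small check that reproduces the figures from the table (the helper name `speedup` is only for illustration):

```rust
/// Speedup of SIMD over scalar: how many times faster the SIMD run was.
fn speedup(scalar_us: f64, simd_us: f64) -> f64 {
    scalar_us / simd_us
}

fn main() {
    // (operation, SIMD time in µs, scalar time in µs), copied from the table.
    let rows = [
        ("SUM", 228.0, 634.0),
        ("MIN", 228.0, 1048.0),
        ("MAX", 228.0, 257.0),
        ("AVG", 228.0, 634.0),
    ];
    for (op, simd_us, scalar_us) in rows {
        println!("{op}: {:.2}x", speedup(scalar_us, simd_us));
    }
}
```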
Architecture
┌─────────────────────────────────────────────┐
│                SQL Interface                │
│           (QueryEngine / Parser)            │
├─────────────────────────────────────────────┤
│               Query Executor                │
│    (cost-based dispatch, morsel paging)     │
├──────────┬──────────┬───────────────────────┤
│   GPU    │   SIMD   │        Scalar         │
│  (WGSL)  │ (Trueno) │      (fallback)       │
├──────────┴──────────┴───────────────────────┤
│               Storage Engine                │
│    (columnar, Parquet, morsel-based I/O)    │
└─────────────────────────────────────────────┘
- Query Layer: The SQL parser produces logical plans; a cost-based optimizer selects the GPU, SIMD, or scalar backend
- Execution Layer: Morsel-based paging (128MB chunks) enables out-of-core processing
- GPU Backend: JIT-compiled WGSL kernels with fused filter+aggregate passes
- SIMD Backend: Trueno-powered AVX-512/AVX2/SSE2 vectorized aggregations
- Storage Layer: Columnar storage with Parquet I/O and late materialization
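To make the morsel idea in the execution layer concrete, the sketch below aggregates a column one fixed-size chunk at a time, so only one morsel needs to be resident at once. It is an assumption-laden illustration, not the crate's code: `sum_in_morsels` is a hypothetical helper, an in-memory slice stands in for data paged from disk, and 128MB is read here as 128 × 1024 × 1024 bytes.

```rust
/// Morsel size from the README, read here as 128 MiB (an assumption).
const MORSEL_BYTES: usize = 128 * 1024 * 1024;

/// Sum an i32 column morsel by morsel. Each chunk is reduced to a partial
/// sum, and the partial sums are combined at the end.
fn sum_in_morsels(column: &[i32], morsel_bytes: usize) -> i64 {
    // Number of rows that fit in one morsel; at least one row per morsel.
    let rows_per_morsel = (morsel_bytes / std::mem::size_of::<i32>()).max(1);
    column
        .chunks(rows_per_morsel)
        .map(|morsel| morsel.iter().map(|&v| v as i64).sum::<i64>())
        .sum()
}

fn main() {
    let column: Vec<i32> = (1..=1_000).collect();
    println!("{}", sum_in_morsels(&column, MORSEL_BYTES));
}
```

Because SUM is associative, the per-morsel partial results can be merged in any order, which is what lets a paged executor process data larger than memory.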
API Reference
StorageEngine
Load and manage columnar data:
// Either load a Parquet file from disk...
let storage = StorageEngine::load_parquet("data.parquet")?;
// ...or build storage from batches already in memory
let storage = StorageEngine::from_batches(batches);
QueryEngine / QueryExecutor
Parse and execute SQL queries:
let engine = QueryEngine::new();
let plan = engine.parse("SELECT SUM(value) FROM data WHERE id > 10")?;
let result = QueryExecutor::new().execute(&plan, &storage)?;
GpuEngine
Direct GPU kernel access (requires gpu feature):
let gpu = GpuEngine::new().await?;
let sum = gpu.sum_i32(&data).await?;
let filtered = gpu.fused_filter_sum(&data, threshold, "gt").await?;
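As a CPU-side reference for what a fused filter+sum computes, the sketch below contrasts a two-pass version with a single-pass one. It assumes that the `"gt"` operator means "strictly greater than the threshold" and that the sum is accumulated in 64 bits; both function names are hypothetical and neither is the crate's GPU kernel.

```rust
/// Two passes: materialize the filtered values, then sum them.
fn filter_then_sum(data: &[i32], threshold: i32) -> i64 {
    let kept: Vec<i32> = data.iter().copied().filter(|&v| v > threshold).collect();
    kept.iter().map(|&v| v as i64).sum()
}

/// Fused: test and accumulate in one pass, with no intermediate buffer.
fn fused_filter_sum_gt(data: &[i32], threshold: i32) -> i64 {
    let mut acc = 0i64;
    for &v in data {
        if v > threshold {
            acc += v as i64;
        }
    }
    acc
}

fn main() {
    let data = [50, 150, 100, 200];
    println!("two-pass: {}", filter_then_sum(&data, 100));
    println!("fused:    {}", fused_filter_sum_gt(&data, 100));
}
```

Both functions return the same value; the fused form simply avoids writing and re-reading the filtered intermediate, which is the motivation for kernel fusion.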
TopK
Heap-based top-K selection for ORDER BY ... LIMIT queries:
let top = top_k_descending(&batch, column_idx, k)?;
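The general technique behind heap-based top-K can be sketched with the standard library alone: keep a min-heap of at most `k` elements and evict the smallest whenever it overflows. The function below works on a plain slice of `i64` and is a hypothetical stand-in; the crate's `top_k_descending` operates on a batch and column index, and its internals are not shown in this README.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Return the k largest values in descending order using a bounded min-heap.
/// Runs in O(n log k) time and O(k) extra space.
fn top_k_desc(values: &[i64], k: usize) -> Vec<i64> {
    // BinaryHeap is a max-heap; wrapping in Reverse makes it a min-heap.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::with_capacity(k + 1);
    for &v in values {
        heap.push(Reverse(v));
        if heap.len() > k {
            heap.pop(); // drop the smallest of the k + 1 candidates
        }
    }
    // Ascending order of Reverse<i64> is descending order of the values.
    heap.into_sorted_vec().into_iter().map(|Reverse(v)| v).collect()
}

fn main() {
    let scores = [5, 1, 9, 3, 7];
    println!("{:?}", top_k_desc(&scores, 3));
}
```

Compared with sorting the whole column, the bounded heap never holds more than `k + 1` values, which matters when `LIMIT` is small relative to the row count.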
Examples
cargo run --example sql_query_interface --release
cargo run --example benchmark_shootout --release
cargo run --example gaming_leaderboards --release
cargo run --example gpu_aggregations --features gpu --release
Testing
cargo test --lib # Unit tests (66 tests)
cargo test # All tests including integration
cargo test --doc # Documentation tests
make quality-gate # Full quality gate: lint + test + coverage
Property-based tests cover storage invariants and top-K correctness via proptest.
Development
make build # Build
make test # Run tests
make quality-gate # lint + test + coverage
make bench-comparison
Contributing
Contributions are welcome! Please see the CONTRIBUTING.md guide for details.
MSRV
Minimum Supported Rust Version: 1.75
License
MIT
Dependencies: ~32–75MB, ~1M SLoC