#gpu #analytics-database #wasm #analytics

bin+lib trueno-db

GPU-first embedded analytics database with SIMD fallback and SQL query interface

19 releases

new 0.3.16 Mar 1, 2026
0.3.15 Feb 22, 2026
0.3.13 Jan 25, 2026
0.3.7 Dec 16, 2025
0.1.0 Nov 19, 2025

#528 in Database interfaces

Download history 298/week @ 2025-11-16 1138/week @ 2025-11-23 1038/week @ 2025-11-30 1129/week @ 2025-12-07 857/week @ 2025-12-14 730/week @ 2025-12-21 622/week @ 2025-12-28 1934/week @ 2026-01-04 3112/week @ 2026-01-11 1648/week @ 2026-01-18 1511/week @ 2026-01-25 1147/week @ 2026-02-01 2173/week @ 2026-02-08 350/week @ 2026-02-15 1043/week @ 2026-02-22

4,762 downloads per month
Used in 27 crates (12 directly)

MIT license

370KB
5.5K SLoC

trueno-db

Table of Contents

GPU-First Embedded Analytics with SIMD Fallback

CI Crates.io Documentation License


GPU-first embedded analytics database with graceful degradation: GPU → SIMD → Scalar

Features

  • Cost-based dispatch: GPU only when compute > 5x transfer time
  • Morsel-based paging: Out-of-core execution (128MB chunks)
  • JIT WGSL compiler: Kernel fusion for single-pass execution
  • GPU kernels: SUM, MIN, MAX, COUNT, AVG, fused filter+sum
  • SIMD fallback: Trueno integration (AVX-512/AVX2/SSE2)
  • SQL interface: SELECT, WHERE, aggregations, ORDER BY, LIMIT

Installation

[dependencies]
trueno-db = "0.3"

# Optional: GPU acceleration
trueno-db = { version = "0.3", features = ["gpu"] }

Quick Start

use trueno_db::query::{QueryEngine, QueryExecutor};
use trueno_db::storage::StorageEngine;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let storage = StorageEngine::load_parquet("data/events.parquet")?;
    let engine = QueryEngine::new();
    let executor = QueryExecutor::new();

    let plan = engine.parse(
        "SELECT COUNT(*), SUM(value), AVG(value) FROM events WHERE value > 100"
    )?;
    let result = executor.execute(&plan, &storage)?;
    Ok(())
}

Performance

SIMD Aggregation (1M rows, AMD Threadripper 7960X):

Operation SIMD Scalar Speedup
SUM 228µs 634µs 2.78x
MIN 228µs 1,048µs 4.60x
MAX 228µs 257µs 1.13x
AVG 228µs 634µs 2.78x

Architecture

┌─────────────────────────────────────────────┐
│              SQL Interface                   │
│         (QueryEngine / Parser)               │
├─────────────────────────────────────────────┤
│           Query Executor                     │
│    (cost-based dispatch, morsel paging)       │
├──────────┬──────────┬───────────────────────┤
│  GPUSIMD    │  Scalar               │
│  (WGSL)(Trueno)(fallback)           │
├──────────┴──────────┴───────────────────────┤
│         Storage Engine                       │
│   (columnar, Parquet, morsel-based I/O)      │
└─────────────────────────────────────────────┘
  • Query Layer: SQL parser produces logical plans, cost-based optimizer selects GPU/SIMD/Scalar backend
  • Execution Layer: Morsel-based paging (128MB chunks) enables out-of-core processing
  • GPU Backend: JIT-compiled WGSL kernels with fused filter+aggregate passes
  • SIMD Backend: Trueno-powered AVX-512/AVX2/SSE2 vectorized aggregations
  • Storage Layer: Columnar storage with Parquet I/O and late materialization

API Reference

StorageEngine

Load and manage columnar data:

let storage = StorageEngine::load_parquet("data.parquet")?;
let storage = StorageEngine::from_batches(batches);

QueryEngine / QueryExecutor

Parse and execute SQL queries:

let engine = QueryEngine::new();
let plan = engine.parse("SELECT SUM(value) FROM data WHERE id > 10")?;
let result = QueryExecutor::new().execute(&plan, &storage)?;

GpuEngine

Direct GPU kernel access (requires gpu feature):

let gpu = GpuEngine::new().await?;
let sum = gpu.sum_i32(&data).await?;
let filtered = gpu.fused_filter_sum(&data, threshold, "gt").await?;

TopK

Heap-based top-K selection for ORDER BY ... LIMIT queries:

let top = top_k_descending(&batch, column_idx, k)?;

Examples

cargo run --example sql_query_interface --release
cargo run --example benchmark_shootout --release
cargo run --example gaming_leaderboards --release
cargo run --example gpu_aggregations --features gpu --release

Testing

cargo test --lib          # Unit tests (66 tests)
cargo test                # All tests including integration
cargo test --doc          # Documentation tests
make quality-gate         # Full quality gate: lint + test + coverage

Property-based tests cover storage invariants and top-K correctness via proptest.

Development

make build           # Build
make test            # Run tests
make quality-gate    # lint + test + coverage
make bench-comparison

Contributing

Contributions are welcome! Please see the CONTRIBUTING.md guide for details.

MSRV

Minimum Supported Rust Version: 1.75

License

MIT

Dependencies

~32–75MB
~1M SLoC