2 releases

0.3.1 Feb 15, 2026
0.3.0 Jan 8, 2026

#768 in Compression


Used in trueno-zram-cli

MIT/Apache

480KB
11K SLoC

ML-driven compression algorithm selection for trueno-zram.

This crate provides entropy-based analysis and machine learning models to select the optimal compression algorithm for each memory page.

GPU Routing

The BatchClassifier implements intelligent routing of compression workloads to scalar, SIMD, or GPU backends based on the 5x PCIe rule.


trueno-zram

trueno-zram

SIMD-accelerated LZ4/ZSTD compression with ublk block device for transparent memory compression.

Table of Contents

Crates.io Documentation License Coverage

19x faster than kernel zram on AVX-512 systems


Features

  • 19x faster decompression than kernel zram via AVX-512 ZSTD
  • Adaptive compression with entropy-based algorithm selection (LZ4/ZSTD)
  • Userspace block device via Linux ublk for zero-copy I/O
  • SIMD-accelerated compression on x86_64 (AVX2/AVX-512) and AArch64 (NEON)

Why trueno-zram?

Kernel zram uses scalar LZ4 limited to 1.3 GB/s. trueno-zram uses AVX-512 ZSTD achieving 25 GB/s decompression. Same workload, 19x faster.

┌─────────────────────────────────────────────────────────────┐
│  Decompression Throughput (same-fill pages)                 │
├─────────────────────────────────────────────────────────────┤
│  trueno-zram ZSTD  ████████████████████████████  25.3 GB/s  │
│  Kernel zram LZ4   █░                             1.3 GB/s  │
└─────────────────────────────────────────────────────────────┘

Installation

# Install
cargo install trueno-ublk

# Create 8GB compressed swap (requires root)
sudo trueno-ublk create --size 8G --algorithm zstd1 --high-perf

# Enable as swap
sudo mkswap /dev/ublkb0 && sudo swapon -p 150 /dev/ublkb0

# Verify
swapon --show

Problems It Solves

For Power Users (Workstations, Servers)

Problem Without trueno-zram With trueno-zram
ML Training Memory Spikes OOM killer terminates training 8GB+ compressed swap absorbs spikes at 25 GB/s
Parallel Compilation cargo build with 48 threads → OOM Overflow handled, builds complete
Unused AVX-512 Kernel ignores your CPU's SIMD Full AVX-512 utilization → 19x speedup
GPU Memory Pressure Slow disk swap stalls CUDA Fast RAM swap keeps GPU fed
Multi-Project Development System freezes, requires reboot Smooth operation under memory pressure

For Everyone

Problem Without With trueno-zram
Cloud VMs (4-8GB RAM) Pay 2x for larger instance Effective 2x RAM via compression
Laptop SSD Lifespan 10-100GB swap writes/day 90% stays in RAM, SSD lasts 5x longer
Raspberry Pi / Edge SD card swap at 10 MB/s Compressed RAM at GB/s speeds
Database Latency Query timeouts when swapping <1ms swap latency, queries complete
CI/CD Build Servers OOM kills parallel builds Builds succeed without RAM upgrades

Performance

Benchmarked on AMD Threadripper 7960X with AVX-512:

Metric trueno-zram Kernel zram Speedup
ZSTD Decompress 25.3 GB/s 1.3 GB/s 19x
ZSTD Compress 15.5 GB/s 1.3 GB/s 12x
Same-fill Read 7.9 GB/s 171 GB/s* -
Random 4K IOPS 666K ~1.5M 0.4x

*Kernel same-fill uses direct memset, not compression.

Examples

# Algorithm comparison benchmark
cargo run --release --example zstd_vs_lz4

# Tiered storage architecture demo
cargo run --release --example tiered_storage

# Visualization integration demo
cargo run --release --example visualization_demo

Example Output: ZSTD vs LZ4

W0-ZEROS (same-fill pages)
Algorithm            Compress     Decompress    Ratio
--------------------------------------------------------
  LZ4              5.78 GiB/s      0.78 GiB/s   157.5x
  ZSTD-1           3.29 GiB/s     25.28 GiB/s   372.4x

  ZSTD-1 speedup: 32x faster decompression

Usage

use trueno_zram_core::{CompressorBuilder, Algorithm, PageCompressor, PAGE_SIZE};

// Create compressor with ZSTD level 1 (fastest)
let compressor = CompressorBuilder::new()
    .algorithm(Algorithm::Zstd1)
    .build()?;

// Compress a 4KB page
let page = [0u8; PAGE_SIZE];
let compressed = compressor.compress(&page)?;
let decompressed = compressor.decompress(&compressed)?;

assert_eq!(page, decompressed);

API Reference

trueno-ublk create    # Create compressed RAM device
trueno-ublk list      # List devices
trueno-ublk stat      # Show statistics
trueno-ublk top       # Interactive TUI dashboard
trueno-ublk benchmark # Run benchmarks with JSON/HTML output
trueno-ublk entropy   # Analyze file entropy

High-Performance Modes

# High-perf: polling mode, larger batches
sudo trueno-ublk create --size 8G --algorithm zstd1 --high-perf

# Max-perf: highest throughput (high CPU usage)
sudo trueno-ublk create --size 8G --algorithm zstd1 --max-perf --queues 4

# Tiered: entropy-based routing to kernel zram
sudo trueno-ublk create --size 8G --backend tiered --entropy-routing

Architecture

┌─────────────────────────────────────────────┐
│            trueno-ublk CLI                   │
│  (create, list, stat, top, benchmark)        │
├─────────────────────────────────────────────┤
│         Userspace Block Device               │
│  (io_uring, ublk protocol, queue mgmt)       │
├──────────┬──────────────────────────────────┤
│ Adaptive │  trueno-zram-core                 │
│ (entropy │  (SIMD compression engine)        │
│  routing)│                                   │
├──────────┴──────────┬───────────────────────┤
│  ZSTD (AVX-512)LZ4 (SIMD)           │
│  25 GB/s decompress │  5.8 GB/s compress     │
└─────────────────────┴───────────────────────┘
  • Core: SIMD-accelerated page compression/decompression (4KB pages)
  • Adaptive: Entropy-based algorithm selection routes pages to optimal compressor
  • Ublk: Linux userspace block device using io_uring for zero-copy I/O
  • Tiered: Entropy routing to kernel zram for incompressible data

Testing

cargo test -p trueno-zram-core    # Core compression tests
cargo test                        # All workspace tests
cargo bench -p trueno-zram-core   # Compression benchmarks
make quality-gate                 # Full quality gate: lint + test + coverage

Property-based tests verify compression roundtrip invariants across page patterns (zeros, random, mixed).

Project Structure

Crate Description
trueno-zram-core SIMD compression library (LZ4, ZSTD, AVX-512)
trueno-zram-adaptive Entropy-based algorithm selection
trueno-ublk Userspace block device daemon

Requirements

  • Linux Kernel >= 6.0 (ublk support)
  • x86_64 with AVX2/AVX-512 or AArch64 with NEON
  • Rust >= 1.82.0

Benchmarking

# Compression library benchmark
cargo bench -p trueno-zram-core

# Device I/O benchmark (requires root)
sudo fio --name=test --filename=/dev/ublkb0 --rw=randread \
    --bs=4k --numjobs=4 --iodepth=32 --runtime=10

# Generate HTML benchmark report
trueno-ublk benchmark --size 4G --format html -o report.html

Documentation

Contributing

Contributions are welcome! Please see the CONTRIBUTING.md guide for details.

Areas of interest:

  • Performance optimizations (SIMD, compression algorithms)
  • New compression algorithm support
  • Documentation improvements
  • Bug fixes

MSRV

Minimum Supported Rust Version: 1.82

License

MIT OR Apache-2.0


Part of the PAIML ecosystem: trueno | aprender | bashrs

25 GB/s decompression. 19x faster than kernel zram. Use it today.

Dependencies

~130–510KB
~12K SLoC