
block-db

Local, multi-threaded, durable byte DB.

Table of Contents

  • Installation
  • Overview
  • Example
  • Usage Notes
  • Roadmap
  • Contributing
  • License
  • Project status

Installation

cargo add block-db

Overview

A BlockDB manages a write-ahead log (WAL) and a collection of DataFiles. Each DataFile maintains its own WAL and a binary file that stores DataBlocks — each composed of one or more chunks.

  • The maximum size of a DataFile is configurable via max_file_size.
  • The size of each chunk within a DataBlock is configurable via chunk_size.

Each DataBlock is associated with a BlockKey, allowing BlockDB to function as a persistent, atomic key-value store for arbitrary byte data.
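Because storage is allocated in whole chunks, a value's on-disk footprint rounds up to the next chunk boundary (the freed-byte counts in the example below reflect this). The helper here is purely illustrative and not part of the crate's API:

fn chunks_needed(value_len: usize, chunk_size: usize) -> usize {
    // Round up: even a 5-byte value occupies one full chunk.
    value_len.div_ceil(chunk_size)
}

fn main() {
    // With the default chunk_size of 4_096 bytes:
    assert_eq!(chunks_needed(5, 4_096), 1); // b"Hello" -> 4_096 bytes on disk
    assert_eq!(chunks_needed(10_000, 4_096), 3); // 10_000 bytes -> 12_288 on disk
}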

BlockDB is designed to be durable, minimal, and predictable. Unlike many modern storage engines, it avoids hidden memory buffers, background threads, or delayed writes — making it ideal for systems where correctness and resource footprint matter more than raw throughput.

A key benefit of this approach is an almost nonexistent memory footprint, along with an ergonomic and reliable foundation for building higher-level DBMS layers.

This project is still in its early stages. Since it is being developed alongside other database systems, major optimizations and refinements will continue to be made over time.

Example

example.rs

use block_db::{batch::BatchResult, BlockDB};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut block_db = BlockDB::open("./db", None).await?;

    // Write bytes
    let block_key_one = block_db.write(b"Hello").await?;

    // Free bytes
    let freed_bytes = block_db.free(&block_key_one).await?;

    // 4_096 (Default `chunk_size`)
    println!("{freed_bytes}");

    // Write bytes in the previously freed space
    let block_key_two = block_db.write(b"World!").await?;

    // Batch writes then frees
    let BatchResult {
        freed_bytes,
        new_block_keys,
    } = block_db
        .batch(vec![b"Hallo", b"Welt!"], vec![&block_key_two])
        .await?;

    // 4_096 (Default `chunk_size`)
    println!("{freed_bytes}");

    // Read Bytes

    // None
    println!("{:?}", block_db.read(&block_key_one).await?);

    // None
    println!("{:?}", block_db.read(&block_key_two).await?);

    // Some(b"Hallo")
    println!("{:?}", block_db.read(&new_block_keys[0]).await?);

    // Some(b"Welt!")
    println!("{:?}", block_db.read(&new_block_keys[1]).await?);

    // Compact `DataFile`s by removing all free `DataBlock`s
    block_db.compact_data_files().await?;

    Ok(())
}

Usage Notes

This section contains very useful information on how the database works, but it does not cover the API in depth. For that, see the full crate documentation at docs.rs.

Options

When constructing a BlockDB, you can provide BlockDBOptions to configure two key parameters:

  • chunk_size: The size (in bytes) of a chunk within a DataBlock. Default: 4_096

  • max_file_size: The maximum size (in bytes) of a DataFile. Default: 4_096_000_000
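For example, creating a database with non-default options might look like this. This is a minimal sketch: it assumes BlockDBOptions can be built with struct-literal syntax and that the second argument to BlockDB::open is Option<BlockDBOptions>, as the example above suggests.

use block_db::{BlockDB, BlockDBOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical construction; the field names come from the docs above,
    // but the real type may use a builder or carry additional fields.
    let options = BlockDBOptions {
        chunk_size: 8_192,            // 8 KiB chunks instead of the 4_096 default
        max_file_size: 1_000_000_000, // ~1 GB per DataFile instead of ~4 GB
    };

    let mut block_db = BlockDB::open("./db", Some(options)).await?;
    let _block_key = block_db.write(b"configured").await?;

    Ok(())
}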

Mutability of Options

max_file_size can be changed later by re-opening the BlockDB with a new BlockDBOptions. chunk_size, however, cannot be changed after the initial creation of the database.

Persistence

These options are stored on disk in JSON format. After the initial creation, you may pass None to BlockDB::open, and it will automatically load the previously stored options.
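In practice, this means later opens can omit the options entirely, while max_file_size can still be adjusted on a subsequent open. A sketch under the same assumptions as above:

use block_db::{BlockDB, BlockDBOptions};

async fn reopen() -> Result<(), Box<dyn std::error::Error>> {
    {
        // Later opens can pass `None`; the options stored on disk are loaded.
        let _block_db = BlockDB::open("./db", None).await?;
    }

    // Re-opening with new options can change max_file_size,
    // but chunk_size must match the value used at creation.
    let options = BlockDBOptions {
        chunk_size: 8_192,            // unchanged from creation
        max_file_size: 2_000_000_000, // raised soft threshold
    };
    let _block_db = BlockDB::open("./db", Some(options)).await?;

    Ok(())
}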

Write Distribution

The max_file_size option can be a bit misleading. Rather than being a strict size limit, it functions more like a soft threshold—and even then, it may only be exceeded once per DataFile.

Example

Consider a fresh BlockDB instance with max_file_size set to 10 GB. Here's how writes are distributed:

1: Write 10 GB

  • DataFile (1) is created and filled with a single 10 GB block.

2: Write 1 GB

  • DataFile (1) is full, so DataFile (2) is created and receives the data.

3: Write 25 GB

  • DataFile (2) is not full, so the entire 25 GB block is written to it—exceeding max_file_size in the process.

4: Write 1 GB (three times)

  • DataFile (2) is now full, so DataFile (3) is created and receives the first 1 GB block.
  • The next two 1 GB writes also go to DataFile (3), which is still not full.

Resulting Distribution

DataFile(1):
 └── DataBlock(10 GB)

DataFile(2):
 ├── DataBlock(1 GB)
 └── DataBlock(25 GB)

DataFile(3):
 ├── DataBlock(1 GB)
 ├── DataBlock(1 GB)
 └── DataBlock(1 GB)

What if there are multiple non-full DataFiles?

There is no internal index of the creation order of DataFiles, so when multiple non-full DataFiles exist, the first non-full DataFile detected is written to. If you would like the ability to write to a specific DataFile, please create an issue or PR.
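Taken together, the placement rule can be modeled in a few lines. The following is a simplified model of the behavior described in this section, not the crate's actual implementation:

// Simplified model: each DataFile is tracked only by its used bytes.
fn place(files: &mut Vec<u64>, block: u64, max_file_size: u64) {
    // Write to the first DataFile still under the soft threshold,
    // even if this single write pushes it past max_file_size...
    if let Some(used) = files.iter_mut().find(|used| **used < max_file_size) {
        *used += block;
    } else {
        // ...otherwise create a new DataFile.
        files.push(block);
    }
}

fn main() {
    const GB: u64 = 1_000_000_000;
    let mut files: Vec<u64> = Vec::new();
    // Reproduces the walkthrough above: 10 GB, then 1 GB, then 25 GB, then 1 GB x3.
    for block in [10 * GB, GB, 25 * GB, GB, GB, GB] {
        place(&mut files, block, 10 * GB);
    }
    // DataFile(1) = 10 GB, DataFile(2) = 1 + 25 = 26 GB, DataFile(3) = 3 x 1 GB.
    assert_eq!(files, vec![10 * GB, 26 * GB, 3 * GB]);
}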

Corruption

Some methods are annotated in their doc-comments as either Non-corruptible or Corruptible. If a method is marked Corruptible, it's important to understand how to handle potential corruption scenarios.

If you encounter an Error::Corrupted, it will only come from a method marked Corruptible. This typically indicates a filesystem (FS) or hardware stability issue that caused an operation to fail and left the system in a corrupted state.

Before proceeding, ensure that the FS and hardware are stable. Then, attempt to recover by calling BlockDB::uncorrupt, passing in the action extracted from the Error::Corrupted { action, .. }. This operation may also fail, but it will continue to return Error::Corrupted, allowing you to retry uncorruption multiple times safely and without overlap.
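A recovery loop might look like the following. This is a hedged sketch: it assumes block_db::Error is the crate's error type with the Corrupted { action, .. } variant described above, that uncorrupt is an async method taking that action, and that write accepts a byte slice; check docs.rs for the exact signatures.

use block_db::{BlockDB, Error};

async fn write_with_recovery(
    block_db: &mut BlockDB,
    bytes: &[u8],
) -> Result<(), Box<dyn std::error::Error>> {
    match block_db.write(bytes).await {
        Ok(_block_key) => Ok(()),
        Err(Error::Corrupted { mut action, .. }) => loop {
            // Ensure the FS and hardware are stable before each attempt.
            match block_db.uncorrupt(action).await {
                Ok(_) => return Ok(()),
                // uncorrupt may itself fail, handing back another
                // Error::Corrupted; retrying is safe and does not overlap.
                Err(Error::Corrupted { action: next, .. }) => action = next,
                Err(other) => return Err(other.into()),
            }
        },
        Err(other) => Err(other.into()),
    }
}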

In the rare event that one or more DataFiles become deadlocked during an uncorrupt attempt, this signals a more serious issue—likely a problem with the write-ahead log (WAL) or the binary file itself. In such cases, automatic recovery is no longer possible.

Roadmap

  • Optimizations

  • DataBlock integrity feature

Contributing

Open to any contributions. All tests must pass, and new features or changes should "make sense" within the current API.

License

MIT License

Copyright (c) 2024 Robert Lopez

See LICENSE.md

Project status

I plan to continue maintaining this project for the foreseeable future.
