block-db
Local, multi-threaded, durable byte DB.
Installation
cargo add block-db
Overview
A `BlockDB` manages a write-ahead log (WAL) and a collection of `DataFile`s. Each `DataFile` maintains its own WAL and a binary file that stores `DataBlock`s, each composed of one or more chunks.
- The maximum size of a `DataFile` is configurable via `max_file_size`.
- The size of each chunk within a `DataBlock` is configurable via `chunk_size`.
Each `DataBlock` is associated with a `BlockKey`, allowing `BlockDB` to function as a persistent, atomic key-value store for arbitrary byte data.
`BlockDB` is designed to be durable, minimal, and predictable. Unlike many modern storage engines, it avoids hidden memory buffers, background threads, and delayed writes, making it ideal for systems where correctness and resource footprint matter more than raw throughput.
A key benefit of this approach is an almost nonexistent memory footprint, along with an ergonomic and reliable foundation for building higher-level DBMS layers.
This project is still in its early stages, and as it's developed alongside other database systems, major optimizations and refinements will continue to be made over time.
Example
use block_db::{batch::BatchResult, BlockDB};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut block_db = BlockDB::open("./db", None).await?;

    // Write bytes
    let block_key_one = block_db.write(b"Hello").await?;

    // Free bytes
    let freed_bytes = block_db.free(&block_key_one).await?;

    // 4_096 (Default `chunk_size`)
    println!("{freed_bytes}");

    // Write bytes in the previously freed space
    let block_key_two = block_db.write(b"World!").await?;

    // Batch writes then frees
    let BatchResult {
        freed_bytes,
        new_block_keys,
    } = block_db
        .batch(vec![b"Hallo", b"Welt!"], vec![&block_key_two])
        .await?;

    // 4_096 (Default `chunk_size`)
    println!("{freed_bytes}");

    // Read bytes
    // None
    println!("{:?}", block_db.read(&block_key_one).await?);

    // None
    println!("{:?}", block_db.read(&block_key_two).await?);

    // Some(b"Hallo")
    println!("{:?}", block_db.read(&new_block_keys[0]).await?);

    // Some(b"Welt!")
    println!("{:?}", block_db.read(&new_block_keys[1]).await?);

    // Compact `DataFile`s by removing all free `DataBlock`s
    block_db.compact_data_files().await?;

    Ok(())
}
Usage Notes
This section contains useful information on how the database works, but it does not cover syntax in depth; for that, see the full crate documentation at docs.rs.
Options
When constructing a `BlockDB`, you can provide `BlockDBOptions` to configure two key parameters:
- `chunk_size`: The size (in bytes) of a chunk within a `DataBlock`. Default: `4_096`
- `max_file_size`: The maximum size (in bytes) of a `DataFile`. Default: `4_096_000_000`
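As a rough sketch of how these options might be supplied when the database is first created (both the import path and the struct-literal construction of `BlockDBOptions` are assumptions; check the docs.rs documentation for the real shape):

```rust
use block_db::{BlockDB, BlockDBOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical values; the field names mirror the two options described above.
    let options = BlockDBOptions {
        chunk_size: 8_192,            // bytes per chunk within a `DataBlock`
        max_file_size: 1_024_000_000, // soft size threshold for each `DataFile`
    };

    // Passing `Some(options)` on the first open creates the database with these
    // settings (they are persisted; see the Persistence section below).
    let mut block_db = BlockDB::open("./db", Some(options)).await?;

    let key = block_db.write(b"configured").await?;
    println!("{:?}", block_db.read(&key).await?);

    Ok(())
}
```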
Mutability of Options
`max_file_size` can be changed later by re-opening the `BlockDB` with a new `BlockDBOptions`.
`chunk_size`, however, cannot be changed after the initial creation of the database.
Persistence
These options are stored on disk in JSON format. After the initial creation, you may pass `None` to `BlockDB::open`, and it will automatically load the previously stored options.
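The two notes above can be combined into a small sketch: re-opening with `None` loads the stored options, while re-opening with a fresh `BlockDBOptions` is how `max_file_size` gets changed (again, the import path and struct-literal construction are assumptions):

```rust
use block_db::{BlockDB, BlockDBOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Re-open with `None`: the options stored on disk (as JSON) are loaded.
    let block_db = BlockDB::open("./db", None).await?;
    drop(block_db);

    // Re-open with new options to raise `max_file_size`. `chunk_size` cannot
    // change after creation, so it is kept at its original (hypothetical) value.
    let new_options = BlockDBOptions {
        chunk_size: 8_192,
        max_file_size: 8_192_000_000,
    };
    let _block_db = BlockDB::open("./db", Some(new_options)).await?;

    Ok(())
}
```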
Write Distribution
The `max_file_size` option can be a bit misleading. Rather than being a strict size limit, it functions more like a soft threshold, and even then it may only be exceeded once per `DataFile`.
Example
Consider a fresh `BlockDB` instance with `max_file_size` set to 10 GB. Here's how writes are distributed:
1: Write 10 GB
- DataFile (1) is created and filled with a single 10 GB block.
2: Write 1 GB
- DataFile (1) is full, so DataFile (2) is created and receives the data.
3: Write 25 GB
- DataFile (2) is not full, so the entire 25 GB block is written to it, exceeding `max_file_size` in the process.
4: Write 1 GB (three times)
- DataFile (2) is now full, so DataFile (3) is created and receives the first 1 GB block.
- The next two 1 GB writes also go to DataFile (3), which is still not full.
Resulting Distribution
DataFile(1):
└── DataBlock(10 GB)
DataFile(2):
├── DataBlock(1 GB)
└── DataBlock(25 GB)
DataFile(3):
├── DataBlock(1 GB)
├── DataBlock(1 GB)
└── DataBlock(1 GB)
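A scaled-down sketch of the same behavior is shown below. The option values are hypothetical (and assume the same struct-literal construction as above); the per-file placement in the comments simply restates the rules described in this section:

```rust
use block_db::{BlockDB, BlockDBOptions};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical: default chunk size, but cap each `DataFile` at four chunks.
    let options = BlockDBOptions {
        chunk_size: 4_096,
        max_file_size: 16_384,
    };
    let mut block_db = BlockDB::open("./db-distribution", Some(options)).await?;

    // 1: ~20 KB -> DataFile (1) is created and filled past `max_file_size`; it is now full.
    let _k1 = block_db.write(&vec![0u8; 20_000]).await?;

    // 2: ~1 KB -> DataFile (1) is full, so DataFile (2) is created to receive it.
    let _k2 = block_db.write(&vec![1u8; 1_000]).await?;

    // 3: ~40 KB -> DataFile (2) is not yet full, so the whole block lands there,
    //    exceeding `max_file_size` once.
    let _k3 = block_db.write(&vec![2u8; 40_000]).await?;

    // 4: three ~1 KB writes -> DataFile (2) is now full, so DataFile (3) receives them.
    for byte in [3u8, 4, 5] {
        let _key = block_db.write(&vec![byte; 1_000]).await?;
    }

    Ok(())
}
```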
What if there are multiple non-full `DataFile`s?
There is no internal index of the creation order of `DataFile`s, so if there are multiple non-full `DataFile`s, the first non-full `DataFile` detected is written to. If you would like the ability to write to a specific `DataFile`, please create an issue or PR.
Corruption
Some methods are annotated in their doc-comments as either Non-corruptible or Corruptible. If a method is marked Corruptible, it's important to understand how to handle potential corruption scenarios.
If you encounter an `Error::Corrupted`, it will only come from a method marked as Corruptible. This typically indicates an issue with filesystem (FS) or hardware stability, which has caused an operation to fail and left the system in a corrupted state.
Before proceeding, ensure that the FS and hardware are stable. Then, attempt to recover by calling `BlockDB::uncorrupt`, passing in the action extracted from `Error::Corrupted { action, .. }`. This operation may also fail, but it will continue to return `Error::Corrupted`, allowing you to retry uncorruption multiple times safely and without overlap.
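A minimal sketch of that recovery flow follows; the module path of `Error`, the exact shape of its `Corrupted` variant, and the signature of `BlockDB::uncorrupt` are assumptions, so treat this as an outline rather than the crate's literal API:

```rust
use block_db::{error::Error, BlockDB};

// Hypothetical helper: write some bytes and, if the write reports corruption,
// retry `BlockDB::uncorrupt` until it succeeds (retrying is documented as safe).
async fn write_with_recovery(db: &mut BlockDB, bytes: &[u8]) -> Result<(), Error> {
    match db.write(bytes).await {
        Ok(_block_key) => Ok(()),
        Err(Error::Corrupted { action, .. }) => {
            // Ensure the filesystem and hardware are stable before retrying.
            let mut action = action;
            loop {
                match db.uncorrupt(action).await {
                    Ok(_) => return Ok(()),
                    Err(Error::Corrupted { action: next, .. }) => action = next,
                    Err(other) => return Err(other),
                }
            }
        }
        Err(other) => Err(other),
    }
}
```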
In the rare event that one or more `DataFile`s become deadlocked during an uncorrupt attempt, this signals a more serious issue, likely a problem with the write-ahead log (WAL) or the binary file itself. In such cases, automatic recovery is no longer possible.
Roadmap
- Optimizations
- `DataBlock` integrity feature
Contributing
Open to any contributions. All tests must pass, and new features or changes should "make sense" based on the current API.
License
MIT License
Copyright (c) 2024 Robert Lopez
See LICENSE.md
Project status
I plan to continue maintaining this project for the foreseeable future.