4 releases (2 breaking)
Uses new Rust 2024
new 0.3.1 | Apr 29, 2025 |
---|---|
0.3.0 | Apr 27, 2025 |
0.2.0 | Apr 24, 2025 |
0.1.0 | Apr 11, 2025 |
#27 in Accessibility
321 downloads per month
Used in anda_db
98KB
1.5K
SLoC
Anda-DB B-tree Index Library
A high-performance B-tree based index implementation for Anda-DB, optimized for concurrent access and efficient range queries.
Features
- Support for various data types: Index fields of u64, i64, String, binary data and more
- Efficient range queries: Optimized for fast range-based lookups
- Prefix search: Specialized support for string prefix searches
- Efficient serialization: Fast CBOR-based serialization and deserialization
- Incremental Persistent: Support incremental index updates persistent (insertions and deletions)
- Thread-safe concurrent access: Safely use the index from multiple threads
Usage
Add this to your Cargo.toml
:
[dependencies]
anda_db_btree = "0.1.0"
Basic Example
use std::io::{Read, Write};
use anda_db_btree::{BTreeConfig, BTreeIndex, RangeQuery};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create a new B-tree index
let config = BTreeConfig {
bucket_overload_size: 1024 * 512, // 512KB per bucket
allow_duplicates: true,
};
let index = BTreeIndex::<u64, String>::new("my_index".to_string(), Some(config));
// Insert some data
let now_ms = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap()
.as_millis() as u64;
let apple = "apple".to_string();
let banana = "banana".to_string();
let cherry = "cherry".to_string();
let date = "date".to_string();
let berry = "berry".to_string();
index.insert(1, apple.clone(), now_ms).unwrap();
index.insert(2, banana.clone(), now_ms).unwrap();
index.insert(3, cherry.clone(), now_ms).unwrap();
index.insert(4, date.clone(), now_ms).unwrap();
index.insert(5, berry.clone(), now_ms).unwrap();
// Search for exact matches
let result = index.search_with(&apple, |ids| Some(ids.clone()));
assert!(result.is_some());
println!("Documents with 'apple': {:?}", result.unwrap());
// Range queries
let query = RangeQuery::Between(banana.clone(), date.clone());
let results = index.search_range_with(query, |k, ids| {
println!("Key: {}, IDs: {:?}", k, ids);
(true, vec![k.clone()])
});
println!("Keys in range: {:?}", results);
// Prefix search (for String keys)
let results =
index.search_prefix_with("app", |k, ids| (true, Some((k.to_string(), ids.clone()))));
println!("Keys with prefix 'app': {:?}", results);
// persist the index to files
{
let metadata = std::fs::File::create("debug/btree_demo/metadata.cbor")?;
index
.flush(metadata, now_ms, async |id, data| {
let mut bucket =
std::fs::File::create(format!("debug/btree_demo/bucket_{id}.cbor"))?;
bucket.write_all(data)?;
Ok(true)
})
.await?;
}
// Load the index from metadata
let mut index2 = BTreeIndex::<String, u64>::load_metadata(std::fs::File::open(
"debug/btree_demo/metadata.cbor",
)?)?;
assert_eq!(index2.name(), "my_index");
assert_eq!(index2.len(), 0);
// Load the index data
index2
.load_buckets(async |id: u32| {
let mut file = std::fs::File::open(format!("debug/btree_demo/bucket_{id}.cbor"))?;
let mut data = Vec::new();
file.read_to_end(&mut data)?;
Ok(Some(data))
})
.await?;
assert_eq!(index2.len(), 5);
let result = index.search_with(&apple, |ids| Some(ids.clone()));
assert!(result.is_some());
// Remove data
let ok = index.remove(1, apple.clone(), now_ms);
assert!(ok);
let result = index.search_with(&apple, |ids| Some(ids.clone()));
assert!(result.is_none());
println!("OK");
Ok(())
}
Configuration
The BTreeConfig
struct allows customizing the index behavior:
let config = BTreeConfig {
// Maximum size of a bucket before creating a new one (in bytes)
bucket_overload_size: 1024 * 512, // 512KB
// Whether to allow duplicate keys
// If false, attempting to insert a duplicate key will result in an error
allow_duplicates: true,
};
Performance Considerations
- Bucket Size: Adjust
bucket_overload_size
based on your data characteristics. - Concurrency: The index is designed for concurrent access, using lock-free data structures where possible.
- Memory Usage: The index keeps all data in memory for fast access. For very large datasets, consider using multiple smaller indices.
Error Handling
The library provides a comprehensive error type BTreeError
that covers various failure scenarios:
BTreeError::Generic
: General index-related errorsBTreeError::Serialization
: CBOR serialization/deserialization errorsBTreeError::NotFound
: When a requested value is not found in the indexBTreeError::AlreadyExists
: When trying to insert a duplicate key withallow_duplicates
set to false
License
Copyright © 2025 LDC Labs.
ldclabs/anda-db
is licensed under the MIT License. See LICENSE for the full license text.
Dependencies
~2.6–8.5MB
~71K SLoC