8 stable releases
| Version | Released |
|---|---|
| 3.1.0 (new) | Oct 17, 2025 |
| 3.0.0 | Jun 17, 2025 |
| 2.2.1 | Jan 14, 2025 |
| 2.1.0 | Oct 4, 2024 |
| 1.0.0-rc.1 | Jun 26, 2023 |
GroveDB
GroveDB: Hierarchical Authenticated Data Structure Database
GroveDB is a high-performance, cryptographically verifiable database that implements a hierarchical authenticated data structure: data is organized as a "grove" in which each tree is a Merkle AVL tree (Merk). This design addresses a core limitation of flat authenticated data structures by enabling efficient queries on any indexed field while maintaining cryptographic proofs throughout the hierarchy.
Built on research into hierarchical authenticated data structures, GroveDB provides the foundational storage layer for Dash Platform and is flexible enough for any application that requires trustless data verification.
Table of Contents
- Key Features
- Architecture Overview
- Core Concepts
- Getting Started
- Usage Examples
- Query System
- Performance
- Documentation
- Contributing
Key Features
Hierarchical Tree-of-Trees Structure
- Organize data naturally in nested hierarchies
- Each subtree is a fully authenticated Merkle AVL tree
- Efficient navigation and organization of complex data
Efficient Secondary Index Queries
- Pre-computed secondary indices stored as subtrees
- O(log n) query performance on any indexed field
- No need to scan entire dataset for non-primary key queries
Cryptographic Proofs
- Generate proofs for any query result
- Supports membership, non-membership, and range proofs
- Minimal proof sizes through optimized algorithms
- Layer-by-layer verification from root to leaves
High Performance
- Built on RocksDB for reliable persistent storage
- Batch operations for atomic updates across multiple trees
- Intelligent caching system (MerkCache) for frequently accessed data
- Cost tracking for all operations
Advanced Reference System
- 7 types of references for complex data relationships
- Automatic reference following (configurable hop limits)
- Cycle detection prevents infinite loops
- Cross-tree data linking without duplication
Built-in Aggregations
- Sum trees for automatic value totaling
- Count trees for element counting
- Combined count+sum trees
- Big sum trees for 256-bit integers
Cross-Platform Support
- Native Rust implementation
- Runs on x86, ARM (including Raspberry Pi), and WebAssembly
The Forest Architecture: Why Hierarchical Matters
Traditional authenticated data structures face a fundamental limitation: they can only efficiently prove queries on a single index (typically the primary key). Secondary index queries require traversing the entire structure, resulting in large proofs and poor performance.
GroveDB's breakthrough is using a hierarchical authenticated data structure - a forest where each tree is a Merk (Merkle AVL tree). This architecture enables:
The Forest Metaphor
- Grove: The entire database - a forest of interconnected trees
- Trees: Individual Merk trees, each serving as either:
- Data Trees: Storing actual key-value pairs
- Index Trees: Storing references for secondary indices
- Aggregate Trees: Maintaining sums, counts, or other computations
- Root Hash: A single cryptographic commitment to the entire forest state
Hierarchical Authentication
Each Merk tree maintains its own root hash, and parent trees store these hashes as values. This creates a hierarchy where:
- The topmost tree's root hash authenticates the entire database
- Each subtree can be independently verified
- Proofs can be generated for any path through the hierarchy
- Updates propagate upward, maintaining consistency
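The upward propagation described above can be pictured with a small, self-contained sketch. It does not use GroveDB's API: `Node`, `root_hash`, and `build` are hypothetical names, and `DefaultHasher` stands in for a cryptographic hash purely for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

// Stand-in hash for illustration only: DefaultHasher is NOT cryptographic.
fn h(parts: &[&[u8]]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        part.hash(&mut hasher);
    }
    hasher.finish()
}

// Hypothetical node type: a leaf value or a nested subtree.
enum Node {
    Item(Vec<u8>),
    Tree(BTreeMap<Vec<u8>, Node>),
}

// A tree's hash commits to every key and to each child's hash, so the
// parent effectively stores the root hashes of its subtrees.
fn root_hash(node: &Node) -> u64 {
    match node {
        Node::Item(value) => h(&[b"item".as_slice(), value.as_slice()]),
        Node::Tree(children) => {
            let mut acc = h(&[b"tree".as_slice()]);
            for (key, child) in children {
                let child_hash = root_hash(child).to_be_bytes();
                acc = h(&[
                    acc.to_be_bytes().as_slice(),
                    key.as_slice(),
                    child_hash.as_slice(),
                ]);
            }
            acc
        }
    }
}

// Builds root -> users -> alice -> age with the given age value.
fn build(age: &[u8]) -> Node {
    let mut alice = BTreeMap::new();
    alice.insert(b"age".to_vec(), Node::Item(age.to_vec()));
    let mut users = BTreeMap::new();
    users.insert(b"alice".to_vec(), Node::Tree(alice));
    let mut root = BTreeMap::new();
    root.insert(b"users".to_vec(), Node::Tree(users));
    Node::Tree(root)
}

fn main() {
    let before = root_hash(&build(b"30"));
    let after = root_hash(&build(b"31"));
    // A change three levels down alters the topmost root hash.
    println!("root changed: {}", before != after);
}
```

Because each level only hashes its own keys and its children's hashes, a verifier who trusts the topmost hash can check any single path without seeing the rest of the forest.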
Efficiency Gains
By pre-computing and storing secondary indices as separate trees:
- Query any index with O(log n) complexity
- Generate minimal proofs (only the path taken)
- Update indices atomically with data
- Maintain multiple views of the same data
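As a rough illustration of why a pre-computed index avoids full scans, the sketch below keeps a primary map and an age index side by side using only `std::collections::BTreeMap`. `Store`, `User`, and `ids_with_age` are hypothetical names chosen for this example, not GroveDB types.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct User {
    age: u8,
    city: String,
}

#[derive(Default)]
struct Store {
    // Primary "tree": user id -> user record.
    users: BTreeMap<String, User>,
    // Index "tree": (age, user id) -> user id, kept sorted by age.
    by_age: BTreeMap<(u8, String), String>,
}

impl Store {
    // Updates the record and its index entry together.
    fn insert(&mut self, id: &str, user: User) {
        if let Some(old) = self.users.get(id) {
            // Drop the stale index entry when a record is replaced.
            self.by_age.remove(&(old.age, id.to_string()));
        }
        self.by_age.insert((user.age, id.to_string()), id.to_string());
        self.users.insert(id.to_string(), user);
    }

    // Seeks to the first entry for `age` in O(log n), then walks only
    // the matching entries instead of scanning every user.
    fn ids_with_age(&self, age: u8) -> Vec<&str> {
        self.by_age
            .range((age, String::new())..)
            .take_while(|((a, _), _)| *a == age)
            .map(|(_, id)| id.as_str())
            .collect()
    }
}

fn main() {
    let mut store = Store::default();
    store.insert("user1", User { age: 25, city: "Boston".into() });
    store.insert("user2", User { age: 30, city: "Austin".into() });
    store.insert("user3", User { age: 25, city: "Denver".into() });
    println!("{:?}", store.ids_with_age(25));
}
```

The cost of this layout is that every write touches both maps, which is why updating indices atomically with the data matters.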
Architecture Overview
GroveDB combines several innovative components:
┌─────────────────────────────────────────────────────────┐
│                      GroveDB Core                       │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐   │
│  │   Element   │  │    Query     │  │     Proof     │   │
│  │   System    │  │    Engine    │  │   Generator   │   │
│  └─────────────┘  └──────────────┘  └───────────────┘   │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐   │
│  │    Batch    │  │  Reference   │  │    Version    │   │
│  │ Operations  │  │   Resolver   │  │  Management   │   │
│  └─────────────┘  └──────────────┘  └───────────────┘   │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                       Merk Layer                        │
│            (Merkle AVL Tree Implementation)             │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐   │
│  │  AVL Tree   │  │    Proof     │  │     Cost      │   │
│  │  Balancing  │  │    System    │  │   Tracking    │   │
│  └─────────────┘  └──────────────┘  └───────────────┘   │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                      Storage Layer                      │
│                  (RocksDB Abstraction)                  │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐   │
│  │  Prefixed   │  │ Transaction  │  │     Batch     │   │
│  │   Storage   │  │   Support    │  │  Processing   │   │
│  └─────────────┘  └──────────────┘  └───────────────┘   │
└─────────────────────────────────────────────────────────┘
Component Details
- GroveDB Core: Orchestrates multiple Merk trees into a unified hierarchical database
- Merk: High-performance Merkle AVL tree implementation with proof generation
- Storage: Abstract storage layer with RocksDB backend, supporting transactions and batching
- Costs: Comprehensive resource tracking for all operations
- Version Management: Protocol versioning for smooth upgrades
Core Concepts
The Foundation: Merk Trees
At the heart of GroveDB's forest are Merk trees - highly optimized Merkle AVL trees that serve as the building blocks of the hierarchical structure:
- Self-Balancing: AVL algorithm ensures O(log n) operations
- Authenticated: Every node contains cryptographic hashes
- Efficient Proofs: Generate compact proofs for any query
- Rich Features: Built-in support for sums, counts, and aggregations
Each Merk tree in the grove can reference other Merk trees, creating a powerful hierarchical system where authentication flows from leaves to root.
Elements
GroveDB supports 8 element types:
// Basic storage
Element::Item(value, flags) // Arbitrary bytes
Element::Reference(path, max_hops) // Link to another element
Element::Tree(root_hash) // Subtree container
// Aggregation types
Element::SumItem(value) // Contributes to sum
Element::SumTree(root_hash, sum) // Maintains sum of descendants
Element::BigSumTree(root_hash, sum) // 256-bit sums
Element::CountTree(root_hash, count) // Element counting
Element::CountSumTree(root_hash, count, sum) // Combined
Hierarchical Paths
Data is organized using paths:
// Path: ["users", "alice"], key: "balance"
db.insert(
    &[b"users", b"alice"],
    b"balance",
    Element::new_item(b"100"),
    None,
    None,
    grove_version
)?;
Reference Types
Seven reference types enable complex relationships:
- AbsolutePathReference: Direct path from root
- UpstreamRootHeightReference: Go up N levels, then follow path
- UpstreamFromElementHeightReference: Relative to current element
- CousinReference: Same level, different branch
- SiblingReference: Same parent tree
- UtilityReference: Special system references
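Reference following with a hop limit and cycle detection, as mentioned under Key Features, can be sketched as follows. This is a simplified model that handles only absolute paths; `SketchElement`, `ResolveError`, `resolve`, and the helper `p` are hypothetical names, not GroveDB's types.

```rust
use std::collections::{HashMap, HashSet};

type Path = Vec<String>;

// Hypothetical, simplified element: a value or an absolute reference.
#[derive(Debug, Clone, PartialEq)]
enum SketchElement {
    Item(Vec<u8>),
    Reference(Path),
}

#[derive(Debug, PartialEq)]
enum ResolveError {
    Missing(Path),
    Cycle(Path),
    HopLimit,
}

// Follows references from `start` until an item is found, refusing to
// revisit a path (cycle) or to follow more than `max_hops` references.
fn resolve(
    db: &HashMap<Path, SketchElement>,
    start: &Path,
    max_hops: usize,
) -> Result<Vec<u8>, ResolveError> {
    let mut seen = HashSet::new();
    let mut current = start.clone();
    let mut hops = 0;
    loop {
        if !seen.insert(current.clone()) {
            return Err(ResolveError::Cycle(current));
        }
        match db.get(&current) {
            None => return Err(ResolveError::Missing(current)),
            Some(SketchElement::Item(value)) => return Ok(value.clone()),
            Some(SketchElement::Reference(target)) => {
                if hops == max_hops {
                    return Err(ResolveError::HopLimit);
                }
                hops += 1;
                current = target.clone();
            }
        }
    }
}

// Convenience constructor for paths in this sketch.
fn p(segments: &[&str]) -> Path {
    segments.iter().map(|s| s.to_string()).collect()
}

fn main() {
    let mut db = HashMap::new();
    db.insert(p(&["users", "alice"]), SketchElement::Item(b"Alice".to_vec()));
    db.insert(
        p(&["indexes", "by_age", "30"]),
        SketchElement::Reference(p(&["users", "alice"])),
    );
    println!("{:?}", resolve(&db, &p(&["indexes", "by_age", "30"]), 10));
}
```

The relative reference types listed above would differ only in how the target path is computed before the lookup; the hop counting and the visited set stay the same.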
Getting Started
Requirements
- Rust 1.74+ (nightly)
- RocksDB dependencies
Installation
Add to your Cargo.toml:
[dependencies]
grovedb = "3.0"
Basic Setup
use grovedb::{GroveDb, Element};
use grovedb_version::version::GroveVersion;
// Open database
let db = GroveDb::open("./my_db")?;
let grove_version = GroveVersion::latest();
// Create a tree structure
db.insert(&[], b"users", Element::new_tree(None), None, None, grove_version)?;
db.insert(&[b"users"], b"alice", Element::new_tree(None), None, None, grove_version)?;
// Insert data
db.insert(
    &[b"users", b"alice"],
    b"age",
    Element::new_item(b"30"),
    None,
    None,
    grove_version
)?;
// Query data
let age = db.get(&[b"users", b"alice"], b"age", None, grove_version)?;
Usage Examples
Building Your Forest: From Trees to Grove
The following examples demonstrate how individual Merk trees combine to form a powerful hierarchical database.
Conceptual Structure
Grove Root (Single Merk Tree)
├── users (Merk Tree)
│   ├── alice (Merk Tree)
│   │   ├── name: "Alice"
│   │   ├── age: 30
│   │   └── city: "Boston"
│   └── bob (Merk Tree)
│       ├── name: "Bob"
│       └── age: 25
├── indexes (Merk Tree)
│   ├── by_age (Merk Tree)
│   │   ├── 25 → Reference(/users/bob)
│   │   └── 30 → Reference(/users/alice)
│   └── by_city (Merk Tree)
│       └── Boston → Reference(/users/alice)
└── accounts (Sum Tree - Special Merk)
    ├── alice: 100 (contributes to sum)
    ├── bob: 200 (contributes to sum)
    └── [Automatic sum: 300]
Each node marked as "Merk Tree" is an independent authenticated data structure with its own root hash, all linked together in the hierarchy.
Creating Secondary Indexes
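use grovedb::reference_path::ReferencePathType;
// Create user data (assumes the "users" tree from Basic Setup already exists)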
db.insert(&[b"users"], b"user1", Element::new_tree(None), None, None, grove_version)?;
db.insert(&[b"users", b"user1"], b"age", Element::new_item(b"25"), None, None, grove_version)?;
db.insert(&[b"users", b"user1"], b"city", Element::new_item(b"Boston"), None, None, grove_version)?;
// Create indexes
db.insert(&[], b"indexes", Element::new_tree(None), None, None, grove_version)?;
db.insert(&[b"indexes"], b"by_age", Element::new_tree(None), None, None, grove_version)?;
db.insert(&[b"indexes"], b"by_city", Element::new_tree(None), None, None, grove_version)?;
// Add references in indexes
db.insert(
    &[b"indexes", b"by_age"],
    b"25_user1",
    Element::new_reference(ReferencePathType::absolute_path(vec![
        b"users".to_vec(),
        b"user1".to_vec()
    ])),
    None,
    None,
    grove_version
)?;
Using Sum Trees
// Create account structure with balances
db.insert(&[], b"accounts", Element::new_sum_tree(None, 0), None, None, grove_version)?;
// Add accounts with balances
db.insert(&[b"accounts"], b"alice", Element::new_sum_item(100), None, None, grove_version)?;
db.insert(&[b"accounts"], b"bob", Element::new_sum_item(200), None, None, grove_version)?;
db.insert(&[b"accounts"], b"charlie", Element::new_sum_item(150), None, None, grove_version)?;
// Get total sum (automatically maintained)
let sum_tree = db.get(&[], b"accounts", None, grove_version)?;
// sum_tree now contains Element::SumTree with sum = 450
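To make the "automatically maintained" part concrete, here is a minimal sketch of how a cached total can be kept in step with inserts and removals. `SumTreeSketch` is a hypothetical stand-in built on `BTreeMap`; it is not how GroveDB stores sum trees internally.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for a sum tree: items plus a cached total.
#[derive(Default)]
struct SumTreeSketch {
    items: BTreeMap<Vec<u8>, i64>,
    sum: i64,
}

impl SumTreeSketch {
    // Inserts or replaces an item and adjusts the cached sum by the
    // difference, so the total never needs to be recomputed.
    fn insert(&mut self, key: &[u8], value: i64) {
        let old = self.items.insert(key.to_vec(), value).unwrap_or(0);
        self.sum += value - old;
    }

    // Removes an item and subtracts its contribution.
    fn remove(&mut self, key: &[u8]) {
        if let Some(old) = self.items.remove(key) {
            self.sum -= old;
        }
    }
}

fn main() {
    let mut accounts = SumTreeSketch::default();
    accounts.insert(b"alice", 100);
    accounts.insert(b"bob", 200);
    accounts.insert(b"charlie", 150);
    println!("sum = {}", accounts.sum);
    accounts.remove(b"charlie");
    println!("sum after removal = {}", accounts.sum);
}
```

Reading the total is then a single field access, whatever the number of items.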
Batch Operations
use grovedb::batch::GroveDbOp;
let ops = vec![
    GroveDbOp::insert_op(vec![b"users"], b"alice", Element::new_tree(None)),
    GroveDbOp::insert_op(vec![b"users", b"alice"], b"name", Element::new_item(b"Alice")),
    GroveDbOp::insert_op(vec![b"users", b"alice"], b"age", Element::new_item(b"30")),
];
// Apply atomically
db.apply_batch(ops, None, None, grove_version)?;
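The all-or-nothing behaviour of a batch can be sketched independently of GroveDB: stage every operation on a copy, and commit the copy only if all operations validate. `Op`, `Value`, `Db`, and this `apply_batch` are hypothetical names for the illustration; a real implementation would use storage transactions rather than cloning.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Tree,
    Item(Vec<u8>),
}

// One insert: place `value` under `key` inside the tree at `path`.
struct Op {
    path: Vec<String>,
    key: String,
    value: Value,
}

// Full path (including the key) -> value.
type Db = BTreeMap<Vec<String>, Value>;

// Applies every op to a staged copy; the real database is replaced only
// if all ops succeed, so a failing batch leaves it untouched.
fn apply_batch(db: &mut Db, ops: &[Op]) -> Result<(), String> {
    let mut staged = db.clone();
    for op in ops {
        // The empty path is the root; any other parent must be a tree.
        if !op.path.is_empty() && staged.get(&op.path) != Some(&Value::Tree) {
            return Err(format!("parent {:?} is not a tree", op.path));
        }
        let mut full = op.path.clone();
        full.push(op.key.clone());
        staged.insert(full, op.value.clone());
    }
    *db = staged;
    Ok(())
}

// Convenience constructor for ops in this sketch.
fn op(path: &[&str], key: &str, value: Value) -> Op {
    Op {
        path: path.iter().map(|s| s.to_string()).collect(),
        key: key.to_string(),
        value,
    }
}

fn main() {
    let mut db = Db::new();
    let ops = vec![
        op(&[], "users", Value::Tree),
        op(&["users"], "alice", Value::Tree),
        op(&["users", "alice"], "name", Value::Item(b"Alice".to_vec())),
    ];
    println!("{:?}, {} entries", apply_batch(&mut db, &ops), db.len());
}
```

Later operations in the same batch can see earlier ones because validation runs against the staged copy, which is what lets a tree and its contents be created together.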
Generating Proofs
use grovedb::query::PathQuery;
use grovedb_merk::proofs::Query;
// Create a path query
let path_query = PathQuery::new_unsized(
    vec![b"users".to_vec()],
    Query::new_range_full(),
);
// Generate proof
let proof = db.prove_query(&path_query, None, None, grove_version)?;
// Verify proof independently
let (root_hash, results) = GroveDb::verify_query(proof.as_slice(), &path_query, grove_version)?;
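The idea behind verifying a proof without access to the database can be shown with a plain binary Merkle tree. This is a deliberately simplified sketch: Merk proofs are built over an AVL tree and carry keys and values, whereas the code below proves membership of a leaf in a fixed list. The function names are hypothetical, and `DefaultHasher` is a non-cryptographic stand-in.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in hashes for illustration only (not cryptographic). The tag
// byte keeps leaf hashes and inner-node hashes in separate domains.
fn hash_leaf(data: &str) -> u64 {
    let mut s = DefaultHasher::new();
    0u8.hash(&mut s);
    data.hash(&mut s);
    s.finish()
}

fn hash_pair(left: u64, right: u64) -> u64 {
    let mut s = DefaultHasher::new();
    1u8.hash(&mut s);
    left.hash(&mut s);
    right.hash(&mut s);
    s.finish()
}

// Builds every level bottom-up; a node without a sibling is promoted.
fn levels(leaves: &[&str]) -> Vec<Vec<u64>> {
    let mut out = vec![leaves.iter().map(|l| hash_leaf(l)).collect::<Vec<u64>>()];
    while out.last().unwrap().len() > 1 {
        let prev = out.last().unwrap();
        let next: Vec<u64> = prev
            .chunks(2)
            .map(|c| if c.len() == 2 { hash_pair(c[0], c[1]) } else { c[0] })
            .collect();
        out.push(next);
    }
    out
}

// Proof for the leaf at `index`: per level, the sibling hash and
// whether that sibling sits on the left.
fn prove(levels: &[Vec<u64>], mut index: usize) -> Vec<(u64, bool)> {
    let mut proof = Vec::new();
    for level in &levels[..levels.len() - 1] {
        let sibling = index ^ 1;
        if sibling < level.len() {
            proof.push((level[sibling], sibling < index));
        }
        index /= 2;
    }
    proof
}

// Recomputes the root from the leaf and the proof alone.
fn verify(root: u64, leaf: &str, proof: &[(u64, bool)]) -> bool {
    let mut acc = hash_leaf(leaf);
    for &(sibling, sibling_is_left) in proof {
        acc = if sibling_is_left {
            hash_pair(sibling, acc)
        } else {
            hash_pair(acc, sibling)
        };
    }
    acc == root
}

fn main() {
    let lv = levels(&["alice", "bob", "carol"]);
    let root = lv.last().unwrap()[0];
    let proof = prove(&lv, 1);
    println!("bob verified: {}", verify(root, "bob", &proof));
}
```

The verifier needs only the trusted root hash, the claimed leaf, and one sibling hash per level, which is why proof size grows with tree depth rather than with the amount of data.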
Query System
Basic Queries
// Get all items in a subtree
let query = Query::new_range_full();
let path_query = PathQuery::new_unsized(vec![b"users".to_vec()], query);
let results = db.query(&path_query, false, false, None, grove_version)?;
Range Queries
// Get users with names from "A" to "M"
let mut query = Query::new();
query.insert_range(b"A".to_vec()..b"N".to_vec());
let path_query = PathQuery::new_unsized(vec![b"users".to_vec()], query);
let results = db.query(&path_query, false, false, None, grove_version)?;
Complex Queries with Subqueries
// Get all users and their documents
let mut query = Query::new_with_subquery_key(b"documents".to_vec());
let path_query = PathQuery::new_unsized(vec![b"users".to_vec()], query);
let results = db.query(&path_query, false, false, None, grove_version)?;
Query Types
GroveDB supports 10 query item types:
- Key(key) - Exact key match
- Range(start..end) - Exclusive range
- RangeInclusive(start..=end) - Inclusive range
- RangeFull(..) - All keys
- RangeFrom(start..) - From key onwards
- RangeTo(..end) - Up to key
- RangeToInclusive(..=end) - Up to and including key
- RangeAfter(prev..) - After specific key
- RangeAfterTo(prev..end) - After key up to end
- RangeAfterToInclusive(prev..=end) - After key up to and including end
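One way to read the ten item types is as pairs of lower and upper bounds over a sorted key space. The sketch below maps each of them to `std::ops::Bound` values and runs them against a `BTreeMap`. The enum mirrors the names listed above but is a hypothetical re-creation using `String` keys, not the library's own type.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{self, Excluded, Included, Unbounded};

// Hypothetical mirror of the ten query item types, with String keys.
enum QueryItem {
    Key(String),
    Range(String, String),
    RangeInclusive(String, String),
    RangeFull,
    RangeFrom(String),
    RangeTo(String),
    RangeToInclusive(String),
    RangeAfter(String),
    RangeAfterTo(String, String),
    RangeAfterToInclusive(String, String),
}

impl QueryItem {
    // Lower and upper bound for each item type. The "After" variants
    // differ from the others only in excluding the starting key.
    fn bounds(&self) -> (Bound<&String>, Bound<&String>) {
        match self {
            Self::Key(k) => (Included(k), Included(k)),
            Self::Range(s, e) => (Included(s), Excluded(e)),
            Self::RangeInclusive(s, e) => (Included(s), Included(e)),
            Self::RangeFull => (Unbounded, Unbounded),
            Self::RangeFrom(s) => (Included(s), Unbounded),
            Self::RangeTo(e) => (Unbounded, Excluded(e)),
            Self::RangeToInclusive(e) => (Unbounded, Included(e)),
            Self::RangeAfter(p) => (Excluded(p), Unbounded),
            Self::RangeAfterTo(p, e) => (Excluded(p), Excluded(e)),
            Self::RangeAfterToInclusive(p, e) => (Excluded(p), Included(e)),
        }
    }
}

// Returns matching keys in order. BTreeMap::range panics on inverted
// bounds, so this sketch assumes start <= end.
fn run<'a>(tree: &'a BTreeMap<String, u32>, item: &QueryItem) -> Vec<&'a str> {
    tree.range::<String, _>(item.bounds())
        .map(|(k, _)| k.as_str())
        .collect()
}

// Small sorted key set used by the demo: "a", "b", "c", "d".
fn sample() -> BTreeMap<String, u32> {
    ["a", "b", "c", "d"].iter().map(|k| (k.to_string(), 0)).collect()
}

fn main() {
    let tree = sample();
    let item = QueryItem::RangeAfterToInclusive("a".into(), "c".into());
    println!("{:?}", run(&tree, &item));
}
```

The "After" forms are convenient for pagination: passing the last key of the previous page as `prev` resumes the scan without returning that key again.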
Advanced Query Features (v2+)
Parent Tree Inclusion: When performing subqueries, you can include the parent tree element itself in the results:
let mut query = Query::new();
query.insert_key(b"users".to_vec());
query.set_subquery(Query::new_range_full());
query.add_parent_tree_on_subquery = true; // Include parent tree
let path_query = PathQuery::new_unsized(vec![], query);
let results = db.query(&path_query, false, false, None, grove_version)?;
// Results include both the "users" tree element AND its contents
This is particularly useful for count trees and sum trees where you want both the aggregate value and the individual elements.
Performance
The Power of Hierarchical Structure
The forest architecture delivers exceptional performance by leveraging the hierarchical nature of Merk trees:
Query Performance
- Primary Index: O(log n) - Direct path through single Merk tree
- Secondary Index: O(log n) - Pre-computed index trees eliminate full scans
- Proof Generation: O(log n) - Only nodes on the query path
- Proof Size: Minimal - Proportional to tree depth, not data size
Compare this to flat structures where secondary index queries require O(n) scans and generate O(n) sized proofs!
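To put rough numbers on "proportional to tree depth", the sketch below computes the best-case and worst-case height of an AVL tree for a given number of keys, using the standard recurrence for the fewest nodes an AVL tree of a given height can hold. It is plain arithmetic and makes no claim about GroveDB's actual proof encoding; the function names are hypothetical.

```rust
// Height (in nodes on the longest root-to-leaf path) of a perfectly
// balanced binary tree with `n` nodes: ceil(log2(n + 1)), which is the
// bit length of `n`.
fn min_height(n: u64) -> u32 {
    u64::BITS - n.leading_zeros()
}

// Largest height an AVL tree with `n` nodes can reach. Uses
// min_nodes(h) = min_nodes(h - 1) + min_nodes(h - 2) + 1 and returns the
// largest h whose minimum node count still fits in `n`.
fn avl_max_height(n: u64) -> u32 {
    if n == 0 {
        return 0;
    }
    let (mut prev, mut cur) = (0u64, 1u64); // min_nodes(0), min_nodes(1)
    let mut height = 1;
    loop {
        let next = match cur.checked_add(prev).and_then(|s| s.checked_add(1)) {
            Some(v) => v,
            None => return height, // larger trees would overflow u64
        };
        if next > n {
            return height;
        }
        prev = cur;
        cur = next;
        height += 1;
    }
}

fn main() {
    for n in [1_000u64, 1_000_000, 1_000_000_000] {
        println!(
            "{} keys: path length between {} and {} nodes",
            n,
            min_height(n),
            avl_max_height(n)
        );
    }
}
```

For one million keys the path is between 20 and 28 nodes long, so a single-key lookup or proof touches a few dozen nodes rather than the million a full scan would visit.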
Benchmarks
Time to run the full test suite on different hardware:
| Hardware | Full Test Suite |
|---|---|
| Raspberry Pi 4 | 2m 58s |
| AMD Ryzen 5 1600AF | 34s |
| AMD Ryzen 5 3600 | 26s |
| Apple M1 Pro | 19s |
Optimization Features
- MerkCache: Keeps frequently accessed Merk trees in memory
- Batch Operations: Update multiple trees atomically in single transaction
- Cost Tracking: Fine-grained resource monitoring per tree operation
- Lazy Loading: Load only required nodes from Merk trees
- Prefix Iteration: Efficient traversal within subtrees
- Root Hash Propagation: Optimized upward hash updates through tree hierarchy
Documentation
Detailed Documentation
- Merk - Merkle AVL Tree
- Merk Deep Dive - Nodes, Proofs, and State
- Storage Abstraction Layer
- GroveDB Core
- Cost Tracking System
- Auxiliary Crates
Examples
See the examples directory for:
- Basic CRUD operations
- Secondary indexing patterns
- Reference usage
- Batch operations
- Proof generation and verification
Building from Source
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Clone repository
git clone https://github.com/dashevo/grovedb.git
cd grovedb
# Build
cargo build --release
# Run tests
cargo test
# Run benchmarks
cargo bench
Debug Visualization
GroveDB includes a web-based visualizer for debugging:
use std::sync::Arc;
let db = Arc::new(GroveDb::open("db")?);
db.start_visualizer(10000); // Port 10000
// Visit http://localhost:10000 in your browser
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development Setup
- Fork the repository
- Create a feature branch
- Write tests for new functionality
- Ensure all tests pass
- Submit a pull request
Testing
# Run all tests
cargo test
# Run specific test
cargo test test_name
# Run with verbose output
cargo test -- --nocapture
License
GroveDB is licensed under the MIT License. See LICENSE for details.
Acknowledgments
GroveDB implements groundbreaking concepts from cryptographic database research:
Academic Foundation
- Database Outsourcing with Hierarchical Authenticated Data Structures - The seminal work by Etemad & Küpçü that introduced hierarchical authenticated data structures for efficient multi-index queries
- Merkle Trees - Ralph Merkle's foundational work on cryptographic hash trees
- AVL Trees - Adelson-Velsky and Landis's self-balancing binary search tree algorithm
Key Innovation
GroveDB realizes the vision of hierarchical authenticated data structures by implementing a forest of Merkle AVL trees (Merk), where each tree can contain other trees. This solves the fundamental limitation of flat authenticated structures - enabling efficient queries on any index while maintaining cryptographic proofs throughout the hierarchy.
Special thanks to the Dash Core Group and all contributors who have helped make this theoretical concept a production reality.