2 unstable releases
0.2.0 | Feb 10, 2025 |
---|---|
0.1.0 | Feb 9, 2025 |
#464 in Data structures
242 downloads per month
75KB
1.5K
SLoC
ASTrie (Adaptive Segmented Trie)
A hybrid data structure that efficiently supports fast lookups, insertions, and range queries with an adaptive mechanism to balance memory usage and performance.
Key Features
1. Trie & B+ Tree Hybrid
- Uses trie-based prefix matching for fast key lookups.
- Segments the trie into B+ tree nodes when prefixes grow beyond a threshold, improving memory efficiency.
2. Adaptive Segmentation
- If a trie branch gets too deep, it automatically converts into a B+ tree to avoid memory fragmentation.
- If a B+ tree node becomes too sparse, it collapses back into a trie for optimized storage.
3. Parallelization with Rust’s Ownership Model
- Designed with lock-free concurrency using atomic operations (e.g., Arc<RwLock>).
- Parallel insertions & lookups without performance degradation.
4. Efficient Range Queries
- Unlike traditional tries, the B+ tree nodes allow ordered key traversal, making range queries (e.g., key-value databases) much more efficient.
5. Cache-aware Structure
- Designed to maximize CPU cache locality, reducing cache misses.
- Prefetching mechanisms to speed up queries.
Use Cases
- Key-Value Stores (faster and more memory-efficient alternative to HashMap)
- Network Routing Tables (efficient prefix-matching for IP lookups)
- Inverted Indexes for Search Engines
- Auto-complete systems (efficient prefix-based lookups)
- Time-series Databases (optimized range queries)
High-Level Algorithm
Core Data Structure
1. Hybrid Structure
- Root starts as a trie node
- Nodes can be either trie nodes or B+ tree nodes
- Dynamic conversion between types based on usage patterns
2. Node Types
enum NodeType {
Trie(TrieNode), // For sparse/prefix-heavy data
BTree(BTreeNode) // For dense data ranges
}
Core Operations
1. Insertion
insert(key, value):
1. Start at root node
2. For each node encountered:
If Trie Node:
- If depth > TRIE_DEPTH_THRESHOLD:
→ Convert to B+ tree (convert_to_btree)
- Otherwise:
→ Navigate using key bytes
→ Create child nodes as needed
If B+ Tree Node:
- If leaf node:
→ Insert key-value pair
→ If too full: split node
→ If too sparse: convert to trie (convert_to_trie)
- If internal node:
→ Navigate to correct child
2. Lookup
get(key):
1. Start at root node
2. For each node encountered:
If Trie Node:
→ Use key bytes to navigate
→ Return value if found at leaf
If B+ Tree Node:
→ Binary search for key
→ Navigate to child if internal node
→ Return value if found at leaf
3. Range Query
range(start_key, end_key):
1. Find path to start_key
2. For each node in path:
If Trie Node:
→ Collect all values in range via DFS
If B+ Tree Node:
→ Use B+ tree's natural ordering
→ Scan leaf nodes using links
Adaptive Mechanisms
1. Trie → B+ Tree Conversion
convert_to_btree(trie_node):
1. Recursively collect all key-value pairs
2. Sort pairs by key
3. Create B+ tree leaf node
4. If too many keys:
→ Split into multiple nodes
→ Create parent nodes as needed
2. B+ Tree → Trie Conversion
convert_to_trie(btree_node):
1. Create empty trie root
2. For each key-value pair:
→ Convert key to byte path
→ Create trie path
→ Store value at leaf
3. Balance if needed
Performance Characteristics
Space Complexity
- Trie: O(k*n) where k is key length
- B+ Tree: O(n) where n is number of items
- Adaptive: O(min(k*n, n)) depending on structure
Time Complexity
1. Insertions
- Best: O(1) for trie
- Worst: O(log n) for B+ tree
- Amortized: O(log n)
2. Lookups
- Best: O(k) for trie where k is key length
- Worst: O(log n) for B+ tree
- Average: O(min(k, log n))
3. Range Queries
- O(log n + m) where m is number of items in range