7 releases
new 0.2.5 | Nov 16, 2024 |
---|---|
0.2.4 | Nov 16, 2024 |
0.1.0 | Nov 16, 2024 |
#307 in Data structures
388 downloads per month
31KB
227 lines
Chunk
A Rust implementation of a persistent data structure that provides O(1) append and concatenation operations through structural sharing.
Features
- O(1) Append Operations: Add elements to your chunk in constant time
- O(1) Concatenation: Combine two chunks efficiently
- Immutable/Persistent: All operations create new versions while preserving the original
- Memory Efficient: Uses structural sharing via reference counting
- Safe Rust: Implemented using 100% safe Rust
Theoretical Background
This implementation is inspired by the concepts presented in Hinze and Paterson's work on Finger Trees[^1], though simplified for our specific use case. While our implementation differs in structure, it shares similar performance goals and theoretical foundations.
Relationship to Finger Trees
Finger Trees are a functional data structure that supports:
- Access to both ends in amortized constant time
- Concatenation in logarithmic time
- Persistence through structural sharing
Our Chunk
implementation achieves similar goals through a simplified approach:
- We use
Append
nodes for constant-time additions - The
Concat
variant enables efficient concatenation Rc
(Reference Counting) provides persistence and structural sharing
Like Finger Trees, our structure can be viewed as an extension of Okasaki's implicit deques[^2], but optimized for our specific use cases. While Finger Trees offer a more general-purpose solution with additional capabilities, our implementation focuses on providing:
- Simpler implementation
- More straightforward mental model
- Specialized performance characteristics for append/concat operations
Performance Trade-offs
While Finger Trees achieve logarithmic time for concatenation, our implementation optimizes for constant-time operations through lazy evaluation. This means:
- Append and concatenation are always O(1)
- The cost is deferred to when we need to materialize the sequence (via
as_vec()
) - Memory usage grows with the number of operations until materialization
This trade-off is particularly beneficial in scenarios where:
- Multiple transformations are chained
- Not all elements need to be materialized
- Structural sharing can be leveraged across operations
Installation
Add this to your Cargo.toml
:
[dependencies]
tailcall-chunk = "0.1.0"
Quick Start
use chunk::Chunk;
// Create a new chunk and append some elements
let chunk1 = Chunk::default()
.append(1)
.append(2);
// Create another chunk
let chunk2 = Chunk::default()
.append(3)
.append(4);
// Concatenate chunks in O(1) time
let combined = chunk1.concat(chunk2);
// Convert to vector when needed
assert_eq!(combined.as_vec(), vec![1, 2, 3, 4]);
Detailed Usage
Working with Custom Types
use chunk::Chunk;
#[derive(Debug, PartialEq)]
struct Person {
name: String,
age: u32,
}
let people = Chunk::default()
.append(Person {
name: "Alice".to_string(),
age: 30
})
.append(Person {
name: "Bob".to_string(),
age: 25
});
// Access elements
let people_vec = people.as_vec();
assert_eq!(people_vec[0].name, "Alice");
assert_eq!(people_vec[1].name, "Bob");
Memory Efficiency
The Chunk
type uses structural sharing through reference counting (Rc
), which means:
- Appending or concatenating chunks doesn't copy the existing elements
- Memory is automatically freed when no references remain
- Multiple versions of the data structure can coexist efficiently
use chunk::Chunk;
let original = Chunk::default().append(1).append(2);
let version1 = original.clone().append(3); // Efficient cloning
let version2 = original.clone().append(4); // Both versions share data
Performance Characteristics
Operation | Time Complexity | Space Complexity |
---|---|---|
new() |
O(1) | O(1) |
append() |
O(1) | O(1) |
concat() |
O(1) | O(1) |
transform() |
O(1) | O(1) |
transform_flatten() |
O(1) | O(1) |
as_vec() |
O(n) | O(n) |
clone() |
O(1) | O(1) |
Implementation Details
The Chunk<A>
type is implemented as an enum with four variants:
Empty
: Represents an empty chunkAppend
: Represents a single element appended to another chunkConcat
: Represents the concatenation of two chunksTransformFlatten
: Represents a lazy transformation and flattening of elements
The data structure achieves its performance characteristics through:
- Structural sharing using
Rc
- Lazy evaluation of concatenation and transformations
- Immutable operations that preserve previous versions
Contributing
We welcome contributions! Here's how you can help:
- Fork the repository
- Create your feature branch (
git checkout -b feature/amazing-feature
) - Commit your changes (
git commit -am 'Add some amazing feature'
) - Push to the branch (
git push origin feature/amazing-feature
) - Open a Pull Request
Please make sure to:
- Update documentation
- Add tests for new features
- Follow the existing code style
- Update the README.md if needed
License
This project is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
References
[^1]: Ralf Hinze and Ross Paterson. "Finger Trees: A Simple General-purpose Data Structure", Journal of Functional Programming 16(2):197-217, 2006. [^2]: Chris Okasaki. "Purely Functional Data Structures", Cambridge University Press, 1998.