3 releases
| Version | Released |
|---|---|
| 0.1.2 | Jan 5, 2026 |
| 0.1.1 | Jan 4, 2026 |
| 0.1.0 | Jan 4, 2026 |
copia
Pure Rust rsync-style file synchronization library
Why copia?
- Embeddable: Use rsync's delta-transfer algorithm as a library, not a subprocess
- Pure Rust: 100% safe Rust, no unsafe code, fully auditable
- Zero C Dependencies: No OpenSSL, no librsync, no external binaries
- Async Support: First-class tokio integration for non-blocking I/O
- Memory Safe: No buffer overflows, no use-after-free, guaranteed by Rust
Performance
┌────────────────────────────┬────────────┬────────────┬──────────────────┐
│ Scenario │ rsync (ms) │ copia (ms) │ Result │
├────────────────────────────┼────────────┼────────────┼──────────────────┤
│ 1KB identical │ 43.55 │ 0.05 │ Library wins │
│ 100KB identical │ 43.23 │ 0.12 │ Library wins │
│ 1MB identical │ 43.40 │ 0.33 │ Library wins │
│ 1MB 5% changed │ 44.72 │ 4.54 │ Library wins │
│ 10MB identical │ 43.68 │ 3.92 │ Library wins │
│ 10MB 1% changed │ 46.91 │ 43.05 │ Comparable │
│ 10MB 100% different │ 52.84 │ 43.88 │ Comparable │
└────────────────────────────┴────────────┴────────────┴──────────────────┘
⚠️ IMPORTANT: rsync times include ~40ms process spawn overhead.
This benchmark compares copia as a library vs rsync as a subprocess.
For embedded/library use cases, copia avoids this overhead entirely.
For CLI-to-CLI comparison, performance is comparable on large files.
When copia shines:
- Embedded in applications (no process spawn overhead)
- High-frequency sync operations (no spawn cost to pay on every call)
- Small file synchronization (overhead dominates)
- When you need async I/O or Rust integration
When rsync is fine:
- One-off large file transfers (spawn overhead negligible)
- Shell scripts and CLI workflows
- When you need rsync's full feature set (permissions, links, etc.)
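To check how much of a subprocess call is spawn overhead on your own machine, a small timing helper is enough. The following is a minimal sketch using only the standard library; `mean_time` and `run_once` are hypothetical helper names, not part of copia, and the result will vary by platform.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Average wall-clock time of `runs` invocations of `op`.
/// Stops at the first error and returns it.
pub fn mean_time<F>(runs: u32, mut op: F) -> io::Result<Duration>
where
    F: FnMut() -> io::Result<()>,
{
    assert!(runs > 0, "need at least one run");
    let start = Instant::now();
    for _ in 0..runs {
        op()?;
    }
    Ok(start.elapsed() / runs)
}

/// Spawn `program` with `args`, wait for it to exit, and discard its output.
/// The exit status is ignored; only a failure to spawn is reported.
pub fn run_once(program: &str, args: &[&str]) -> io::Result<()> {
    Command::new(program)
        .args(args)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|_| ())
}
```

For example, `mean_time(20, || run_once("rsync", &["--version"]))` approximates the cost of starting rsync without transferring anything, assuming rsync is on the `PATH`.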
Installation
Add to your Cargo.toml:
[dependencies]
copia = "0.1"
For async support:
[dependencies]
copia = { version = "0.1", features = ["async"] }
CLI Installation
cargo install copia --features cli
Quick Start
Library Usage
use copia::{CopiaSync, Sync};
use std::io::Cursor;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create sync engine
    let sync = CopiaSync::with_block_size(2048);
    // Generate signature from basis (old) file
    let basis = b"original file content here";
    let signature = sync.signature(Cursor::new(basis.as_slice()))?;
    // Compute delta from source (new) file
    let source = b"modified file content here";
    let delta = sync.delta(Cursor::new(source.as_slice()), &signature)?;
    // Apply delta to reconstruct the new file
    let mut output = Vec::new();
    sync.patch(Cursor::new(basis.as_slice()), &delta, &mut output)?;
    assert_eq!(output, source);
    Ok(())
}
Async Usage
use copia::async_sync::AsyncCopiaSync;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let sync = AsyncCopiaSync::with_block_size(2048);
    // Sync source file to destination
    let result = sync.sync_files("source.txt", "dest.txt").await?;
    println!("Matched: {} bytes", result.bytes_matched);
    println!("Literal: {} bytes", result.bytes_literal);
    println!("Compression: {:.1}%", result.compression_ratio() * 100.0);
    Ok(())
}
CLI Usage
# Sync a file
copia sync source.txt dest.txt
# Generate signature
copia signature file.txt -o file.sig
# Compute delta
copia delta newfile.txt file.sig -o file.delta
# Apply patch
copia patch oldfile.txt file.delta -o newfile.txt
How It Works
Copia implements the rsync delta-transfer algorithm:
- Signature Generation: The basis file is divided into fixed-size blocks. For each block, a rolling checksum (Adler-32 variant) and strong hash (BLAKE3) are computed.
- Delta Computation: The source file is scanned with a sliding window. When the rolling checksum matches a known block, the strong hash verifies the match. Matching blocks become "copy" operations; non-matching data becomes "literal" operations.
- Patch Application: The delta is applied to the basis file, copying matched blocks and inserting literal data to reconstruct the source.
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Basis File  │────▶│  Signature  │     │ Source File │
└─────────────┘     └──────┬──────┘     └──────┬──────┘
                           │                   │
                           ▼                   ▼
                       ┌──────────────────────────┐
                       │    Delta Computation     │
                       └────────────┬─────────────┘
                                    │
                                    ▼
                       ┌──────────────────────────┐
                       │ Delta: [Copy, Literal..] │
                       └────────────┬─────────────┘
                                    │
             ┌─────────────┐        │
             │ Basis File  │────────┤
             └─────────────┘        ▼
                       ┌──────────────────────────┐
                       │    Patch Application     │
                       └────────────┬─────────────┘
                                    │
                                    ▼
                       ┌──────────────────────────┐
                       │   Reconstructed Source   │
                       └──────────────────────────┘
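The three steps can be sketched end to end with only the standard library. This illustrates the general rsync-style approach and is not copia's actual code. The names `Op`, `Signature`, `signature`, `delta`, and `patch` are hypothetical, `DefaultHasher` stands in for BLAKE3 (it is not cryptographic), the weak checksum is the classic 16-bit rsync sum rather than copia's variant, and only full-size basis blocks are indexed, so a trailing partial block is sent as literal data.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

#[derive(Debug, PartialEq)]
pub enum Op {
    Copy(usize),      // copy the basis block with this index
    Literal(Vec<u8>), // insert these bytes verbatim
}

pub struct Signature {
    block_size: usize,
    // weak checksum -> candidate (strong hash, block index) pairs
    blocks: HashMap<u32, Vec<(u64, usize)>>,
}

/// Weak checksum parts: byte sum `a` and position-weighted sum `b`, both mod 2^16.
fn weak_parts(block: &[u8]) -> (u32, u32) {
    let n = block.len() as u32;
    let (mut a, mut b) = (0u32, 0u32);
    for (i, &byte) in block.iter().enumerate() {
        a = a.wrapping_add(byte as u32);
        b = b.wrapping_add((n - i as u32).wrapping_mul(byte as u32));
    }
    (a & 0xffff, b & 0xffff)
}

/// Stand-in strong hash (assumption: copia uses BLAKE3 here).
fn strong(block: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(block);
    hasher.finish()
}

pub fn signature(basis: &[u8], block_size: usize) -> Signature {
    assert!(block_size > 0);
    let mut blocks: HashMap<u32, Vec<(u64, usize)>> = HashMap::new();
    for (index, block) in basis.chunks_exact(block_size).enumerate() {
        let (a, b) = weak_parts(block);
        blocks.entry((b << 16) | a).or_default().push((strong(block), index));
    }
    Signature { block_size, blocks }
}

pub fn delta(source: &[u8], sig: &Signature) -> Vec<Op> {
    let n = sig.block_size;
    let (mut ops, mut literal, mut pos) = (Vec::new(), Vec::new(), 0usize);
    let mut sums: Option<(u32, u32)> = None; // rolled checksum for the window at `pos`
    while pos + n <= source.len() {
        let window = &source[pos..pos + n];
        let (a, b) = sums.unwrap_or_else(|| weak_parts(window));
        // Cheap weak lookup first; the strong hash is computed only on a weak hit.
        let hit = sig.blocks.get(&((b << 16) | a)).and_then(|candidates| {
            let s = strong(window);
            candidates.iter().find(|(h, _)| *h == s).map(|&(_, index)| index)
        });
        if let Some(index) = hit {
            if !literal.is_empty() {
                ops.push(Op::Literal(std::mem::take(&mut literal)));
            }
            ops.push(Op::Copy(index));
            pos += n;
            sums = None;
        } else {
            literal.push(source[pos]);
            // Slide the window by one byte, updating the sums in O(1).
            sums = (pos + n < source.len()).then(|| {
                let (out, inc) = (source[pos] as u32, source[pos + n] as u32);
                let a2 = a.wrapping_sub(out).wrapping_add(inc) & 0xffff;
                let b2 = b.wrapping_sub((n as u32).wrapping_mul(out)).wrapping_add(a2) & 0xffff;
                (a2, b2)
            });
            pos += 1;
        }
    }
    literal.extend_from_slice(&source[pos..]);
    if !literal.is_empty() {
        ops.push(Op::Literal(literal));
    }
    ops
}

pub fn patch(basis: &[u8], ops: &[Op], block_size: usize) -> Vec<u8> {
    let mut out = Vec::new();
    for op in ops {
        match op {
            Op::Copy(i) => out.extend_from_slice(&basis[i * block_size..(i + 1) * block_size]),
            Op::Literal(bytes) => out.extend_from_slice(bytes),
        }
    }
    out
}
```

Calling `patch(basis, &delta(source, &signature(basis, n)), n)` returns the bytes of `source`. The delta stays small when most of `source` can be expressed as `Copy` operations.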
Implementation Details
| Component | Implementation |
|---|---|
| Rolling Checksum | Adler-32 variant with lazy modulo (normalize every 5000 rolls) |
| Strong Hash | BLAKE3 (32 bytes, cryptographic) |
| Hash Table | FxHashMap for fast u32 key lookups |
| Parallelism | Rayon for multi-core signature generation |
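One way to read "lazy modulo" is that the two running sums are kept in wide integers and reduced only occasionally instead of on every roll. The sketch below shows that idea under stated assumptions: the modulus 65521 is the standard Adler-32 prime, the interval of 5000 comes from the table above, and the `LazyRolling` type and its exact formulas are hypothetical, not copia's `RollingChecksum`.

```rust
/// Largest prime below 2^16, the modulus used by Adler-32.
const MODULUS: u64 = 65_521;
/// Reduce the sums once per this many rolls.
const NORMALIZE_EVERY: u32 = 5_000;

pub struct LazyRolling {
    a: u64,      // congruent to the byte sum of the window (mod MODULUS)
    b: u64,      // congruent to the position-weighted sum (mod MODULUS)
    window: u64, // window length in bytes
    rolls_since_normalize: u32,
}

impl LazyRolling {
    /// Compute the sums for an initial window from scratch.
    pub fn new(block: &[u8]) -> Self {
        let n = block.len() as u64;
        let (mut a, mut b) = (0u64, 0u64);
        for (i, &byte) in block.iter().enumerate() {
            a += byte as u64;
            b += (n - i as u64) * byte as u64;
        }
        Self { a: a % MODULUS, b: b % MODULUS, window: n, rolls_since_normalize: 0 }
    }

    /// Slide the window one byte: drop `outgoing`, append `incoming`.
    pub fn roll(&mut self, outgoing: u8, incoming: u8) {
        // Adding a multiple of MODULUS before subtracting keeps the unsigned
        // sums from underflowing without changing their residue.
        self.a = self.a + MODULUS - outgoing as u64 + incoming as u64;
        self.b = self.b + self.window * MODULUS - self.window * outgoing as u64 + self.a;
        self.rolls_since_normalize += 1;
        if self.rolls_since_normalize == NORMALIZE_EVERY {
            // The only place the comparatively expensive `%` runs while rolling.
            self.a %= MODULUS;
            self.b %= MODULUS;
            self.rolls_since_normalize = 0;
        }
    }

    /// Fully reduced 32-bit checksum: `b` in the high half, `a` in the low half.
    pub fn digest(&self) -> u32 {
        (((self.b % MODULUS) as u32) << 16) | (self.a % MODULUS) as u32
    }
}
```

With a 2048-byte window the unreduced sums stay far below the `u64` limit across 5000 rolls, so deferring the reduction does not change the digest.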
API Reference
Core Types
- CopiaSync - Main synchronization engine
- Signature - Block signatures for a file
- Delta - Difference between two files
- RollingChecksum - Adler-32 variant rolling checksum
- StrongHash - BLAKE3 cryptographic hash
Async Types
- AsyncCopiaSync - Async synchronization engine
- SyncResult - Statistics from sync operation
Feature Flags
| Feature | Description |
|---|---|
| async | Enable tokio async support |
| cli | Build command-line interface |
Benchmarks
Run benchmarks yourself:
# Compare against rsync (note: includes process spawn overhead)
cargo bench --bench rsync_comparison --features async
# Run criterion benchmarks (algorithm-only, no spawn overhead)
cargo bench --bench benchmarks
Comparison with rsync
| Feature | copia | rsync |
|---|---|---|
| Language | Pure Rust | C |
| Memory Safety | Guaranteed | Manual |
| Use as Library | Native | Subprocess only |
| Async I/O | Native | No |
| Process Overhead | None | ~40ms spawn |
| Permissions/ACLs | Not yet | Yes |
| Symbolic Links | Not yet | Yes |
| Compression | Not yet | Yes (zlib) |
License
MIT License - see LICENSE for details.
Contributing
Contributions welcome! Please read our contributing guidelines and submit PRs to the main branch.
Acknowledgments
- rsync algorithm by Andrew Tridgell and Paul Mackerras
- BLAKE3 team for the fast cryptographic hash
- Rust community for excellent tooling