4 releases
Uses new Rust 2024
| 0.1.3 | Mar 21, 2026 |
|---|---|
| 0.1.2 | Mar 17, 2026 |
| 0.1.1 | Mar 11, 2026 |
| 0.1.0 | Feb 23, 2026 |
#1862 in Parser implementations
Used in 4 crates
1MB
19K
SLoC
OxiGDAL Zarr
Pure Rust implementation of the Zarr v2/v3 storage specification for cloud-optimized, chunked, N-dimensional arrays. Support for multiple storage backends (filesystem, S3, HTTP, memory) and compression codecs (Zstd, Gzip, LZ4). Part of the OxiGDAL geospatial data access library.
Features
- Zarr Versions: Full support for Zarr v2 and v3 specifications
- Storage Backends: Filesystem, S3-compatible, HTTP, and in-memory storage
- Compression Codecs: Zstd, Gzip, LZ4, and Blosc filters
- Data Filters: Shuffle, Delta, and Scale-offset filters for preprocessing
- Async I/O: Async support for cloud storage backends with Tokio runtime
- Parallel Operations: Multi-threaded chunk processing with Rayon
- LRU Caching: Optional chunk caching for frequent access patterns
- Consolidation: Metadata consolidation for optimized access
- Sharding: Zarr v3 sharding extension for improved access patterns
- Pure Rust: 100% Pure Rust with no C/Fortran dependencies
- No Unwrap Policy: All fallible operations return
Result<T, E>with descriptive errors
Installation
Add to your Cargo.toml:
[dependencies]
oxigdal-zarr = "0.1.0"
Features
Enable optional features for specific capabilities:
[dependencies]
# Filesystem support
oxigdal-zarr = { version = "0.1.0", features = ["filesystem"] }
# Cloud storage with S3
oxigdal-zarr = { version = "0.1.0", features = ["s3", "async"] }
# HTTP remote access
oxigdal-zarr = { version = "0.1.0", features = ["http"] }
# Compression codecs
oxigdal-zarr = { version = "0.1.0", features = ["zstd", "gzip", "lz4"] }
# Parallel processing
oxigdal-zarr = { version = "0.1.0", features = ["parallel"] }
# Chunk caching
oxigdal-zarr = { version = "0.1.0", features = ["cache"] }
# All features
oxigdal-zarr = { version = "0.1.0", features = ["filesystem", "s3", "http", "async", "zstd", "gzip", "lz4", "shuffle", "delta", "scale-offset", "parallel", "cache", "v2", "v3"] }
Quick Start
Reading a Zarr Array
use oxigdal_zarr::{ZarrReader, FilesystemStore};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Open a Zarr v2 array from filesystem
let store = FilesystemStore::open("data.zarr")?;
let reader = ZarrReader::open_v2(store)?;
// Get array metadata
println!("Shape: {:?}", reader.shape());
println!("Chunks: {:?}", reader.chunks());
println!("Data type: {:?}", reader.dtype());
// Read a specific chunk
let chunk_coords = vec![0, 0, 0];
let chunk_data = reader.read_chunk(&chunk_coords)?;
println!("Chunk size: {} bytes", chunk_data.len());
Ok(())
}
Writing a Zarr Array
use oxigdal_zarr::{ZarrWriter, FilesystemStore};
use oxigdal_zarr::metadata::v2::{ArrayMetadataV2, DType};
use oxigdal_zarr::codecs::Compressor;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create a new Zarr v2 array on filesystem
let store = FilesystemStore::create("output.zarr")?;
let metadata = ArrayMetadataV2 {
shape: vec![100, 200, 300],
chunks: vec![10, 20, 30],
dtype: DType::Float32,
compressor: Some(Compressor::Zstd { level: 3 }),
fill_value: 0.0,
order: 'C',
filters: None,
};
let mut writer = ZarrWriter::create_v2(store, metadata)?;
// Write a chunk of data
let chunk_coords = vec![0, 0, 0];
let chunk_data: Vec<f32> = vec![0.0; 10 * 20 * 30];
writer.write_chunk(&chunk_coords, &chunk_data)?;
writer.finalize()?;
Ok(())
}
Usage
Storage Backends
Filesystem Storage
Access Zarr arrays stored on local or network filesystems:
use oxigdal_zarr::FilesystemStore;
// Open existing array
let store = FilesystemStore::open("/path/to/array.zarr")?;
// Create new array
let store = FilesystemStore::create("/path/to/new.zarr")?;
S3-Compatible Storage
Access arrays stored in cloud object storage (requires s3 feature):
use oxigdal_zarr::S3Storage;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let store = S3Storage::new("my-bucket", "path/to/array.zarr").await?;
// Use store with async operations
Ok(())
}
HTTP Access
Read arrays from HTTP/HTTPS URLs (requires http feature):
use oxigdal_zarr::HttpStorage;
let store = HttpStorage::new("https://example.com/data.zarr")?;
let reader = ZarrReader::open_v3(store)?;
In-Memory Storage
Perfect for testing and temporary data:
use oxigdal_zarr::MemoryStore;
let store = MemoryStore::new();
// Use like any other store
Compression Codecs
The library supports multiple compression algorithms:
Zstd (Default)
Fast, high-compression ratio. Recommended for most use cases:
use oxigdal_zarr::codecs::Compressor;
let compressor = Compressor::Zstd { level: 3 };
Gzip
Standard deflate compression, widely compatible:
let compressor = Compressor::Gzip { level: 6 };
LZ4
Very fast compression, lower ratio:
let compressor = Compressor::LZ4 { acceleration: 1 };
Chunk Management
Reading Chunks
use oxigdal_zarr::ZarrReader;
let reader = /* ... */;
// Read specific chunk by coordinates
let chunk = reader.read_chunk(&[0, 0])?;
// Read slice across multiple chunks
let slice = reader.read_slice(&[0..100, 50..150])?;
Chunk Grid Configuration
use oxigdal_zarr::ChunkGrid;
// Regular grid (same chunk size everywhere)
let grid = ChunkGrid::regular(vec![64, 64, 64])?;
// Irregular grid with different chunk sizes
let grid = ChunkGrid::irregular(chunk_shapes)?;
Advanced Usage - Parallel Processing
Enable parallel chunk processing for bulk operations (requires parallel feature):
use oxigdal_zarr::{ZarrWriter, FilesystemStore};
let store = FilesystemStore::create("output.zarr")?;
let mut writer = ZarrWriter::create_v2(store, metadata)?;
// Parallel chunk writing (internally uses Rayon)
// Multiple chunks can be written in parallel
Advanced Usage - Caching
Enable LRU caching to improve performance for repeated chunk access:
use oxigdal_zarr::{CachingStorage, FilesystemStore};
let inner_store = FilesystemStore::open("data.zarr")?;
let cached_store = CachingStorage::new(inner_store, 100)?; // Cache 100 chunks
Advanced Usage - Metadata Consolidation
For datasets with many arrays, consolidate metadata for faster access:
use oxigdal_zarr::consolidate_metadata;
let store = FilesystemStore::open("group.zarr")?;
let consolidated = consolidate_metadata(&store)?;
Advanced Usage - Zarr v3
Use the latest Zarr v3 specification with enhanced features:
use oxigdal_zarr::ZarrV3Reader;
let store = FilesystemStore::open("data.zarr")?;
let reader = ZarrV3Reader::open(store)?;
Error Handling
All operations return Result<T, ZarrError> following the "no unwrap" policy:
use oxigdal_zarr::{ZarrError, Result};
fn safe_operation() -> Result<()> {
let reader = /* ... */;
let chunk = reader.read_chunk(&[0, 0])?;
Ok(())
}
fn with_error_handling() -> std::result::Result<(), String> {
safe_operation().map_err(|e| match e {
ZarrError::Io(io_err) => format!("I/O error: {}", io_err),
ZarrError::Metadata(meta_err) => format!("Metadata error: {}", meta_err),
ZarrError::Codec(codec_err) => format!("Codec error: {}", codec_err),
_ => format!("Other error: {}", e),
})
}
API Overview
| Module | Description |
|---|---|
reader |
Reading Zarr arrays (v2, v3) |
writer |
Writing Zarr arrays (v2, v3) |
storage |
Backend storage implementations |
metadata |
Array and group metadata structures |
codecs |
Compression and encoding implementations |
filters |
Data filtering (shuffle, delta, scale-offset) |
chunk |
Chunk management and grid definitions |
dimension |
Dimension and shape utilities |
consolidation |
Metadata consolidation for groups |
sharding |
Zarr v3 sharding extension |
error |
Error types and conversions |
Examples
See the examples directory for complete working examples:
- create_test_zarr_samples: Generate realistic geospatial Zarr datasets for demonstration
Run examples with:
# Create sample Zarr datasets
cargo run --example create_test_zarr_samples --features="v3,zstd,filesystem"
Performance Characteristics
OxiGDAL Zarr is optimized for cloud-native geospatial data access:
- Chunk Size: Optimal performance with 64-256 MB chunks
- Compression: Zstd provides best compression/speed ratio
- Parallel I/O: ~N× speedup with N parallel chunk operations
- Caching: LRU cache provides 10-100× speedup for repeated access
Benchmark on modern hardware (RTX 4090, 1TB NVMe):
- Sequential read: ~5 GB/s
- Random chunk access: ~200K chunks/s
- Compression overhead: ~10-20% for Zstd level 3
Pure Rust
This library is 100% Pure Rust with no C/Fortran dependencies:
- All compression algorithms use pure Rust implementations (flate2, zstd-rs)
- No external system libraries required
- Cross-platform: Linux, macOS, Windows, WASM, embedded systems
- Safe by default: No unsafe code except where explicitly required for performance
Zarr Format Support
Zarr v2
Full compliance with Zarr v2 specification:
- Array metadata (
.zarray) - Group metadata (
.zgroup) - All standard codecs
- Attributes and custom metadata
Zarr v3
Early support for Zarr v3 specification:
- Enhanced metadata format
- Sharding extension
- Flexible codec pipelines
- Improved interoperability
Documentation
- API Docs: Full API documentation at docs.rs
- Zarr Spec: Official Zarr specification
- OxiGDAL Docs: OxiGDAL documentation
- Examples: See examples directory
Integration with OxiGDAL Ecosystem
OxiGDAL Zarr is fully integrated with the OxiGDAL geospatial library:
use oxigdal::Dataset;
use oxigdal_zarr::FilesystemStore;
// Convert between formats
let dataset = Dataset::open("data.tif")?;
// ... process ...
// Export to Zarr
Contributing
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch
- Follow COOLJAPAN policies:
- No
unwrap()calls in production code - 100% Pure Rust (use feature gates for C dependencies)
- Comprehensive error handling
- File size limit of 2000 lines (use
splitrsfor refactoring)
- No
- Write tests and examples
- Run
cargo test --all-featuresandcargo clippy - Submit a pull request
License
Licensed under the Apache License, Version 2.0 (LICENSE or http://www.apache.org/licenses/LICENSE-2.0).
Related Projects
- OxiGDAL - Geospatial data access library
- OxiGDAL GeoTIFF - Cloud-optimized GeoTIFF driver
- OxiGDAL NetCDF - NetCDF data format support
- OxiGDAL GeoParquet - GeoParquet columnar format
- GDAL - Geospatial Data Abstraction Library (reference implementation)
- Zarr - Official Zarr specification
COOLJAPAN Ecosystem
OxiGDAL Zarr is part of the COOLJAPAN ecosystem of pure Rust libraries:
- OxiGDAL: Geospatial data access
- OxiBLAS: Pure Rust BLAS operations
- OxiFFT: Fast Fourier Transform (replaces rustfft)
- OxiCode: Serialization (bincode alternative)
- SciRS2: Scientific computing primitives
- NumRS2: Numerical computing (NumPy-like)
Maintained by: COOLJAPAN OU (Team Kitasan) Repository: https://github.com/cool-japan/oxigdal Issue Tracker: https://github.com/cool-japan/oxigdal/issues Discussions: https://github.com/cool-japan/oxigdal/discussions
Dependencies
~6–32MB
~347K SLoC