5 releases

| Version | Date |
|---|---|
| 0.3.1 | Apr 14, 2024 |
| 0.3.0 | Apr 13, 2024 |
| 0.2.3 | Apr 10, 2024 |
| 0.2.2 | Apr 10, 2024 |
| 0.2.1 | Apr 10, 2024 |
🖐️ palmdoc-compression
This is a fast, safe, and correct implementation of PalmDoc-flavored LZ77 compression (primarily used by Amazon ebook formats). In the benchmarks below, compression is roughly 300-400x faster than Calibre's implementation, with a comparable compression ratio.
This crate also includes Calibre's version for comparison and usage if desired, gated behind the `calibre` feature.
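For readers unfamiliar with the format, the following is a minimal sketch of how a PalmDoc stream is commonly decoded. It is an illustration of the byte layout as generally documented for PalmDoc, not this crate's implementation; the function name `palmdoc_decode_sketch` is hypothetical and not part of the crate's API.

```rust
/// Hypothetical helper, NOT part of the palmdoc-compression API.
/// Decodes a PalmDoc-compressed buffer; returns `None` on truncated
/// input or on a back-reference that points before the start of the output.
fn palmdoc_decode_sketch(input: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len() * 2);
    let mut i = 0;
    while i < input.len() {
        let b = input[i];
        i += 1;
        match b {
            // 0x01..=0x08: copy the next `b` bytes verbatim.
            0x01..=0x08 => {
                let n = b as usize;
                let literals = input.get(i..i + n)?;
                out.extend_from_slice(literals);
                i += n;
            }
            // 0x00 and 0x09..=0x7F: a single literal byte.
            0x00 | 0x09..=0x7F => out.push(b),
            // 0x80..=0xBF: two-byte back-reference. After dropping the top
            // two bits, the upper 11 bits are the distance and the lower
            // 3 bits are the length minus 3.
            0x80..=0xBF => {
                let lo = *input.get(i)?;
                i += 1;
                let pair = (((b as usize) << 8) | lo as usize) & 0x3FFF;
                let distance = pair >> 3;
                let length = (pair & 0x07) + 3;
                if distance == 0 || distance > out.len() {
                    return None;
                }
                // Copy byte by byte: the source range may overlap the
                // bytes being appended.
                for _ in 0..length {
                    let byte = out[out.len() - distance];
                    out.push(byte);
                }
            }
            // 0xC0..=0xFF: a space followed by the byte with its high bit cleared.
            0xC0..=0xFF => {
                out.push(b' ');
                out.push(b ^ 0x80);
            }
        }
    }
    Some(out)
}
```

The back-reference window (11-bit distance, lengths of 3 to 10 bytes) is what makes the compressor side an LZ77 match search; the decoder only has to replay the matches.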
Usage
```rust
use palmdoc_compression::{compress, decompress};

let data = b"hello world";
let compressed = compress(data);
let decompressed = decompress(&compressed).unwrap();
assert_eq!(data, decompressed);
```
⚡ Benchmarks
MOBI/AZW files are split into 4KB chunks, so the benchmarks here also use 4KB chunks. Benchmarks were run on an M1 Max.
For a 4KB chunk of lorem ipsum text:
| | Decompression | Compression |
|---|---|---|
| Calibre | 922 MiB/s | 252 KiB/s |
| palmdoc-compression | 797 MiB/s | 109 MiB/s |
For a random 4KB chunk of War and Peace from Project Gutenberg:
| | Decompression | Compression |
|---|---|---|
| Calibre | 1011 MiB/s | 336 KiB/s |
| palmdoc-compression | 876 MiB/s | 103 MiB/s |
(Reproduce with `cargo bench --features calibre`.)
Compression ratio
Ratios were calculated by compressing War and Peace from Project Gutenberg in 4KB chunks.
| | ratio, ⬇️ is better |
|---|---|
| Calibre | 0.56% (theoretical max) |
| palmdoc-compression | 0.57% |
(Reproduce with `cargo run --example ratios --release --features calibre`.)
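As a rough illustration of how a chunked ratio like the one above can be computed, here is a minimal sketch. It is not the crate's `ratios` example, whose details may differ. The helper name `chunked_ratio` is hypothetical, and the compressor is passed in as a closure so the sketch makes no assumption about this crate's exact signatures.

```rust
/// Hypothetical helper, NOT part of the palmdoc-compression API.
/// Compresses `data` in independent 4096-byte chunks (the last chunk may be
/// shorter) and returns total compressed size / original size.
/// Returns `None` for empty input, where the ratio is undefined.
fn chunked_ratio<F>(data: &[u8], compress: F) -> Option<f64>
where
    F: Fn(&[u8]) -> Vec<u8>,
{
    const CHUNK_SIZE: usize = 4096;
    if data.is_empty() {
        return None;
    }
    // Sum the compressed length of every chunk.
    let compressed_len: usize = data
        .chunks(CHUNK_SIZE)
        .map(|chunk| compress(chunk).len())
        .sum();
    Some(compressed_len as f64 / data.len() as f64)
}
```

To measure this crate, the closure would wrap its `compress` function, assuming the output can be converted into a `Vec<u8>`. A lower value means better compression, and a value of 1.0 means no size reduction at all.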
Credits
- LPeter1997 for a clear and understandable Rust LZ77 implementation with hash chains
- Calibre for a reference implementation with tests