13 releases
| Version | Released |
|---|---|
| 0.8.2 | Nov 8, 2021 |
| 0.8.0 | Apr 8, 2021 |
| 0.7.4 | Mar 8, 2021 |
| 0.7.3 | Aug 11, 2020 |
| 0.5.4 | Jul 14, 2019 |
#779 in Encoding
551 downloads per month
Used in sonnerie
87KB, 2K SLoC
rust-shardio
Library for out-of-memory sorting of large datasets which need to be processed in multiple map / sort / reduce passes.
You write a stream of items of type `T`, which must implement `Serialize` and `Deserialize`, to a `ShardWriter`. The items are buffered, sorted according to a customizable sort key, then serialized to disk in chunks with serde + lz4, while an index records the position and key range of each chunk. You then use a `ShardReader` to stream through the items in a selected interval of the key space, in sorted order.
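The buffer / sort / chunk / index flow described above can be sketched with the standard library alone. This is a minimal illustration of the idea, not shardio's API: the names `ChunkStore`, `ChunkIndex`, `send`, `flush`, and `read_range` are hypothetical, chunks are kept in memory rather than serialized to disk with serde + lz4, and the final ordering is produced by a plain sort where a real implementation would more likely merge the already-sorted chunks.

```rust
// Illustrative sketch only: this does not use shardio's API. Chunks live in
// memory here, whereas the library writes them to disk.

/// Key range covered by one sorted chunk (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct ChunkIndex {
    min_key: u64,
    max_key: u64,
    chunk: usize, // position of the chunk in `ChunkStore::chunks`
}

/// Buffers items, then flushes them as sorted chunks plus an index entry.
struct ChunkStore {
    buffer: Vec<(u64, String)>,
    buffer_cap: usize,
    chunks: Vec<Vec<(u64, String)>>,
    index: Vec<ChunkIndex>,
}

impl ChunkStore {
    fn new(buffer_cap: usize) -> Self {
        ChunkStore {
            buffer: Vec::new(),
            buffer_cap,
            chunks: Vec::new(),
            index: Vec::new(),
        }
    }

    /// Buffer one item; flush a sorted chunk once the buffer is full.
    fn send(&mut self, key: u64, value: &str) {
        self.buffer.push((key, value.to_string()));
        if self.buffer.len() >= self.buffer_cap {
            self.flush();
        }
    }

    /// Sort the buffered items by key and store them as one chunk,
    /// recording the chunk's key range in the index.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let mut chunk = std::mem::take(&mut self.buffer);
        chunk.sort_by_key(|(k, _)| *k);
        self.index.push(ChunkIndex {
            min_key: chunk[0].0,
            max_key: chunk[chunk.len() - 1].0,
            chunk: self.chunks.len(),
        });
        self.chunks.push(chunk);
    }

    /// Return all items with `start <= key < end`, in sorted order.
    /// Chunks whose key range cannot overlap the interval are skipped.
    fn read_range(&self, start: u64, end: u64) -> Vec<(u64, String)> {
        let mut out = Vec::new();
        for entry in &self.index {
            if entry.max_key < start || entry.min_key >= end {
                continue; // the index lets us skip this chunk entirely
            }
            out.extend(
                self.chunks[entry.chunk]
                    .iter()
                    .filter(|item| item.0 >= start && item.0 < end)
                    .cloned(),
            );
        }
        // A sort keeps the sketch short; merging the sorted chunks would
        // avoid re-sorting.
        out.sort_by_key(|(k, _)| *k);
        out
    }
}
```

In this sketch, a caller would create a `ChunkStore`, call `send` for each item, call `flush` once at the end to write out the last partial buffer, and then call `read_range` for each key interval of interest. The per-chunk key range is what makes interval reads cheap: chunks that fall entirely outside the requested interval are never touched.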
See Docs for API and examples.
Note: Enable the 'full-test' feature in Release mode to turn on some long-running stress tests, for example with `cargo test --release --features full-test`.
Dependencies
~2.3–3.5MB, ~66K SLoC