13 releases

Version | Date |
---|---|
0.8.2 | Nov 8, 2021 |
0.8.0 | Apr 8, 2021 |
0.7.4 | Mar 8, 2021 |
0.7.3 | Aug 11, 2020 |
0.5.4 | Jul 14, 2019 |
#624 in Encoding
589 downloads per month
Used in sonnerie
87KB, 2K SLoC
rust-shardio
Library for out-of-memory sorting of large datasets which need to be processed in multiple map / sort / reduce passes.
You write a stream of items of type `T` implementing `Serialize` and `Deserialize` to a `ShardWriter`. The items are buffered, sorted according to a customizable sort key, then serialized to disk in chunks with serde + lz4, while maintaining an index of the position and key range of each chunk. You use a `ShardReader` to stream through the items in a selected interval of the key space, in sorted order.
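The write/index/read cycle described above can be illustrated with a small std-only sketch. This is not shardio's API: `SketchWriter`, `Chunk`, and `read_range` are hypothetical names, the key is assumed to be a `u64`, and chunks stay in memory instead of being serialized to disk with serde + lz4. It only shows the idea of sorting buffered items into chunks, recording each chunk's key range, and merging the overlapping chunks on read.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// One sorted chunk plus the key range it covers.
/// Stand-in for a compressed on-disk block and its index entry.
struct Chunk {
    min_key: u64,
    max_key: u64,
    items: Vec<(u64, String)>,
}

/// Hypothetical writer: buffers items and emits a sorted chunk when the buffer fills.
struct SketchWriter {
    buffer: Vec<(u64, String)>,
    chunk_size: usize,
    chunks: Vec<Chunk>,
}

impl SketchWriter {
    fn new(chunk_size: usize) -> Self {
        assert!(chunk_size > 0);
        SketchWriter { buffer: Vec::new(), chunk_size, chunks: Vec::new() }
    }

    fn send(&mut self, key: u64, value: &str) {
        self.buffer.push((key, value.to_string()));
        if self.buffer.len() >= self.chunk_size {
            self.flush();
        }
    }

    /// Sort the buffered items by key and record the chunk's key range.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let mut items = std::mem::take(&mut self.buffer);
        items.sort_by_key(|(k, _)| *k);
        let min_key = items[0].0;
        let max_key = items[items.len() - 1].0;
        self.chunks.push(Chunk { min_key, max_key, items });
    }

    fn finish(mut self) -> Vec<Chunk> {
        self.flush();
        self.chunks
    }
}

/// Return items with start <= key < end in sorted order.
/// Chunks whose key range misses the interval are skipped via the index;
/// the remaining chunks are combined with a k-way merge.
fn read_range(chunks: &[Chunk], start: u64, end: u64) -> Vec<(u64, String)> {
    let selected: Vec<&Chunk> = chunks
        .iter()
        .filter(|c| c.max_key >= start && c.min_key < end)
        .collect();

    // Min-heap of (key, chunk index, position within chunk).
    let mut heap = BinaryHeap::new();
    for (ci, c) in selected.iter().enumerate() {
        // First position in this chunk whose key is >= start.
        let pos = c.items.partition_point(|(k, _)| *k < start);
        if pos < c.items.len() && c.items[pos].0 < end {
            heap.push(Reverse((c.items[pos].0, ci, pos)));
        }
    }

    let mut out = Vec::new();
    while let Some(Reverse((key, ci, pos))) = heap.pop() {
        out.push((key, selected[ci].items[pos].1.clone()));
        let next = pos + 1;
        if next < selected[ci].items.len() && selected[ci].items[next].0 < end {
            heap.push(Reverse((selected[ci].items[next].0, ci, next)));
        }
    }
    out
}
```

In this sketch you would call `send` for each item, call `finish` to obtain the chunk list, and pass that list to `read_range` with the key interval of interest. The real crate streams results from disk rather than collecting them into a `Vec`; see the Docs for its actual types and signatures.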
See Docs for API and examples.
Note: Enable the 'full-test' feature in Release mode to turn on some long-running stress tests.
Dependencies: ~2.3–3.5MB, ~69K SLoC