13 releases

0.8.2 Nov 8, 2021
0.8.0 Apr 8, 2021
0.7.4 Mar 8, 2021
0.7.3 Aug 11, 2020
0.5.4 Jul 14, 2019

#624 in Encoding


589 downloads per month
Used in sonnerie

MIT license

87KB
2K SLoC

rust-shardio


Library for out-of-memory (external) sorting of large datasets that need to be processed in multiple map / sort / reduce passes.

You write a stream of items of type T, which must implement Serialize and Deserialize, to a ShardWriter. The items are buffered, sorted according to a customizable sort key, then serialized to disk in chunks with serde + lz4, while an index records the position and key range of each chunk. You then use a ShardReader to stream through the items in a selected interval of the key space, in sorted order.
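
The buffer / sort / chunk / index flow described above can be sketched with the standard library alone. This is a toy model, not the shardio API: the names ToyShardWriter, ToyShardReader, Chunk, send, finish and iter_range are hypothetical, keys are assumed to be u64, and chunks stay in memory instead of being serialized to disk with serde + lz4.

```rust
// Toy, std-only model of the writer/reader idea. NOT the shardio API.
// Assumption: chunks are kept in memory rather than compressed on disk.

/// One sorted chunk plus the key range it covers (the index entry).
struct Chunk<T> {
    min_key: u64,
    max_key: u64,
    items: Vec<T>, // sorted by key
}

/// Buffers incoming items and flushes them as sorted chunks.
struct ToyShardWriter<T, F: Fn(&T) -> u64> {
    key_fn: F,
    buffer: Vec<T>,
    buffer_cap: usize,
    chunks: Vec<Chunk<T>>,
}

impl<T, F: Fn(&T) -> u64> ToyShardWriter<T, F> {
    fn new(buffer_cap: usize, key_fn: F) -> Self {
        assert!(buffer_cap > 0, "buffer capacity must be positive");
        ToyShardWriter { key_fn, buffer: Vec::new(), buffer_cap, chunks: Vec::new() }
    }

    /// Add one item; flush a chunk once the buffer is full.
    fn send(&mut self, item: T) {
        self.buffer.push(item);
        if self.buffer.len() >= self.buffer_cap {
            self.flush();
        }
    }

    /// Sort the buffered items by key and record the chunk's key range.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let mut items = std::mem::take(&mut self.buffer);
        let key_fn = &self.key_fn;
        items.sort_by_key(|it| key_fn(it));
        let min_key = key_fn(&items[0]);
        let max_key = key_fn(&items[items.len() - 1]);
        self.chunks.push(Chunk { min_key, max_key, items });
    }

    /// Flush whatever is left and hand the chunks to a reader.
    fn finish(mut self) -> ToyShardReader<T, F> {
        self.flush();
        ToyShardReader { key_fn: self.key_fn, chunks: self.chunks }
    }
}

/// Answers range queries using the per-chunk key ranges.
struct ToyShardReader<T, F: Fn(&T) -> u64> {
    key_fn: F,
    chunks: Vec<Chunk<T>>,
}

impl<T: Clone, F: Fn(&T) -> u64> ToyShardReader<T, F> {
    /// Return all items with lo <= key < hi, in sorted key order.
    fn iter_range(&self, lo: u64, hi: u64) -> Vec<T> {
        let mut out: Vec<T> = Vec::new();
        for chunk in &self.chunks {
            // Index check: skip chunks whose key range cannot overlap [lo, hi).
            if chunk.max_key < lo || chunk.min_key >= hi {
                continue;
            }
            // Each chunk is sorted, so binary search finds the matching slice.
            let start = chunk.items.partition_point(|it| (self.key_fn)(it) < lo);
            let end = chunk.items.partition_point(|it| (self.key_fn)(it) < hi);
            out.extend_from_slice(&chunk.items[start..end]);
        }
        // A streaming reader would typically merge the chunks lazily;
        // sorting the selected items is the simplest stand-in here.
        out.sort_by_key(|it| (self.key_fn)(it));
        out
    }
}
```

In this sketch, a writer created with a buffer capacity of 3 and fed seven items produces three chunks, and a range query only touches the chunks whose recorded key range overlaps the requested interval.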

See Docs for API and examples.

Note: Enable the 'full-test' feature in Release mode to turn on some long-running stress tests.

Dependencies

~2.3–3.5MB
~69K SLoC