#read #extension #io-read #io #chunks #traits #reading

read_chunks

An extension to the Read trait allowing easier chunked reading

3 unstable releases

0.2.0 Nov 2, 2023
0.1.1 Nov 1, 2023
0.1.0 Nov 1, 2023

#1714 in Filesystem

27 downloads per month

MIT/Apache

10KB

read_chunks

This crate provides an extension to types implementing Read that allows them to read data in large chunks until the end of the file, similar to how slice::chunks works.

Licensing

This crate is dual licensed as MIT OR Apache-2.0 so that it stays compatible with the license of the Rust standard library.


lib.rs:

A crate that provides a read_chunks extension to types implementing std::io::Read (including unsized ones).

Motivation

Sometimes you need to read a file to the end in order to process it, but do not want the entire file in memory. Read::bytes can be the answer, but it yields one byte at a time, so it is no help if you wish to process larger chunks of data at once, for example with SIMD.

Calling read repeatedly to fill a chunk until the end of the stream is tedious: it may return significantly fewer bytes than you asked for (slowing down bulk processing) or fail with a recoverable error such as ErrorKind::Interrupted, and handling that yourself is a chore.
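To make the chore concrete, here is a minimal sketch of the hand-written loop using only the standard library. The names fill_chunk and Trickle are hypothetical and exist only for this illustration; this is not the crate's implementation.

```rust
use std::io::{self, ErrorKind, Read};

/// Fills `buf` as far as possible and returns how many bytes were read.
/// A result shorter than `buf.len()` means the reader reached end of stream.
/// (Hypothetical helper for illustration, not part of the crate.)
fn fill_chunk<R: Read + ?Sized>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            // Ok(0) with a non-empty destination signals end of stream.
            Ok(0) => break,
            // A short read is normal; keep going until the buffer is full.
            Ok(n) => filled += n,
            // `Interrupted` is recoverable: retry the call.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical test reader that hands out at most `step` bytes per call,
/// imitating a source that produces short reads.
struct Trickle<'a> {
    data: &'a [u8],
    step: usize,
}

impl Read for Trickle<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let data = self.data;
        let n = self.step.min(buf.len()).min(data.len());
        buf[..n].copy_from_slice(&data[..n]);
        self.data = &data[n..];
        Ok(n)
    }
}
```

Every caller that wants full chunks from a plain Read has to repeat some version of this loop, including the bookkeeping for the partially filled final buffer.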

A more correct approach may be to use read_exact, as it guarantees the whole chunk. The problem is that at the end of the file you lose the data that was read: when read_exact cannot fill the buffer it returns an UnexpectedEof error and leaves the buffer contents unspecified.
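The lost tail is easy to reproduce with the standard library alone. The sketch below uses a hypothetical helper, chunks_via_read_exact, that gathers chunks with read_exact and stops at UnexpectedEof. It is an illustration of the problem, not code from this crate.

```rust
use std::io::{self, ErrorKind, Read};

/// Gathers `chunk_len`-sized chunks from `reader` using only `read_exact`.
/// (Hypothetical helper for illustration.)
fn chunks_via_read_exact<R: Read>(mut reader: R, chunk_len: usize) -> io::Result<Vec<Vec<u8>>> {
    // With an empty buffer `read_exact` always succeeds, so the loop would never end.
    assert!(chunk_len > 0, "chunk length must be non-zero");
    let mut chunks = Vec::new();
    let mut buf = vec![0u8; chunk_len];
    loop {
        match reader.read_exact(&mut buf) {
            Ok(()) => chunks.push(buf.clone()),
            // End of stream: any bytes read into `buf` during this call are
            // unspecified, so a partial final chunk cannot be recovered here.
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => return Ok(chunks),
            Err(e) => return Err(e),
        }
    }
}
```

Fed the ten bytes "abcdefghij" with a chunk length of 4, this returns only "abcd" and "efgh"; the trailing "ij" never reaches the caller.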

The method implemented in this crate addresses both problems. It fills the whole requested buffer whenever it can, handles recoverable errors, and at the end of the Read stream returns one final, smaller buffer before returning None to signal that the end of the stream was reached. In other words, every chunk has the full requested length except the last, which holds the tail, just as with slice::chunks, but for a Read.
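The chunking behaviour described above can be approximated with the standard library alone, which may help when comparing it to slice::chunks. This is only a sketch under stated assumptions: collect_chunks is a hypothetical name, it copies each chunk into its own Vec rather than handing out a borrowed buffer, and it is not the crate's API or implementation.

```rust
use std::io::{self, Read};

/// Collects every chunk of up to `chunk_len` bytes from `reader`.
/// All chunks are full except possibly the last one.
/// (Hypothetical helper for illustration, not the crate's method.)
fn collect_chunks<R: Read>(mut reader: R, chunk_len: usize) -> io::Result<Vec<Vec<u8>>> {
    assert!(chunk_len > 0, "chunk length must be non-zero");
    let mut chunks = Vec::new();
    loop {
        let mut buf = Vec::with_capacity(chunk_len);
        // `take` caps this round at `chunk_len` bytes; `read_to_end` keeps
        // reading until that cap or the end of the stream is reached.
        let got = reader
            .by_ref()
            .take(chunk_len as u64)
            .read_to_end(&mut buf)?;
        if got == 0 {
            // Nothing left: the previous chunk was the last one.
            return Ok(chunks);
        }
        chunks.push(buf);
    }
}
```

For an in-memory byte slice the result matches data.chunks(chunk_len) element for element, including the shorter final chunk.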

Usage

Simply add use read_chunks::ReadExt; to your module and use the new read_chunks method that should appear on any type implementing Read.

Standard Library Inclusion

This crate was written because it is useful to me for hashing files incrementally with SIMD-optimized hashing functions like blake3. It may be proposed for inclusion in the Rust standard library if it is seen as generally useful and people agree with the design. For this reason, the API may break in order to prototype what could work best for the standard library.

In particular, a read_chunks_exact API that mirrors slice::chunks_exact may be desirable: it would give the remainder through a separate function and guarantee that the length of the buffer yielded by the main iterator remains constant.
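For reference, this is how the slice API being mirrored behaves. The sketch only exercises slice::chunks_exact from the standard library; split_exact is a hypothetical wrapper, and no read_chunks_exact exists in this crate as described here.

```rust
/// Splits `data` the way `slice::chunks_exact` does:
/// full chunks of `chunk_len` bytes plus a separately reported remainder.
/// (Hypothetical wrapper for illustration.)
fn split_exact(data: &[u8], chunk_len: usize) -> (Vec<&[u8]>, &[u8]) {
    let iter = data.chunks_exact(chunk_len);
    // The remainder is known up front and is never yielded by the iterator.
    let tail = iter.remainder();
    (iter.collect(), tail)
}
```

Every slice yielded by the iterator has exactly chunk_len bytes, and the leftover bytes are only available through remainder. A Read-based counterpart would presumably need to expose the tail in a similar way once the stream ends.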

It is also an open question whether the return type should be &[u8] or &mut [u8]. Currently a &mut [u8] is returned because the implementation allows it, but whether that makes sense as an API is unknown at this time.

No runtime deps