3 unstable releases
Version | Released |
---|---|
0.2.0 | Nov 2, 2023 |
0.1.1 | Nov 1, 2023 |
0.1.0 | Nov 1, 2023 |
#1266 in Filesystem
10KB
read_chunks
This crate provides an extension to types implementing Read that allows them to read data in large chunks until the end of the file, similar to how slice::chunks works.
Licensing
This crate is dual-licensed as MIT OR Apache-2.0 so that it stays compatible with the license of the Rust standard library.
lib.rs:
A crate that provides a read_chunks extension to types implementing std::io::Read (including unsized ones).
Motivation
Sometimes you want to read a file to the end for processing without holding the entire file in memory. The bytes iterator is sometimes the answer, but it yields one byte at a time, so it cannot be used when you wish to process larger chunks of data at once, for example with SIMD.
Calling read repeatedly to get a chunk until the end is tedious: it may return significantly fewer bytes than you expected (slowing down bulk processing) or encounter a recoverable error, and handling that yourself is a chore.
A more correct approach may be to use read_exact for that purpose, as it guarantees the whole chunk. The problem is that at the end of the file you lose the data that was read, because read_exact leaves the buffer contents unspecified at EOF.
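The read_exact limitation can be seen with a small standard-library-only sketch. The function name full_chunks_via_read_exact is hypothetical and exists only for this illustration; it is not part of this crate.

```rust
use std::io::{Cursor, ErrorKind, Read};

/// Hypothetical helper for illustration only (not part of this crate).
/// Reads `data` in `chunk_size` pieces with `read_exact` and returns
/// (number of full chunks read, whether reading stopped with UnexpectedEof).
fn full_chunks_via_read_exact(data: &[u8], chunk_size: usize) -> (usize, bool) {
    // A zero-length buffer would make `read_exact` succeed forever.
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut reader = Cursor::new(data);
    let mut buf = vec![0u8; chunk_size];
    let mut full = 0;
    loop {
        match reader.read_exact(&mut buf) {
            Ok(()) => full += 1,
            // On UnexpectedEof the contents of `buf` are unspecified,
            // so any trailing partial chunk cannot be recovered from it.
            Err(e) => return (full, e.kind() == ErrorKind::UnexpectedEof),
        }
    }
}
```

With 10 bytes of input and a chunk size of 4, only two full chunks are delivered; the last 2 bytes are consumed from the reader but cannot be relied on in the buffer.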
The method implemented in this crate addresses both problems: it guarantees the full buffer size requested whenever it can, handles recoverable errors, and at the end of the Read stream simply returns a final, smaller buffer before returning None to signal that the end of the stream was detected. That is to say, you always get the full buffer length until the last chunk, where you get a tail, similar to slice::chunks but for a Read.
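The behaviour described above can be approximated with the standard library alone. The following is a minimal sketch, not this crate's actual implementation or API: fill_chunk and chunk_lengths are hypothetical names, and the sketch assumes that the only recoverable error worth retrying is ErrorKind::Interrupted.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper (not this crate's API): fills `buf` as far as possible.
/// Returns the number of bytes written into `buf`; the result is smaller than
/// `buf.len()` only when the end of the stream was reached.
fn fill_chunk<R: Read + ?Sized>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break,       // end of stream
            Ok(n) => filled += n, // short read: keep filling
            Err(e) if e.kind() == ErrorKind::Interrupted => continue, // retry
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical driver: records the length of every chunk to show the
/// "full chunks, then one shorter tail" shape.
fn chunk_lengths<R: Read + ?Sized>(reader: &mut R, chunk_size: usize) -> io::Result<Vec<usize>> {
    let mut buf = vec![0u8; chunk_size];
    let mut lengths = Vec::new();
    loop {
        let n = fill_chunk(reader, &mut buf)?;
        if n == 0 {
            break; // nothing left to read
        }
        lengths.push(n);
        if n < buf.len() {
            break; // this was the tail
        }
    }
    Ok(lengths)
}
```

For a 10-byte input and a chunk size of 4, chunk_lengths reports 4, 4, 2: every chunk is full except the tail, and no data is lost at the end of the stream.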
Usage
Simply add use read_chunks::ReadExt; to your module and use the new read_chunks method, which should appear on any type implementing Read.
Standard Library Inclusion
This crate was written because it is useful to me for hashing files incrementally with SIMD-optimized hashing functions like blake3. It may be proposed for inclusion in the Rust standard library if it is seen as generally useful and people agree with the design. For this reason, the API may break in order to prototype what could work best for the standard library.
In particular, a read_chunks_exact API that mirrors slice::chunks_exact may be desirable, giving the remainder through a separate function and asserting that the length of the buffer in the main iterator remains constant.
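To make the idea concrete, here is one possible shape such an API could take, written against the standard library only. It is purely a hypothetical sketch: for_each_exact_chunk does not exist in this crate, and a callback plus a returned remainder is just one of several designs that would mirror slice::chunks_exact.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical sketch, not an API of this crate: calls `on_chunk` with every
/// full `chunk_size`-byte chunk and returns the shorter remainder separately.
fn for_each_exact_chunk<R, F>(
    reader: &mut R,
    chunk_size: usize,
    mut on_chunk: F,
) -> io::Result<Vec<u8>>
where
    R: Read + ?Sized,
    F: FnMut(&[u8]),
{
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut buf = vec![0u8; chunk_size];
    loop {
        let mut filled = 0;
        while filled < buf.len() {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => {
                    // End of stream: whatever was filled is the remainder.
                    buf.truncate(filled);
                    return Ok(buf);
                }
                Ok(n) => filled += n,
                Err(e) if e.kind() == ErrorKind::Interrupted => {} // retry
                Err(e) => return Err(e),
            }
        }
        // The callback only ever sees chunks of exactly `chunk_size` bytes.
        on_chunk(&buf);
    }
}
```

With this shape the processing code can assume a constant chunk length, which suits fixed-width SIMD loops, while the caller handles the remainder once at the end.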
It is also an open question whether the return type should be &[u8] or &mut [u8]. Currently a &mut [u8] is returned because the implementation allows it, but whether that makes sense as an API is unknown at this time.