1 unstable release
new 0.5.0 | Dec 21, 2024 |
---|---|
0.4.0 |
|
0.3.0 |
|
0.2.0 |
|
0.1.0 |
|
#370 in Testing
72 downloads per month
52KB
1K
SLoC
buf_read_write
This crate contains a a buffered io implementation, a combination of std::io::BufReader
and std::io::BufWriter
. This allows buffered input and output, for example with a std::fs::File
.
The difference between this and regular BufWriter and BufReader is that a single instance can buffer both reads and writes to the same backing object (for example a File).
This crate is meant to be used with file-like objects, and requires the backing implementation
to implement std::io::Seek
.
Advantages:
- Seeking is supported (and required of the backing IO object), and does not invalidate the buffer.
- Reads and writes share a single buffer. Reads can be satisfied from uncommitted writes.
- No unsafe code
- Extensive test suite
- Custom random chaos testing
- Exhaustive testing of all 4-item sequences of operations read, write and seek with limited offsets/sizes.
- Cargo mutants testing
- Performance overhead within 50% overhead of regular BufReader/BufWriter, for small reads/writes.
- Documented
Use cases:
- Manipulating data in on-disk formats, without doing unnecessary IO operations
- General random access IO on files or file-like objects.
Limitations:
- buf_read_write is not a general disk caching library. The buffer is always contiguous.
How to avoid unnecessary IO
Let's say you have a large data file stored on disk. The file contains many chunks, each with its own header.
You wish to traverse the file, and rewrite a few fields in each header.
You can do the following sequence of operations:
- Seek to the first header
- Read this header entirely into memory
- Seek to a sub-field in the header
- Write this particular field (perhaps only a few bytes)
- Flush the BufStream.
Only one IO-operation will be emitted (this is of course the point of buffered IO). Note however, that this is true even if the written sub-fields are not adjacent - as long as they fit within the buffer size.
If you do not first read the file, and then write non-contiguous regions, there will be an IO operation for
every such non-contiguous region. The reason is that buf_read_write
does not support non-contiguous buffers,
and thus cannot represent a non-contiguous set of updates.
Note, a consequence of this is that the number of bytes actually written can be (much) larger than the number of actually modified bytes. If this is undesirable, make sure to call 'flush' after every write.