3 unstable releases

0.2.0 Nov 4, 2024
0.1.1 Oct 12, 2024
0.1.0 Oct 12, 2024

#89 in Simulation


182 downloads per month

MIT/Apache

32KB
606 lines

Head-prunable file

Normal files cannot be pruned (truncated) from the beginning up to some middle position. An HPFile uses a sequence of small files to simulate one big virtual file, so pruning from the beginning simply means deleting the first several small files.
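The mapping from a position in the virtual file to a small file can be sketched as below. This is a minimal illustration, not the crate's code: SEGMENT_SIZE, locate and prunable_segments are hypothetical names, and a fixed size for every small file is an assumption.

```rust
/// Hypothetical size of each small file, chosen only for illustration.
const SEGMENT_SIZE: u64 = 1024;

/// Map a position in the virtual file to
/// (index of the small file, offset inside that small file).
fn locate(pos: u64) -> (u64, u64) {
    (pos / SEGMENT_SIZE, pos % SEGMENT_SIZE)
}

/// Indices of the small files that can be deleted when everything before
/// `pos` is pruned. `first_index` is the oldest small file still on disk.
/// Only files lying entirely before `pos` are included.
fn prunable_segments(first_index: u64, pos: u64) -> std::ops::Range<u64> {
    let end = pos / SEGMENT_SIZE;
    first_index..end.max(first_index)
}
```

With these assumptions, pruning up to position 2500 deletes small files 0 and 1, while file 2 stays because it still holds live data from offset 452 onward.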

An HPFile can only be read and appended to. Any byteslice written to it is immutable.

To append a new byteslice to an HPFile, use the append function, which returns the start position of this byteslice. Later, pass this start position to read_at to read the byteslice back. The position passed to read_at must be the beginning of a previously written byteslice, never a point in its middle. Do NOT try to read only the latter part of a byteslice (from a middle point to its end).
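The append/read_at contract can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the hpfile API: ToyLog is a hypothetical type, and the 4-byte length prefix is an assumed framing used here only to show why a read must start at a position returned by append.

```rust
use std::convert::TryFrom;

/// In-memory stand-in for an append-only log (hypothetical, not the crate's type).
struct ToyLog {
    data: Vec<u8>,
}

impl ToyLog {
    fn new() -> Self {
        ToyLog { data: Vec::new() }
    }

    /// Append a byteslice and return the position where it starts.
    fn append(&mut self, bytes: &[u8]) -> u64 {
        let start = self.data.len() as u64;
        // Assumed framing: 4-byte little-endian length, then the payload.
        self.data.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
        self.data.extend_from_slice(bytes);
        start
    }

    /// Read back the byteslice that starts at `pos`.
    /// `pos` must be a value returned by `append`; any other position would
    /// interpret payload bytes as a length and yield garbage or `None`.
    fn read_at(&self, pos: u64) -> Option<&[u8]> {
        let p = usize::try_from(pos).ok()?;
        let len_bytes = self.data.get(p..p.checked_add(4)?)?;
        let len = u32::from_le_bytes(<[u8; 4]>::try_from(len_bytes).ok()?) as usize;
        let body = p + 4;
        self.data.get(body..body.checked_add(len)?)
    }
}
```

A caller keeps the value returned by append (for example in an index) and later hands exactly that value to read_at.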

An HPFile can also be truncated, which discards the content from a given position to the end of the file. During truncation, several small files may be removed and one small file may be shortened.
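Which small files a truncation touches can be worked out with simple arithmetic. The sketch below assumes fixed-size small files; SEGMENT_SIZE and plan_truncate are hypothetical names used only for illustration.

```rust
/// Hypothetical size of each small file, chosen only for illustration.
const SEGMENT_SIZE: u64 = 1024;

/// Plan a truncation of the virtual file to `new_len` bytes.
/// `last_index` is the index of the newest small file on disk.
/// Returns (small files to delete entirely,
///          optional (small file to shorten, its new length)).
fn plan_truncate(last_index: u64, new_len: u64) -> (Vec<u64>, Option<(u64, u64)>) {
    // The small file that contains the new end of the virtual file.
    let keep_index = new_len / SEGMENT_SIZE;
    let keep_len = new_len % SEGMENT_SIZE;
    // Every small file after it is removed as a whole.
    let removed: Vec<u64> = (keep_index + 1..=last_index).collect();
    // The file holding the new end is shortened, if it exists on disk.
    // When `new_len` falls on a boundary, it is shortened to zero bytes.
    let shorten = if keep_index <= last_index {
        Some((keep_index, keep_len))
    } else {
        None
    };
    (removed, shorten)
}
```

For example, with small files 0 through 5 on disk, truncating to position 2500 removes files 3, 4 and 5 and shortens file 2 to 452 bytes.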

An HPFile can serve many reader threads. If a reader thread only reads random positions, plain read_at is enough. If a reader tends to read many adjacent byteslices in sequence, it can take advantage of spatial locality by using read_at_with_pre_reader, which uses a PreReader to read large chunks of data from the file and cache them. Each reader thread can have its own PreReader. A PreReader cannot be shared between different HPFiles.
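The idea behind a pre-reader can be sketched as a per-reader chunk cache. This is an illustrative model only: ToyPreReader, CHUNK_SIZE and the fetches counter are hypothetical, a byte slice stands in for the file, and the crate's actual PreReader may be organized differently.

```rust
/// Hypothetical chunk size; a real pre-reader would use a much larger value.
const CHUNK_SIZE: usize = 16;

/// Per-reader cache of one large chunk (hypothetical, not the crate's type).
struct ToyPreReader {
    start: usize,   // position in the file where the cached chunk begins
    chunk: Vec<u8>, // cached bytes
    fetches: usize, // number of simulated large reads from the file
}

impl ToyPreReader {
    fn new() -> Self {
        ToyPreReader { start: 0, chunk: Vec::new(), fetches: 0 }
    }

    /// Return `len` bytes of `file` starting at `pos`, served from the cached
    /// chunk when it covers the whole range, otherwise after refilling it.
    fn read(&mut self, file: &[u8], pos: usize, len: usize) -> Option<Vec<u8>> {
        let end = pos.checked_add(len)?;
        if end > file.len() {
            return None;
        }
        let cached = pos >= self.start && end <= self.start + self.chunk.len();
        if !cached {
            // Cache miss: fetch one large chunk beginning at `pos`.
            let chunk_end = file.len().min(pos.saturating_add(CHUNK_SIZE.max(len)));
            self.chunk = file[pos..chunk_end].to_vec();
            self.start = pos;
            self.fetches += 1;
        }
        let off = pos - self.start;
        Some(self.chunk[off..off + len].to_vec())
    }
}
```

In this model, sequential reads of adjacent ranges hit the cached chunk and cost one large fetch per CHUNK_SIZE bytes, whereas random reads would refill the chunk almost every time, which is why plain reads suit that pattern better.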

An HPFile can serve only one writer thread. The writer thread must own a write buffer that collects small pieces of written data into a single large write to the underlying OS file, which avoids the cost of many write syscalls. This write buffer must be provided when calling append and flush. It is owned by the writer thread rather than by the HPFile, because the HPFile itself is meant to be shared among many reader threads.
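The ownership split described above, a shared file object plus a buffer owned by the single writer, can be sketched as follows. This is a simplified model and not the crate's implementation: SharedLog, BUFFER_LIMIT and the os_writes counter are hypothetical, and a Mutex-protected Vec stands in for the OS file.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical threshold at which the buffer is written out.
const BUFFER_LIMIT: usize = 8;

/// Shared log object (hypothetical). Methods take `&self`, so it can be
/// shared among threads; the write buffer lives with the writer instead.
struct SharedLog {
    storage: Mutex<Vec<u8>>, // stands in for the underlying OS file
    os_writes: AtomicUsize,  // counts simulated write syscalls
}

impl SharedLog {
    fn new() -> Self {
        SharedLog { storage: Mutex::new(Vec::new()), os_writes: AtomicUsize::new(0) }
    }

    /// Append `bytes` through the writer-owned `buffer`; returns the start position.
    fn append(&self, bytes: &[u8], buffer: &mut Vec<u8>) -> u64 {
        let flushed = self.storage.lock().unwrap().len();
        // The new data starts after everything flushed plus everything buffered.
        let start = (flushed + buffer.len()) as u64;
        buffer.extend_from_slice(bytes);
        if buffer.len() >= BUFFER_LIMIT {
            self.flush(buffer);
        }
        start
    }

    /// Write the buffered bytes out as one large write and empty the buffer.
    fn flush(&self, buffer: &mut Vec<u8>) {
        if buffer.is_empty() {
            return;
        }
        self.storage.lock().unwrap().extend_from_slice(buffer);
        self.os_writes.fetch_add(1, Ordering::Relaxed);
        buffer.clear();
    }
}
```

Because the buffer is passed in as `&mut Vec<u8>`, only the thread that owns it can append, while the log object itself needs nothing more than a shared reference.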

TempDir is used in unit tests. It is a temporary directory created during a unit test function and deleted when that test function exits.

Dependencies

~1.2–5.5MB
~24K SLoC