#file-storage #version-control #single-file #data-file #backup-file #delta #control-system

weave

Weave delta file storage. Inspired by the storage format of SCCS, this crate allows multiple revisions of a file to be stored efficiently in a single file.

5 unstable releases

0.3.1 Feb 16, 2022
0.3.0 Mar 21, 2021
0.2.2 Sep 8, 2019
0.2.0 Mar 7, 2018
0.1.0 Aug 6, 2017

Used in rsure

MIT license

38KB
734 lines

Weave File Support

Testing

Many of the tests compare the crate's output with that generated by the sccs command. On many Linux distros, a compatible version can be found in the cssc package.


lib.rs:

Weave deltas, inspired by SCCS.

The SCCS revision control system is one of the oldest source code management systems (1973). Although many of its concepts are quite dated in these days of git, the underlying "weave" delta format it used turns out to be a good way of representing multiple versions of data that differ only in parts.

This package implements weave-based storage of "plain text", where plain text consists of lines of printable UTF-8 characters separated by newlines.

The format is similar to SCCS, but it does not carry over some relatively poor design decisions from SCCS, such as putting a checksum at the top of the file, using limited-size fields for values such as the number of lines in a file, and using 2-digit years. However, the main body of the weave file, the part that describes inserts and deletes, is the same, which allows us to test this implementation by comparing its output with the storage of sccs.
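To make the insert/delete body concrete, here is a minimal sketch of reading one revision out of an SCCS-style weave. It is not this crate's parser. It assumes the classic SCCS control lines (a 0x01 byte followed by `I`, `D`, or `E` and a delta serial number), assumes a strictly linear history in which revision N includes every delta up to N, and ignores headers and any other control lines. The names `extract`, `Block`, and `SAMPLE` are made up for illustration.

```rust
/// Kind of block opened by a control line.
#[derive(Clone, Copy, PartialEq)]
enum Block {
    Insert,
    Delete,
}

/// A tiny hand-written weave: delta 1 adds three lines, delta 2 replaces "beta".
const SAMPLE: &str =
    "\x01I 1\nalpha\n\x01D 2\nbeta\n\x01E 2\n\x01I 2\nBETA\n\x01E 2\ngamma\n\x01E 1\n";

/// Return the lines of revision `target` (assumption: linear history).
fn extract(weave: &str, target: u32) -> Vec<String> {
    // Blocks that are currently open, in the order they were opened.
    let mut active: Vec<(Block, u32)> = Vec::new();
    let mut out = Vec::new();

    for line in weave.lines() {
        if let Some(ctl) = line.strip_prefix('\x01') {
            let mut parts = ctl.split_whitespace();
            let kind = parts.next();
            let serial: Option<u32> = parts.next().and_then(|s| s.parse().ok());
            match (kind, serial) {
                (Some("I"), Some(n)) => active.push((Block::Insert, n)),
                (Some("D"), Some(n)) => active.push((Block::Delete, n)),
                (Some("E"), Some(n)) => {
                    // Close the most recently opened block with this serial.
                    if let Some(pos) = active.iter().rposition(|&(_, s)| s == n) {
                        active.remove(pos);
                    }
                }
                _ => {} // Unknown control line: skipped in this sketch.
            }
            continue;
        }

        // A text line is visible when every enclosing insert belongs to the
        // target revision or earlier, and no enclosing delete does.
        let visible = active.iter().all(|&(kind, s)| match kind {
            Block::Insert => s <= target,
            Block::Delete => s > target,
        });
        if visible {
            out.push(line.to_string());
        }
    }
    out
}
```

With the sample body, `extract(SAMPLE, 1)` yields alpha, beta, gamma, and `extract(SAMPLE, 2)` yields alpha, BETA, gamma. Every revision is recovered in a single pass over the same file, which is what makes the weave attractive for data that differs only in parts.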

Weave files are written using NewWeave, which works like a regular file writer. The file itself has a small amount of surrounding metadata, but is otherwise mostly just the contents of the initial file.
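As a rough illustration of why a new weave is mostly just the contents of the initial file, the sketch below wraps the input lines in a single insert block for delta 1. It is not the NewWeave implementation: the function name `initial_weave` is hypothetical, the surrounding metadata is left out because its layout is not described here, and the control-line syntax is the classic SCCS one (0x01 followed by `I` or `E` and the serial number).

```rust
/// Build the body of a one-revision weave from plain text.
/// Assumption: header metadata is omitted; only the insert block is produced.
fn initial_weave(contents: &str) -> String {
    // Every line of the first revision is inserted by delta 1.
    let mut out = String::from("\x01I 1\n");
    for line in contents.lines() {
        out.push_str(line);
        out.push('\n');
    }
    // Close the insert block opened above.
    out.push_str("\x01E 1\n");
    out
}
```

The only overhead beyond the original text is the pair of control lines, plus whatever header the real format adds.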

Adding a delta to a weave file is done with the DeltaWriter. It is also written to like a regular file; the DeltaWriter::close method then extracts a base revision and uses the diff command to write a new version of the weave. The close method creates several temporary files in the process.

The weave data is stored using a NamingConvention, a trait that manages a related collection of files and temp files. SimpleNaming is a basic implementation of this that has a base name, a backup file, and some temporary files. The data in the file can be compressed.
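The idea of a naming convention can be sketched as a small trait that maps one base name to a main file, a backup file, and numbered temporary files. This is an illustration only: `SketchNaming`, `SketchSimpleNaming`, their methods, the `.bak` extension, and the `.gz` suffix for compressed data are all assumptions made for this example, not the crate's actual NamingConvention or SimpleNaming API.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical trait: names for a related group of files.
trait SketchNaming {
    fn main_file(&self) -> PathBuf;
    fn backup_file(&self) -> PathBuf;
    fn temp_file(&self, index: u32) -> PathBuf;
}

/// Hypothetical simple convention: `<dir>/<base>.<ext>[.gz]`.
struct SketchSimpleNaming {
    dir: PathBuf,
    base: String,
    ext: String,
    compressed: bool,
}

impl SketchSimpleNaming {
    fn new<P: AsRef<Path>>(dir: P, base: &str, ext: &str, compressed: bool) -> Self {
        SketchSimpleNaming {
            dir: dir.as_ref().to_path_buf(),
            base: base.to_string(),
            ext: ext.to_string(),
            compressed,
        }
    }

    /// Join directory, base name, and extension; add ".gz" when requested
    /// and compression is enabled (an assumption of this sketch).
    fn named(&self, ext: &str, allow_gz: bool) -> PathBuf {
        let gz = if allow_gz && self.compressed { ".gz" } else { "" };
        self.dir.join(format!("{}.{}{}", self.base, ext, gz))
    }
}

impl SketchNaming for SketchSimpleNaming {
    fn main_file(&self) -> PathBuf {
        self.named(&self.ext, true)
    }

    fn backup_file(&self) -> PathBuf {
        self.named("bak", true)
    }

    fn temp_file(&self, index: u32) -> PathBuf {
        // Temporary files are kept uncompressed in this sketch.
        self.named(&index.to_string(), false)
    }
}
```

Keeping all path decisions behind one trait means the reading and writing code never builds file names itself, so a different layout can be swapped in without touching it.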
