4 stable releases

new 1.2.0 May 2, 2025
1.1.1 May 12, 2024
1.1.0 May 11, 2024
1.0.0 May 10, 2024

#109 in Compression

215 downloads per month

MIT license

130KB
2.5K SLoC

Exaf

The EXtensible Archiver Format describes an archive file format for compressing and archiving files. It offers an alternative to the well-known zip and 7-zip formats, with extensibility in mind. This reference implementation runs about as fast as GNU tar with Zstandard compression and produces archives of very similar size. Encryption of both metadata and file content is implemented using Argon2id and AES256-GCM, which together provide both data confidentiality and authenticity. See the Encryption section below for more information.

Specification

See the FORMAT.md document for the details on the current format, which specifies Zstandard for compression, and the Argon2id key-derivation function, along with the AES256-GCM cipher, for encryption. Future versions may add support for other algorithms as appropriate.

In short, the file consists of a short header, which may include encryption details, followed by a manifest of the directories, files, and symbolic links that are contained in the compressed block of content that follows it. Each content block may hold many files, up to a predefined total size, and is compressed as a whole using Zstandard. If encryption is used, the manifest and the compressed content block are encrypted with the derived key and a unique nonce. The manifest/content pair can be followed by as many additional pairs as are needed to contain everything that will be written to the archive.
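The grouping of files into manifest/content pairs can be illustrated with a small sketch. This is not the crate's actual code: `PendingFile`, `Bundle`, `plan_bundles`, and the `max_block` parameter are hypothetical names chosen for illustration, and the handling of a file larger than the limit is an assumption (FORMAT.md is the authority on what the real format does).

```rust
/// Hypothetical description of a file waiting to be archived.
#[derive(Debug, Clone, PartialEq)]
struct PendingFile {
    path: String,
    size: u64,
}

/// One manifest/content pair: the manifest lists the entries whose bytes
/// would live in the content block that follows it.
#[derive(Debug, Default)]
struct Bundle {
    manifest: Vec<PendingFile>,
    content_len: u64,
}

/// Group files, in order, into bundles whose content stays within `max_block`.
/// Assumption: a file larger than the limit simply gets a bundle of its own;
/// the real format may handle oversized files differently.
fn plan_bundles(files: &[PendingFile], max_block: u64) -> Vec<Bundle> {
    let mut bundles = Vec::new();
    let mut current = Bundle::default();
    for file in files {
        let would_overflow = current.content_len + file.size > max_block;
        if would_overflow && !current.manifest.is_empty() {
            // Close the current pair and start a new one.
            bundles.push(std::mem::take(&mut current));
        }
        current.content_len += file.size;
        current.manifest.push(file.clone());
    }
    if !current.manifest.is_empty() {
        bundles.push(current);
    }
    bundles
}
```

Calling `plan_bundles(&files, limit)` returns the bundles in archive order. In a real writer, each bundle's content would then be compressed (and optionally encrypted) as one unit before the next pair is started.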

Objectives

First and foremost, the purpose of this project is to satisfy my own needs, and this reference implementation is written in Rust so that I can use it within my own Rust-based applications. If it happens to be useful to others, fantastic, and I would be more than happy to continue developing the format and/or this crate toward that end.

Build and Run

Prerequisites

Running the tests

Unit tests exercise most of the functionality.

cargo test

Creating, listing, extracting archives

Start by creating an archive using the create command. The example below assumes that you have downloaded something interesting into your ~/Downloads directory.

$ cargo run -- create archive.exa ~/Downloads/httpd-2.4.59
...
Added 3138 files to archive.exa

Now that the archive.exa file exists, you can list the contents like so:

$ cargo run -- list archive.exa | head -20
...
httpd-2.4.59/.deps
httpd-2.4.59/.gdbinit
httpd-2.4.59/.gitignore
httpd-2.4.59/ABOUT_APACHE
httpd-2.4.59/Apache-apr2.dsw
httpd-2.4.59/Apache.dsw
httpd-2.4.59/BuildAll.dsp
httpd-2.4.59/BuildBin.dsp
...

Finally, run extract to unpack the contents of the archive into the current directory:

$ cargo run -- extract archive.exa
...
Extracted 3138 files from archive.exa

Code Coverage

Using grcov seems to be the easiest approach at this time.

export RUSTFLAGS="-Cinstrument-coverage"
export LLVM_PROFILE_FILE="exaf_rs-%p-%m.profraw"
cargo clean
cargo build
cargo test
grcov . -s . --binary-path ./target/debug/ -t html --branch --ignore-not-existing -o ./target/debug/coverage/
open target/debug/coverage/index.html

Encryption

With the --password <PASSWD> option to the commands listed above, the archive can be encrypted using a passphrase. A secret key is derived using the Argon2id algorithm and a random salt (which is then stored in the archive header), and each run of content in the archive is encrypted with that secret key and a unique nonce (stored in the header of each manifest) using the AES256-GCM Authenticated Encryption with Associated Data cipher. The encryption covers both the entry metadata and the compressed file content.
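The reason each run gets its own nonce is that AES-GCM loses its confidentiality and authenticity guarantees if a nonce is ever repeated under the same key. The sketch below illustrates only that invariant; it performs no cryptography. `NonceLedger` and `counter_nonce` are hypothetical names, and the counter-based scheme is an assumption made for illustration (the reference implementation may well generate nonces randomly).

```rust
use std::collections::HashSet;

/// Standard AES-GCM nonce length in bytes (96 bits).
const NONCE_LEN: usize = 12;

/// Hypothetical ledger of nonces already used with one derived key.
#[derive(Default)]
struct NonceLedger {
    seen: HashSet<[u8; NONCE_LEN]>,
}

impl NonceLedger {
    /// Record `nonce`; returns an error if it was already used with this key.
    fn claim(&mut self, nonce: [u8; NONCE_LEN]) -> Result<(), String> {
        if self.seen.insert(nonce) {
            Ok(())
        } else {
            Err(format!("nonce {:02x?} reused under the same key", nonce))
        }
    }
}

/// Build a nonce from a counter, one simple way to guarantee uniqueness.
/// Assumption: this is for illustration only, not the crate's actual scheme.
fn counter_nonce(counter: u64) -> [u8; NONCE_LEN] {
    let mut nonce = [0u8; NONCE_LEN];
    // The last 8 bytes hold the big-endian counter; the first 4 stay zero.
    nonce[4..].copy_from_slice(&counter.to_be_bytes());
    nonce
}
```

A writer built along these lines would call `claim` once per manifest/content pair before encrypting it and would refuse to continue if the call failed.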

Prior Art

There are many existing archive formats, many of which have long since fallen out of common use. Those that remain are not without their shortcomings, such as poorly implemented encryption features, or vulnerability to compression factor exploits (zip bomb).

The original motivation for this project came when O announced the Pack file format. They introduced a novel approach to the problem of archiving and compressing files while lamenting the general lack of progress in this area. A Rust version of that program can be found here; its speed and output size are nearly identical to those of this project.

Dependencies

~8MB
~147K SLoC