Parsing and reading of SquashFS archives, on top of any implementor of the `tokio::io::AsyncRead` and `tokio::io::AsyncSeek` traits.
More precisely, this crate provides:

- A `SquashFs` structure to read SquashFS archives on top of any asynchronous reader.
- An implementation of `fuser_async::Filesystem` on `SquashFs`, making it easy to build FUSE filesystems backed by SquashFS archives.
- A `squashfuse-rs` binary for mounting SquashFS images via FUSE, with async IO and multithreaded decompression.
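As background on the on-disk format being parsed, every SquashFS archive starts with a fixed-layout little-endian superblock. The following stand-alone sketch (an illustration of the format, not this crate's actual API; field offsets follow the published SquashFS on-disk layout) decodes a few leading superblock fields from a byte buffer:

```rust
/// A few leading fields of the SquashFS superblock (little-endian on disk).
/// Illustrative parser only; not the API of squashfs-async.
#[derive(Debug, PartialEq)]
struct Superblock {
    magic: u32,      // must be 0x7371_7368 ("hsqs" on disk)
    inode_count: u32,
    block_size: u32,
    compressor: u16, // 1 = gzip, ..., 6 = zstd
}

fn u32_le(b: &[u8], off: usize) -> u32 {
    u32::from_le_bytes(b[off..off + 4].try_into().unwrap())
}

fn u16_le(b: &[u8], off: usize) -> u16 {
    u16::from_le_bytes(b[off..off + 2].try_into().unwrap())
}

fn parse_superblock(b: &[u8]) -> Result<Superblock, &'static str> {
    if b.len() < 22 {
        return Err("buffer too short");
    }
    let sb = Superblock {
        magic: u32_le(b, 0),
        inode_count: u32_le(b, 4),
        block_size: u32_le(b, 12),
        compressor: u16_le(b, 20),
    };
    if sb.magic != 0x7371_7368 {
        return Err("bad magic");
    }
    Ok(sb)
}

fn main() {
    // Hand-built 22-byte prefix: magic "hsqs", 3 inodes,
    // 128 KiB block size, zstd (6).
    let mut buf = vec![0u8; 22];
    buf[0..4].copy_from_slice(&0x7371_7368u32.to_le_bytes());
    buf[4..8].copy_from_slice(&3u32.to_le_bytes());
    buf[12..16].copy_from_slice(&(128 * 1024u32).to_le_bytes());
    buf[20..22].copy_from_slice(&6u16.to_le_bytes());
    let sb = parse_superblock(&buf).unwrap();
    println!("inodes={} block_size={} compressor={}", sb.inode_count, sb.block_size, sb.compressor);
}
```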
## Motivation: multithreaded/async SquashFS reading

The main motivation was to provide a `squashfuse` implementation that could:

- Decompress blocks in parallel.
- Benefit from async I/O when relevant (mostly with a networked backend in mind), with easy integration with `tokio::io`.

To the author's understanding, `squashfuse` uses a single-threaded FUSE loop, and while the kernel driver does multithreaded decompression (when compiled with this option), it doesn't support parallel reads. Note that a patch exists that adds multi-threading to the low-level `squashfuse`; see Benchmarks.
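The parallel-decompression idea can be pictured with plain `std::thread` scoped threads. This is a toy stand-in, not this crate's implementation: SquashFS data blocks are independent, so each can be handed to a thread and the results reassembled in order (the `decompress_block` body is a placeholder transform, where the real code would run gzip/zstd):

```rust
use std::thread;

// Toy stand-in for per-block decompression (a real implementation
// would run gzip/zstd here); blocks are independent of each other.
fn decompress_block(block: &[u8]) -> Vec<u8> {
    block.iter().map(|b| b.wrapping_add(1)).collect()
}

// Decompress independent blocks in parallel, reassembling in order.
fn decompress_parallel(blocks: &[Vec<u8>]) -> Vec<Vec<u8>> {
    thread::scope(|s| {
        let handles: Vec<_> = blocks
            .iter()
            .map(|b| s.spawn(move || decompress_block(b)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let blocks = vec![vec![0u8, 1, 2], vec![9u8, 9]];
    let out = decompress_parallel(&blocks);
    // Output order matches input order even though work ran in parallel.
    assert_eq!(out, vec![vec![1u8, 2, 3], vec![10u8, 10]]);
    println!("{} blocks decompressed", out.len());
}
```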
## Example: squashfs-on-S3

This crate has been used to implement a FUSE filesystem providing transparent access to squashfs images hosted behind an S3 API, using the S3 example in `fuser_async`. With a local MinIO server, throughputs of 365 MB/s and 680 MB/s are achieved for sequential and parallel access, respectively, to zstd1-compressed images with 20 MB files.
## `squashfuse-rs` binary

The `squashfuse-rs` binary is an example that implements an analogue of `squashfuse` using this crate, allowing squashfs images to be mounted via FUSE.
```
$ squashfuse-rs --help
USAGE:
    squashfuse-rs [OPTIONS] <INPUT> <MOUNTPOINT>

ARGS:
    <INPUT>         Input squashfs image
    <MOUNTPOINT>    Mountpoint

OPTIONS:
        --backend <BACKEND>              [default: memmap] [possible values: tokio, async-fs, memmap]
        --cache-mb <CACHE_MB>            Cache size (MB) [default: 100]
    -d, --debug
        --direct-limit <DIRECT_LIMIT>    Limit (B) for fetching small files with direct access [default: 0]
    -h, --help                           Print help information
        --readers <READERS>              Number of readers [default: 4]
```
## Benchmarks
The following benchmarks (see `tests/`) compute the mean and standard deviation of 10 runs, dropping caches after each run, with the following variations:

- Sequential or parallel (with 4 threads) read.
- Compressed archive (gzip and zstd1) or not.
- Different backends for reading the underlying file in `squashfuse-rs`.
- The archives are either:
  - Case A: Containing sixteen random files of 20 MB each, generated by `tests/testdata.rs` (note that given that the files are random, the zstd compression has minimal effect on the data blocks).
  - Case B: Containing three hundred 20 MB images (with a compression ratio of 1.1 with zstd-1).
Entries are normalized by `(case, comp)` pair (i.e. pairs of rows) with respect to the duration of the sequential `squashfuse` run. Numbers smaller than 1 indicate results faster than this baseline. The last 3 columns are `squashfuse-rs` with different backends (`MemMap` being the most performant).
| Case | Read | Comp. | `squashfuse` | `squashfuse_ll_mt` | `MemMap` | `Tokio` | `AsyncFs` |
|---|---|---|---|---|---|---|---|
| A | Seq | - | 1 | 1.16 | 1.01 | 1.93 | 1.56 |
| A | Par | - | 1.8 | 0.5 | 0.54 | 0.8 | 0.76 |
| A | Seq | gzip | 1 | 0.92 | 0.94 | 1.79 | 1.48 |
| A | Par | gzip | 2.07 | 0.46 | 0.51 | 0.75 | 0.71 |
| A | Seq | zstd1 | 1 | 0.96 | 1.04 | 1.78 | 1.47 |
| B | Seq | - | 1 | 0.89 | 0.93 | 2.08 | 1.43 |
| B | Par | - | 1.6 | 0.54 | 0.6 | 0.89 | 0.91 |
| B | Seq | zstd1 | 1 | 0.59 | 0.65 | 0.98 | 0.87 |
| B | Par | zstd1 | 1.07 | 0.3 | 0.35 | 0.3 | 0.54 |
| A | Par | zstd1 | 2.35 | 0.48 | 0.51 | 0.76 | 0.71 |
Smaller numbers are better; numbers smaller than 1 denote an improvement over the baseline
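The normalization above is plain division by the per-`(case, comp)` baseline duration. As a quick illustration of how a table entry is derived (with made-up durations, not the actual measurements):

```rust
// Normalize run durations against the sequential-squashfuse baseline,
// as done for each (case, comp) pair in the table above.
// Durations here are illustrative, not the actual measurements.
fn normalize(baseline_secs: f64, durations: &[f64]) -> Vec<f64> {
    durations.iter().map(|d| d / baseline_secs).collect()
}

fn main() {
    let baseline = 10.0; // sequential squashfuse run, seconds (made up)
    let runs = [10.0, 5.0, 20.0];
    let norm = normalize(baseline, &runs);
    // Entries below 1 are faster than the baseline.
    println!("{:?}", norm); // [1.0, 0.5, 2.0]
}
```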
> [!WARNING]
> These should be updated with the latest versions of the code and of `squashfuse`.
To execute the tests (case A), `cargo` needs to run with root privileges to be able to clear caches between runs, e.g.

```
$ N_RUNS=10 CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' cargo test -r --test main -- --nocapture
```
## Differences with similar crates
- `squashfs` is a work in progress that only supports parsing some structures (superblock, fragment table, uid/gid table).
- `backhand` and this crate were implemented independently at roughly the same time. Some differences are (see also Limitations below):
  - The primary goal of this crate was to allow mounting squashfs images with FUSE, with async IO and multithreaded decompression.
  - `backhand` uses a synchronous `std::io::Read`/`std::io::Seek` backend, while this crate uses a `tokio::io::AsyncRead`/`tokio::io::AsyncSeek` backend.
  - This crate provides caching for decompressed blocks.
  - `backhand` supports write operations, while `squashfs-async` doesn't.
- `squashfs-ng-rs` wraps the C API, while this crate is a pure Rust implementation.
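The decompressed-block caching mentioned above can be pictured as a capacity-bounded map keyed by the block's offset in the archive. This is a simplified FIFO-evicting sketch, not this crate's actual cache (which is sized via `--cache-mb` and may use a different eviction policy):

```rust
use std::collections::{HashMap, VecDeque};

// Simplified cache for decompressed blocks, keyed by the block's byte
// offset in the archive, with FIFO eviction. Illustrative only.
struct BlockCache {
    capacity: usize,
    order: VecDeque<u64>,        // insertion order, for eviction
    blocks: HashMap<u64, Vec<u8>>, // offset -> decompressed bytes
}

impl BlockCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), blocks: HashMap::new() }
    }

    // Return the cached block, decompressing (and possibly evicting) on miss.
    fn get_or_insert_with(&mut self, offset: u64, decompress: impl FnOnce() -> Vec<u8>) -> &[u8] {
        if !self.blocks.contains_key(&offset) {
            if self.blocks.len() == self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.blocks.remove(&oldest);
                }
            }
            self.order.push_back(offset);
            self.blocks.insert(offset, decompress());
        }
        &self.blocks[&offset]
    }
}

fn main() {
    let mut cache = BlockCache::new(2);
    cache.get_or_insert_with(0, || vec![1, 2, 3]);
    cache.get_or_insert_with(4096, || vec![4, 5]);
    // A repeated lookup hits the cache: the decompress closure is not called.
    let hit = cache.get_or_insert_with(0, || unreachable!());
    assert_eq!(hit, &[1, 2, 3]);
    // A third distinct offset evicts the oldest entry (offset 0).
    cache.get_or_insert_with(8192, || vec![6]);
    assert!(!cache.blocks.contains_key(&0));
    println!("cached blocks: {}", cache.blocks.len());
}
```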
## Limitations/TODOs

- For now, only file and directory inodes are supported.
- The tables are loaded into memory on initial parsing for caching, rather than being accessed lazily.
- ...