#cargo-toml #tar #cargo #gzip #streaming-parser #file-io

crate_untar

Streaming reader of Cargo’s published package format (.crate tarball)

2 releases

new 1.0.0-rc.2 Nov 4, 2024
1.0.0-rc.1 Aug 2, 2024

#1302 in Parser implementations

Apache-2.0 OR MIT

38KB
667 lines

This library lets you inspect the contents of Cargo/crates.io packages without writing any temporary files to disk, and mostly without holding the files in memory either.

It's a streaming parser for gzipped tarballs (the .crate files). It can also perform correctness checks to detect malformed packages, such as duplicate tar paths, paths that are ambiguous on case-insensitive file systems, and symlinks pointing outside of the crate.
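To make the kinds of checks mentioned above more concrete, here is a minimal std-only sketch of how duplicate paths, case-insensitive collisions, and escaping symlinks could be detected. It is not crate_untar's implementation or API: `PathProblem`, `check_paths`, and `symlink_escapes` are hypothetical names invented for this illustration, and the logic is deliberately simplified (for example, it lowercases paths rather than applying full Unicode case folding).

```rust
use std::collections::HashMap;
use std::path::{Component, Path};

/// Hypothetical error type for this sketch (not part of crate_untar).
#[derive(Debug, PartialEq, Eq)]
pub enum PathProblem {
    /// The exact same path appeared twice in the archive.
    Duplicate(String),
    /// Two paths differ only by letter case.
    CaseCollision(String, String),
}

/// Scans entry paths in archive order and reports the first exact duplicate
/// or case-insensitive collision it encounters.
pub fn check_paths<'a>(paths: impl IntoIterator<Item = &'a str>) -> Result<(), PathProblem> {
    // Maps the lowercased path to the spelling that was seen first.
    let mut seen: HashMap<String, &'a str> = HashMap::new();
    for path in paths {
        let key = path.to_lowercase();
        if let Some(&first) = seen.get(&key) {
            return Err(if first == path {
                PathProblem::Duplicate(path.to_string())
            } else {
                PathProblem::CaseCollision(first.to_string(), path.to_string())
            });
        }
        seen.insert(key, path);
    }
    Ok(())
}

/// Returns true if a symlink stored at `link_path` (relative to the crate root)
/// with the given `target` would resolve to a location outside the crate root.
pub fn symlink_escapes(link_path: &str, target: &str) -> bool {
    // The target is resolved relative to the directory containing the link.
    let parent = Path::new(link_path).parent().unwrap_or(Path::new(""));
    let mut depth: i64 = 0;
    for component in parent.components().chain(Path::new(target).components()) {
        match component {
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            Component::ParentDir => {
                depth -= 1;
                if depth < 0 {
                    return true; // climbed above the crate root
                }
            }
            // Absolute targets always point outside the crate.
            Component::RootDir | Component::Prefix(_) => return true,
        }
    }
    false
}
```

With these helpers, `check_paths(vec!["README.md", "readme.md"])` reports a case collision, and `symlink_escapes("docs/link", "../../etc/passwd")` returns `true`, while a link from `docs/link` to `../src/lib.rs` stays inside the crate.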

use crate_untar::*;

// you'll need other libraries to download the .crate file and verify its checksum
let mut archive = Unarchiver::new(std::fs::File::open("example.crate")?)?;
let mut tarball = TarballParser::new(&mut archive, "example", "1.0.0")?;

for res in tarball.entries() {
    let (path, file) = res?;
    // filter by path or file.len() if you need
    if path.extension() != Some("rs".as_ref()) {
        continue;
    }

    // process the file here if you want
    // The file implements io::Read too
    let vec = file.into_vec()?;
}

let parsed = tarball.finalize()?;

println!("{:#?}", parsed.cargo_toml);
println!("{:#?}", parsed.cargo_toml_orig);
println!("{:#?}", parsed.cargo_lock);
println!("{:#?}", parsed.cargo_vcs_info);

# Ok::<_, Box<dyn std::error::Error>>(())
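Since the example notes that each file implements io::Read, an entry can also be processed incrementally instead of being collected with into_vec(). The sketch below uses only the standard library and works with any reader; `count_lines` is a hypothetical helper for illustration and is not provided by crate_untar.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Counts the lines in any reader, buffering only one line at a time.
/// Hypothetical helper for this sketch (not part of crate_untar).
pub fn count_lines<R: Read>(reader: R) -> io::Result<usize> {
    let mut reader = BufReader::new(reader);
    let mut buf = Vec::new();
    let mut lines = 0;
    loop {
        buf.clear();
        // read_until returns 0 only at end of input.
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break;
        }
        lines += 1;
    }
    Ok(lines)
}
```

Assuming the entry type behaves like an ordinary reader, it could be passed to such a function directly in place of the into_vec() call. Working on bytes rather than strings also avoids failing on files that are not valid UTF-8.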

Dependencies

~8–16MB
~219K SLoC