#directory #walk #parallel #recursion #iterator

jwalk-meta

Filesystem walk performed in parallel with streamed and sorted results

5 releases

0.9.4 Apr 1, 2024
0.9.3 Mar 15, 2024
0.9.2 Sep 10, 2023
0.9.1 Apr 27, 2023
0.9.0 Apr 27, 2023

#303 in Filesystem


434 downloads per month
Used in 2 crates (via scandir)

MIT license

77KB
1.5K SLoC

jwalk-meta

Filesystem walk.

  • Performed in parallel using rayon
  • Entries streamed in sorted order
  • Custom sort/filter/skip/state

This is a fork of https://github.com/Byron/jwalk. This project adds optional metadata collection, which improves performance when the metadata is needed later.


Usage

Add this to your Cargo.toml:

[dependencies]
jwalk-meta = "0.9"

Learn more: docs.rs/jwalk-meta

Example

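Recursively iterate over the "foo" directory, sorting by name:

use jwalk_meta::WalkDir;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Walk "foo" recursively; entries are yielded sorted by name.
    for entry in WalkDir::new("foo").sort(true) {
        // Each item is a Result, so propagate any I/O error with `?`.
        println!("{}", entry?.path().display());
    }
    Ok(())
}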

Inspiration

This crate is inspired by both walkdir and ignore. It attempts to combine the parallelism of ignore with walkdir's streaming iterator API. Some code and comments are copied directly from walkdir.

Why use this crate?

This crate is particularly good when you want streamed, sorted results. In my tests it is about 4x as fast as walkdir for sorted results with metadata. This crate's process_read_dir callback also lets you arbitrarily sort, filter, or skip entries, or attach state to them, before they are yielded.

Why not use this crate?

Directory traversal is already pretty fast. If you don't need this crate's speed, walkdir provides a smaller, better-tested, single-threaded implementation.

This crate's parallelism happens at the directory level. It will help when walking deep file systems with many directories. It won't help when reading a single directory with many files.
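
To make "parallelism at the directory level" concrete, here is a minimal sketch that uses only the Rust standard library. It is not how jwalk-meta is implemented (the crate uses rayon, and streams entries rather than collecting them); the function name walk_parallel is hypothetical and exists only for illustration. The point is that each directory listing is one unit of work, so a single directory with many files is still read sequentially.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical helper: recursively collect every non-directory path under `dir`.
/// Each subdirectory is read on its own scoped thread, so the unit of
/// parallel work is one directory listing, not one file.
fn walk_parallel(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut subdirs = Vec::new();

    // Reading a single directory is sequential, however many files it holds.
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            subdirs.push(entry.path());
        } else {
            files.push(entry.path());
        }
    }

    // Fan out: one thread per subdirectory. This is unbounded for brevity;
    // a bounded thread pool would be the practical choice.
    let nested: Vec<io::Result<Vec<PathBuf>>> = thread::scope(|s| {
        let handles: Vec<_> = subdirs
            .iter()
            .map(|d| s.spawn(move || walk_parallel(d)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("walker thread panicked"))
            .collect()
    });

    // Merge the per-directory results and return them sorted by path.
    for result in nested {
        files.extend(result?);
    }
    files.sort();
    Ok(files)
}
```

Calling walk_parallel(Path::new("foo")) on a flat directory spawns no threads at all, which is why this style of parallelism only pays off on trees with many directories.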

Benchmarks

Benchmarks comparing this crate with walkdir and ignore.

Dependencies

~1.5MB
~27K SLoC