app dirscan

A high performance tool for summarizing large directories or drives

9 releases (5 stable)

1.4.0 Mar 29, 2021
1.3.0 Mar 29, 2021
0.3.0 Apr 12, 2020
0.2.0 Apr 6, 2020
0.1.1 Apr 4, 2020

79 downloads per month

MIT license

740KB
713 lines

Dirscan

Dirscan is a high-performance tool for quickly inspecting the contents of huge (possibly networked) disks. It provides a summary of every single directory on a given disk, complete with the number of files within, their total size, and the latest time a file was created, accessed or modified.
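To make the per-directory summary concrete, here is a minimal Rust sketch using only the standard library. It is not dirscan's implementation: DirSummary, newest and summarize_dir are hypothetical names, and the sketch only looks at the files directly inside one directory instead of walking a whole disk.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical summary record for the files directly inside one directory.
#[derive(Debug, Default)]
pub struct DirSummary {
    pub files: u64,
    pub total_size: u64,
    pub latest_created: Option<SystemTime>,
    pub latest_accessed: Option<SystemTime>,
    pub latest_modified: Option<SystemTime>,
}

/// Keeps the later of two timestamps. A timestamp the platform cannot
/// report (an Err) is simply ignored.
fn newest(current: Option<SystemTime>, candidate: io::Result<SystemTime>) -> Option<SystemTime> {
    match (current, candidate.ok()) {
        (Some(a), Some(b)) => Some(a.max(b)),
        (a, b) => a.or(b),
    }
}

/// Counts regular files in `dir` (non-recursive), sums their sizes and
/// tracks the latest created/accessed/modified times.
pub fn summarize_dir(dir: &Path) -> io::Result<DirSummary> {
    let mut summary = DirSummary::default();
    for entry in fs::read_dir(dir)? {
        let meta = entry?.metadata()?;
        if !meta.is_file() {
            continue; // subdirectories and symlinks are skipped in this sketch
        }
        summary.files += 1;
        summary.total_size += meta.len();
        summary.latest_created = newest(summary.latest_created, meta.created());
        summary.latest_accessed = newest(summary.latest_accessed, meta.accessed());
        summary.latest_modified = newest(summary.latest_modified, meta.modified());
    }
    Ok(summary)
}
```

Calling summarize_dir(Path::new("/some/dir")) returns one record; a full scanner would produce one such record for every directory it visits.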

It's designed for disks that are too large to inspect with traditional tools, and it:

  • Is many orders of magnitude faster than tools like du, find or tree
  • Can max out any disk you give it, assuming you have enough CPU resources to keep up
  • Produces simple JSON or CSV output that can be analysed by the built-in viewer or other tools
  • Supports a customisable number of threads
  • Streams results to the output file, keeping memory usage roughly constant regardless of disk size
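The streaming point in the list above can be illustrated with a short Rust sketch: each record is written to the output as soon as it is produced, so memory use does not grow with the number of directories. This is an assumption about the general technique, not dirscan's code; stream_records is a hypothetical name and the three-column layout is illustrative, not dirscan's real CSV schema.

```rust
use std::io::{self, Write};

/// Writes each (path, files, size) record as one line as soon as it arrives,
/// instead of collecting all records in memory first.
/// Returns the number of records written.
pub fn stream_records<W, I>(records: I, out: &mut W) -> io::Result<u64>
where
    W: Write,
    I: IntoIterator<Item = (String, u64, u64)>,
{
    let mut written = 0;
    for (path, files, size) in records {
        // Hypothetical column layout. A real CSV writer would also need
        // to quote paths that contain commas or newlines.
        writeln!(out, "{},{},{}", path, files, size)?;
        written += 1;
    }
    out.flush()?;
    Ok(written)
}
```

Because records is any iterator, the producer can be a directory walker that yields results lazily; wrapping the output file in a std::io::BufWriter keeps the number of write calls low.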

Install 💿

Homebrew (macOS + Linux)

brew tap orf/brew, then brew install dirscan

Binaries (Windows)

Download the latest release from the GitHub releases page. Extract it and move it to a directory on your PATH.

Cargo

For optimal performance, run cargo install dirscan

Docker

This project is packaged as a Docker image, tomforbes/dirscan.

Running docker run -v YOUR_DIRECTORY:/dir tomforbes/dirscan scan /dir will scan YOUR_DIRECTORY.

Usage 🎷

Scan a directory

You can start scanning a directory by executing:

dirscan scan [PATH] --output=[OUTPUT]

This scans [PATH] and writes all results, in JSON format, to [OUTPUT]. By default it uses a thread pool with 2 * number_of_cores threads, but you can change this with --threads. Depending on your disk speed, adjusting the number of threads can drastically improve performance:

dirscan scan [PATH] --output=[OUTPUT] --threads=20
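As an illustration of the 2 * number_of_cores default, here is a small Rust sketch of how such a value could be derived with the standard library. The function name thread_count and the fallback to one core are assumptions of the sketch, not taken from dirscan's source.

```rust
use std::thread;

/// Picks the size of the thread pool: an explicit, non-zero request wins,
/// otherwise twice the number of available cores is used.
pub fn thread_count(requested: Option<usize>) -> usize {
    match requested {
        Some(n) if n > 0 => n,
        _ => {
            // available_parallelism can fail on some platforms;
            // this sketch falls back to a single core in that case.
            let cores = thread::available_parallelism()
                .map(|n| n.get())
                .unwrap_or(1);
            cores * 2
        }
    }
}
```

Using more threads than cores is a common choice for I/O-bound work, because threads that are blocked waiting on the disk or network do not occupy a core.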

You can also output the results in CSV:

dirscan scan [PATH] --output=[OUTPUT] --format=csv

$ dirscan scan ~/ --output=output.json --threads=20
[00:00:15] Files/s: 17324/s | Total: 258734 | Size: 99.01GB | Components: 14291 | Errors: IO=0 Other=36

Stream results

You can stream all files to stdout by executing:

dirscan stream [PATH]

If you wanted to remove all files on a disk in parallel, you could create a pipeline like:

dirscan stream /my-dir | xargs -d '\n' -L10 -P500 rm

This would launch up to 500 rm processes, each deleting 10 files.
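The grouping that xargs performs with -L10 can be pictured with the following Rust sketch, which splits newline-separated paths into batches of a fixed size. It only models the batching step, under the assumption of one path per line; batch_lines is a hypothetical helper, and unlike a real pipeline it collects the batches in memory for clarity.

```rust
use std::io::{self, BufRead};

/// Splits newline-separated input into batches of at most `batch_size` lines.
/// Blank lines are skipped, which is a choice made by this sketch.
pub fn batch_lines<R: BufRead>(reader: R, batch_size: usize) -> io::Result<Vec<Vec<String>>> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut batches = Vec::new();
    let mut current = Vec::with_capacity(batch_size);
    for line in reader.lines() {
        let line = line?;
        if line.is_empty() {
            continue;
        }
        current.push(line);
        if current.len() == batch_size {
            // Hand off the full batch and start a fresh one.
            batches.push(std::mem::replace(&mut current, Vec::with_capacity(batch_size)));
        }
    }
    if !current.is_empty() {
        batches.push(current); // final, possibly shorter batch
    }
    Ok(batches)
}
```

With 25 input paths and a batch size of 10, this yields batches of 10, 10 and 5 paths; each batch corresponds to the arguments of one spawned process.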

Inspect results

Once a scan is complete you can inspect the output using:

dirscan parse [OUTPUT]

For example:

$ dirscan parse output.json --prefix=/System/
[00:00:02] Total: 580000 | Per sec: 220653/s
+----------------------+---------+----------+-------------+-------------+-------------+
| Prefix               | Files   | Size     | created     | accessed    | modified    |
+----------------------+---------+----------+-------------+-------------+-------------+
| /System/Applications | 57304   | 777.28MB | 2 weeks ago | 2 weeks ago | 2 weeks ago |
| /System/DriverKit    | 55      | 5.09MB   | 2 weeks ago | 2 weeks ago | 2 weeks ago |
| /System/Library      | 292190  | 13.56GB  | 7 hours ago | 1 hour ago  | 7 hours ago |
| /System/Volumes      | 1468296 | 197.93GB | 1 hour ago  | 1 hour ago  | 1 hour ago  |
| /System/iOSSupport   | 13856   | 600.20MB | 2 weeks ago | 2 weeks ago | 2 weeks ago |
+----------------------+---------+----------+-------------+-------------+-------------+

You can include more directories with the --depth flag, or change the prefix search with --prefix.
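One plausible reading of how a prefix and a depth interact is sketched below in Rust: every scanned directory under the prefix is rolled up into the ancestor that lies depth components below the prefix. This is an assumption for illustration only; group_key and aggregate are hypothetical names, the input tuples are made up, and dirscan's actual semantics may differ.

```rust
use std::collections::BTreeMap;

/// Returns the grouping key for `path`: the prefix plus the first `depth`
/// path components below it, or None if `path` is not under `prefix`.
/// The prefix is matched as a plain string prefix.
pub fn group_key(path: &str, prefix: &str, depth: usize) -> Option<String> {
    let rest = path.strip_prefix(prefix)?;
    let parts: Vec<&str> = rest
        .split('/')
        .filter(|p| !p.is_empty())
        .take(depth)
        .collect();
    if parts.is_empty() {
        return None;
    }
    Some(format!("{}/{}", prefix.trim_end_matches('/'), parts.join("/")))
}

/// Sums (files, size) per grouping key. Entries are (directory, files, size).
pub fn aggregate(
    entries: &[(&str, u64, u64)],
    prefix: &str,
    depth: usize,
) -> BTreeMap<String, (u64, u64)> {
    let mut totals = BTreeMap::new();
    for &(path, files, size) in entries {
        if let Some(key) = group_key(path, prefix, depth) {
            let slot = totals.entry(key).or_insert((0u64, 0u64));
            slot.0 += files;
            slot.1 += size;
        }
    }
    totals
}
```

With a prefix of /System/ and a depth of 1, both /System/Library/Fonts and /System/Library/Sounds contribute to a single /System/Library row; a depth of 2 would keep them as separate rows.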

You can also order the results by name (the default), size or files:

$ dirscan parse output.json --prefix=/System/ --sort=size
[00:00:02] Total: 580000 | Per sec: 220653/s
+----------------------+---------+----------+-------------+-------------+-------------+
| Prefix               | Files   | Size     | created     | accessed    | modified    |
+----------------------+---------+----------+-------------+-------------+-------------+
| /System/Volumes      | 1468296 | 197.93GB | 2 hours ago | 2 hours ago | 2 hours ago |
| /System/Library      | 292190  | 13.56GB  | 7 hours ago | 2 hours ago | 7 hours ago |
| /System/Applications | 57304   | 777.28MB | 2 weeks ago | 2 weeks ago | 2 weeks ago |
| /System/iOSSupport   | 13856   | 600.20MB | 2 weeks ago | 2 weeks ago | 2 weeks ago |
| /System/DriverKit    | 55      | 5.09MB   | 2 weeks ago | 2 weeks ago | 2 weeks ago |
+----------------------+---------+----------+-------------+-------------+-------------+
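The three sort orders can be sketched in Rust as follows. The Row struct, SortKey enum and sort_rows function are hypothetical; the only behaviour taken from the tables above is that names sort ascending in byte order (so iOSSupport comes after Volumes) and that sizes sort largest first. Sorting by files largest first is an assumption.

```rust
/// Hypothetical row of the summary table.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub prefix: String,
    pub files: u64,
    pub size: u64,
}

/// Hypothetical sort keys mirroring name, size and files.
#[derive(Debug, Clone, Copy)]
pub enum SortKey {
    Name,
    Size,
    Files,
}

/// Sorts rows in place: names ascending (byte order), size and file
/// count descending so the largest entries come first.
pub fn sort_rows(rows: &mut [Row], key: SortKey) {
    match key {
        SortKey::Name => rows.sort_by(|a, b| a.prefix.cmp(&b.prefix)),
        SortKey::Size => rows.sort_by(|a, b| b.size.cmp(&a.size)),
        SortKey::Files => rows.sort_by(|a, b| b.files.cmp(&a.files)),
    }
}
```

Because sort_by is a stable sort, rows with equal sizes or file counts keep their previous relative order.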

Dependencies

~10MB
~184K SLoC