1 unstable release
new 0.5.1 | Dec 16, 2024 |
---|
#997 in Command line utilities
26KB
358 lines
minifind
About
minifind
is a barebones Un*x find
tool implementation in Rust, meant just to list directory entries as fast as possible and little else. For filename or path matching, it is possible to use --name
or --regex
options, toggling case insensitivity with --case-insensitive
or not. Additionally to narrow down matches, it is possible to use --file-type
option and filter by file type (b
for block device, c
for character device, d
for directory, p
for named FIFO, f
for file, l
for symlink, s
for socket or e
for empty file/directory).
It will not follow filesystem symlinks and it will not cross filesystem boundaries by default. Number of threads used is set to the number of available CPU cores in the system.
Related projects
Let us also mention other notable projects dealing with this task:
- sharkdp/fd which is a much more featured find alternative but with excellent performance,
- LyonSyonII/hunt-rs, a very similar high performance-oriented tool,
- BurntSushi/ripgrep which also houses globset and ignore crates that are used in this project,
- Rust findutils reimplementation that can be used as drop-in replacement.
Usage
Usage: minifind [OPTIONS] <PATH>...
Arguments:
<PATH>... Paths to check for large directories
Options:
-f, --follow-symlinks <FOLLOW_SYMLINKS> Follow symlinks [default: false] [short aliases: L] [possible values: true, false]
-o, --one-filesystem <ONE_FILESYSTEM> Do not cross mount points [default: true] [aliases: xdev] [possible values: true, false]
-x, --threads <THREADS> Number of threads to use when calibrating and scanning [default: 20]
-d, --max-depth <MAX_DEPTH> Maximum depth to traverse
-n, --name <NAME> Base of the file name matching globbing pattern
-r, --regex <REGEX> File name (full path) matching regular expression pattern
-i, --case-insensitive <CASE_INSENSITIVE> Case-insensitive matching for globbing and regular expression patterns [default: false] [possible values: true, false]
-t, --file-type <FILE_TYPE> Filter matches by type. Also accepts 'b', 'c', 'd', 'p', 'f', 'l', 's' and 'e' aliases [default: directory file symlink]
[possible values: empty, block-device, char-device, directory, pipe, file, symlink, socket]
-h, --help Print help
-V, --version Print version
Regular expressions
--regex
option uses Rust regex syntax that is very similar to other engines but without support for look-around and backreferences.
Glob expressions
--name
option uses Unix-style glob syntax.
Minifind vs GNU find
Hardware: 8-core Xeon E5-1630 with 4-drive SATA RAID-10
Benchmark setup:
$ cat bench1.sh
#!/bin/dash
exec /usr/bin/find / -xdev
$ cat bench2.sh
#!/bin/dash
exec /usr/local/sbin/minifind /
Benchmark 1: ./bench1.sh
Time (mean ± σ): 4.655 s ± 0.160 s [User: 1.287 s, System: 3.366 s]
Range (min … max): 4.525 s … 5.016 s 10 runs
Benchmark 2: ./bench2.sh
Time (mean ± σ): 1.244 s ± 0.020 s [User: 3.921 s, System: 5.908 s]
Range (min … max): 1.199 s … 1.271 s 10 runs
Summary
./bench2.sh ran
3.74 ± 0.14 times faster than ./bench1.sh
Dependencies
~9–19MB
~282K SLoC