
app fls

A fast POSIX ls that doesn't require a libc

1 unstable release

0.1.0 Aug 2, 2021

#609 in Filesystem

GPL-3.0-or-later

68KB
2K SLoC

fls

A nearly-POSIX-compliant and libc-less ls that's smaller, faster, and prettier than GNU's [1].

exa and lsd are both great ls-like Rust programs, but they're slower than the system ls and about 10x the code size. Plus you can't actually replace your ls with one of them, because some software relies on parsing the output of ls. But even as a user experience improvement, I think other projects tell the wrong story; modern software does not need to be larger or slower.

[1] I don't mean to rag on GNU's ls, but as far as I can tell it's the closest thing along the metrics I value.

Crude benchmarks

Program   --color=never -R / > /dev/null   --color=always -R /   --color=auto ~   --color=auto -l ~
fls       0.66 s                           2.32 s                0.16 ms          0.30 ms
GNU ls    1.22 s                           4.37 s                0.38 ms          2.30 ms
exa       3.61 s                           63.7 s [3]            0.78 ms          3.30 ms [4]
lsd       ??? [2]                          ??? [2]               36.5 ms          36.8 ms

These do not cover all reasonable combinations of options, but if you can find a combination of flags for which fls is slower than any alternatives, please open an issue.

[2] lsd doesn't detect symlink cycles and thus runs indefinitely on -R /.
[3] I have some large directories of fuzzing corpora; from running perf top as I was collecting this data, I see exa spends most of its time in term_grid::Grid::column_widths. I suspect its grid layout algorithm is quadratic.
[4] In all cases I report wall time; this is the only case where CPU time is significantly different. exa's CPU time is ~2.2x this value.

"libc-less"

fls does not link to anything. The (stripped) fls executable is smaller than GNU's (stripped) ls executable, even though some of the code that powers GNU's is in another file.

smaller and faster?

The biggest impact on code size is #![no_std], because the standard library's runtime is relatively large. Most individual components of the standard library are a totally reasonable size, but the code for generating backtraces is huge and as far as I can tell #![no_std] is the only way to get rid of it. The rest of the code size was trimmed down mostly by running the excellent tool cargo bloat to identify places to replace generics with runtime dispatch, and just manually reviewing the code to factor out repeated code patterns.
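To illustrate the generics-versus-runtime-dispatch trade mentioned above, here is a minimal sketch. The function names are hypothetical and are not taken from the fls source; the point is only that the generic version is compiled once per concrete writer type, while the `dyn` version is compiled once in total.

```rust
use core::fmt::{self, Write};

// Generic version: the compiler emits a separate copy of this function for
// every concrete writer type `W` it is called with (monomorphization).
fn write_entry_generic<W: Write>(out: &mut W, name: &str, size: u64) -> fmt::Result {
    write!(out, "{:>8} {}", size, name)
}

// Runtime-dispatch version: a single copy exists in the binary, and each
// write goes through the trait object's vtable instead.
fn write_entry_dyn(out: &mut dyn Write, name: &str, size: u64) -> fmt::Result {
    write!(out, "{:>8} {}", size, name)
}
```

Both functions produce identical output for the same arguments. The second costs an indirect call per write, which is usually negligible next to the syscalls a directory listing makes.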

In terms of speed, fls is probably faster than GNU's ls because it doesn't use the POSIX interfaces for listing files. We directly call getdents64 and parse the output, instead of juggling calls to readdir. And since we're calling getdents64, we get access to the optional directory entry type information (d_type), which usually lets us omit a number of stat calls; those can be expensive relative to other filesystem syscalls. I say probably because fls has always been faster than GNU's ls, so there was never a slower version to compare against. The original goal was just to use getdents64 directly (see below), and as soon as I had a working prototype, it was faster than the competition.
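As a rough sketch of what "parse the output" involves, the code below decodes a buffer laid out like Linux's linux_dirent64 records (d_ino, d_off, d_reclen, d_type, then a NUL-terminated name). It does not make the syscall itself and is not the fls implementation; `Entry`, `parse_dirents`, and `needs_stat` are hypothetical names chosen for illustration.

```rust
// File type codes carried in `d_type` (values from Linux's <dirent.h>).
const DT_UNKNOWN: u8 = 0;
const DT_DIR: u8 = 4;
const DT_REG: u8 = 8;

// Fixed-size header: d_ino (u64), d_off (i64), d_reclen (u16), d_type (u8).
const HEADER_LEN: usize = 8 + 8 + 2 + 1;

/// One directory entry decoded from a getdents64-style buffer.
#[derive(Debug, PartialEq)]
struct Entry<'a> {
    inode: u64,
    d_type: u8,
    name: &'a [u8],
}

/// Decode every record in `buf`, stopping at the first malformed one.
fn parse_dirents<'a>(buf: &'a [u8]) -> Vec<Entry<'a>> {
    let mut entries = Vec::new();
    let mut pos = 0;
    while pos + HEADER_LEN <= buf.len() {
        let rec = &buf[pos..];

        let mut ino = [0u8; 8];
        ino.copy_from_slice(&rec[0..8]);
        let mut len = [0u8; 2];
        len.copy_from_slice(&rec[16..18]);
        let reclen = u16::from_ne_bytes(len) as usize;

        // A record must at least hold its header and must fit in the buffer.
        if reclen < HEADER_LEN || reclen > rec.len() {
            break;
        }

        // The name is NUL-terminated and may be followed by padding bytes.
        let name_area = &rec[HEADER_LEN..reclen];
        let name_len = name_area
            .iter()
            .position(|&b| b == 0)
            .unwrap_or(name_area.len());

        entries.push(Entry {
            inode: u64::from_ne_bytes(ino),
            d_type: rec[18],
            name: &name_area[..name_len],
        });
        pos += reclen;
    }
    entries
}

/// Only entries whose type the filesystem did not report need a stat call
/// to learn whether they are a directory, regular file, and so on.
fn needs_stat(entry: &Entry) -> bool {
    entry.d_type == DT_UNKNOWN
}
```

When the filesystem fills in d_type, a plain listing can classify every entry from the buffer alone, which is where the saved stat calls come from.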

--color=auto

fls has the same interpretation as GNU ls for --color=always and --color=never, but under --color=auto, fls will only apply colors based on file extension and the entry type information from getdents64, which a filesystem is not required to provide. Thus, the coloring of fls --color=auto is unpredictable, but you get some coloring of output without any expensive stat calls. fls was originally developed when my dev environment was a compute node with an HPC filesystem, and ls --color=always on large directories could take seconds to minutes. fls --color=auto provides the same colors in those directories, in the blink of an eye. Thus, --color=auto is assumed if no arguments are provided and stdout is a terminal.
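A minimal sketch of stat-free coloring along these lines follows. The function names and the small color table are assumptions made for illustration (the SGR codes mirror common dircolors defaults), not the rules fls actually applies.

```rust
// File type codes carried in `d_type` (values from Linux's <dirent.h>).
const DT_UNKNOWN: u8 = 0;
const DT_DIR: u8 = 4;
const DT_REG: u8 = 8;
const DT_LNK: u8 = 10;

/// Bytes after the last '.', ignoring a leading dot (".bashrc" has none).
fn extension(name: &[u8]) -> Option<&[u8]> {
    let dot = name.iter().rposition(|&b| b == b'.')?;
    if dot == 0 {
        return None;
    }
    Some(&name[dot + 1..])
}

/// Pick SGR parameters using only the entry type and the name.
/// Anything that would need mode bits (for example "executable") is
/// deliberately left uncolored, because that would require a stat call.
fn sgr_for(d_type: u8, name: &[u8]) -> Option<&'static str> {
    match d_type {
        DT_DIR => return Some("01;34"), // bold blue
        DT_LNK => return Some("01;36"), // bold cyan
        _ => {}                         // DT_REG, DT_UNKNOWN, others: try the extension
    }
    match extension(name)? {
        b"tar" | b"gz" | b"zip" => Some("01;31"), // archives
        b"png" | b"jpg" | b"gif" => Some("01;35"), // images
        _ => None,
    }
}
```

With DT_UNKNOWN, a directory named like a plain file gets no directory color here, which is the kind of unpredictability described above.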

Sorting

In the absence of any options, fls sorts names using a comparison function similar to ls -v, which attempts to treat runs of digits as a single number. You don't need to pad numbers in filenames to a fixed width to make them display in the intuitive order.
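The idea of comparing digit runs as numbers can be sketched as below. This is an illustrative comparator under my own assumptions (byte-wise comparison outside digit runs, leading zeros ignored inside them); it is neither the fls implementation nor an exact match for GNU's ls -v ordering.

```rust
use std::cmp::Ordering;

/// Index one past the end of the digit run that starts at `start`.
fn digit_run_end(s: &[u8], start: usize) -> usize {
    let mut end = start;
    while end < s.len() && s[end].is_ascii_digit() {
        end += 1;
    }
    end
}

fn strip_leading_zeros(run: &[u8]) -> &[u8] {
    let start = run.iter().position(|&c| c != b'0').unwrap_or(run.len());
    &run[start..]
}

/// Compare two digit runs by numeric value without parsing them into an
/// integer type, so arbitrarily long runs cannot overflow.
fn cmp_digit_runs(a: &[u8], b: &[u8]) -> Ordering {
    let a = strip_leading_zeros(a);
    let b = strip_leading_zeros(b);
    // With leading zeros removed, a longer run is a larger number; runs of
    // equal length compare correctly byte by byte.
    a.len().cmp(&b.len()).then_with(|| a.cmp(b))
}

/// Order names so that "file2" sorts before "file10".
fn natural_cmp(a: &[u8], b: &[u8]) -> Ordering {
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i].is_ascii_digit() && b[j].is_ascii_digit() {
            let (ei, ej) = (digit_run_end(a, i), digit_run_end(b, j));
            let ord = cmp_digit_runs(&a[i..ei], &b[j..ej]);
            if ord != Ordering::Equal {
                return ord;
            }
            i = ei;
            j = ej;
        } else {
            if a[i] != b[j] {
                return a[i].cmp(&b[j]);
            }
            i += 1;
            j += 1;
        }
    }
    // The name with bytes left over sorts last; a plain byte comparison
    // breaks remaining ties (such as "01" vs "1") to keep the order total.
    (a.len() - i)
        .cmp(&(b.len() - j))
        .then_with(|| a.cmp(b))
}
```

It can be passed to a sort directly, for example `names.sort_by(|a, b| natural_cmp(a, b))` for a `Vec<&[u8]>`.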

POSIX features:

  • -A do not list implied . and ..
  • -C list entries in columns
  • -F append an indicator to entries
  • -H follow symlinks when provided on the command line
  • -L always follow symlinks
  • -R recurse into subdirectories
  • -S sort by size
  • -a do not ignore entries whose names begin with .
  • -c sort by ctime
  • -d list directories themselves, not their contents
  • -f do not sort
  • -g long format but without owner
  • -i print each entry's inode
  • -k pretend block size is 1024 bytes
  • -l long format
  • -m single row, separated by ,
  • -n long format but list uid and gid instead of names
  • -o long format but without groups
  • -p append an indicator to directories
  • -q replace non-printable characters with ?
  • -r reverse sorting order
  • -s print size of each file in blocks
  • -t sort by modification time
  • -u sort by access time
  • -x sort entries across rows
  • -1 list one entry per line
