1 unstable release

0.1.0 Jul 22, 2021

#514 in Command-line interface

MIT license

52KB
813 lines

Lexopt

Crates.io API reference MSRV CI

Lexopt is an argument parser for Rust. It tries to have the simplest possible design that's still correct. It's so simple that it's a bit tedious to use.

Lexopt is:

  • Small: one file, no dependencies, no macros. Easy to audit or vendor.
  • Correct: standard conventions are supported and ambiguity is avoided. Tested and fuzzed.
  • Pedantic: arguments are returned as OsStrings, forcing you to convert them explicitly. This lets you handle badly-encoded filenames.
  • Imperative: options are returned as they are found, nothing is declared ahead of time.
  • Annoyingly minimalist: only the barest necessities are provided.
  • Unhelpful: there is no help generation and error messages often lack context.

Example

struct Args {
    thing: String,
    number: u32,
    shout: bool,
}

fn parse_args() -> Result<Args, lexopt::Error> {
    use lexopt::prelude::*;

    let mut thing = None;
    let mut number = 1;
    let mut shout = false;
    let mut parser = lexopt::Parser::from_env();
    while let Some(arg) = parser.next()? {
        match arg {
            Short('n') | Long("number") => {
                number = parser.value()?.parse()?;
            }
            Long("shout") => {
                shout = true;
            }
            Value(val) if thing.is_none() => {
                thing = Some(val.into_string()?);
            }
            Long("help") => {
                println!("Usage: hello [-n|--number=NUM] [--shout] THING");
                std::process::exit(0);
            }
            _ => return Err(arg.unexpected()),
        }
    }

    Ok(Args {
        thing: thing.ok_or("missing argument THING")?,
        number,
        shout,
    })
}

fn main() -> Result<(), lexopt::Error> {
    let args = parse_args()?;
    let mut message = format!("Hello {}", args.thing);
    if args.shout {
        message = message.to_uppercase();
    }
    for _ in 0..args.number {
        println!("{}", message);
    }
    Ok(())
}

Let's walk through this:

  • We start parsing with Parser::from_env().
  • We call parser.next() in a loop to get all the arguments until they run out.
  • We match on arguments. Short and Long indicate an option.
  • To get the value that belongs to an option (like 10 in -n 10) we call parser.value().
    • This returns a standard OsString.
    • For convenience, use lexopt::prelude::* adds a .parse() method, analogous to str::parse.
  • Value indicates a free-standing argument. In this case, a filename.
    • if thing.is_none() is a useful pattern for positional arguments. If we already found thing we pass it on to another case.
    • It also contains an OsString.
      • The standard .into_string() method can decode it into a plain String.
  • If we don't know what to do with an argument we use return Err(arg.unexpected()) to turn it into an error message.
  • Strings can be promoted to errors for custom error messages.

This covers almost all the functionality in the library. Lexopt does very little for you.

For a larger example with useful patterns, see examples/cargo.rs.

Command line syntax

The following conventions are supported:

  • Short options (-q)
  • Long options (--verbose)
  • -- to mark the end of options
  • = to separate long options from values (--option=value)
  • Spaces to separate options from values (--option value, -f value)
  • Unseparated short options (-fvalue)
  • Combined short options (-abc to mean -a -b -c)

These are not supported:

  • -f=value for short options
  • Options with optional arguments (like GNU sed's -i, which can be used standalone or as -iSUFFIX)
  • Single-dash long options (like find's -name)
  • Abbreviated long options (GNU's getopt lets you write --num instead of --number if it can be expanded unambiguously)

Unicode

This library supports unicode while tolerating non-unicode arguments.

Short options may be unicode, but only a single codepoint. (If you need whole grapheme clusters you can use a long option. If you need normalization you're on your own, but it can be done.)

Options can be combined with non-unicode arguments. That is, --option=��� will not cause an error or mangle the value. This is surprisingly tricky to support: see os_str_bytes.

Options themselves are patched as by String::from_utf8_lossy if they're not valid unicode. That typically means you'll raise an error later when they're not recognized.

Why?

For a particular application I was looking for a small parser that's pedantically correct. There are other compact argument parsing libraries, but I couldn't find one that handled OsStrings and implemented all the fiddly details of the argument syntax faithfully.

This library may also be useful if a lot of control is desired, like when the exact argument order matters or not all options are known ahead of time. It could be considered more of a lexer than a parser.

Why not?

This library may not be worth using if:

  • You don't care about non-unicode arguments
  • You don't care about exact compliance and correctness
  • You don't care about code size
  • You do care about great error messages
  • You hate boilerplate

See also

  • clap/structopt: very fully-featured. The only other argument parser for Rust I know of that truly handles invalid unicode properly, if used right. Large.
  • argh and gumdrop: much leaner, yet still convenient and powerful enough for most purposes. Panic on invalid unicode.
    • argh adheres to the Fuchsia specification and therefore does not support --option=value and -ovalue, only --option value and -o value.
  • pico-args: slightly smaller than lexopt and easier to use (but less rigorous).
  • ap: I have not used this, but it seems to support iterative parsing while being less bare-bones than lexopt.
  • libc's getopt.

pico-args has a nifty table with build times and code sizes for different parsers. I've rerun the tests and added lexopt (using the program in examples/pico_test_app.rs):

null lexopt pico-args clap gumdrop structopt argh
Binary overhead 0KiB 14.5KiB 13.5KiB 372.8KiB 17.7KiB 371.2KiB 16.8KiB
Build time 0.9s 1.7s 1.6s 13.0s 7.5s 17.0s 7.5s
Number of dependencies 0 0 0 8 4 19 6
Tested version - 0.1.0 0.4.2 2.33.3 0.8.0 0.3.22 0.1.4

(Tests were run on x86_64 Linux with Rust 1.53 and cargo-bloat 0.10.1.)

No runtime deps