6 releases

0.1.5 May 27, 2021
0.1.4 May 2, 2021
0.1.3 Apr 25, 2021

#961 in Command-line interface


Used in rout

Apache-2.0

110KB
2.5K SLoC

ap - Argument Parser

Overview

ap is a Rust package (crate) that parses command-line arguments.

It's small and simple, but has some interesting features.

Full Details

See the documentation here.


lib.rs:

Simple crate for parsing command-line arguments.

If you want lots of extra features, you should consider the excellent clap crate instead.

To understand what "simple" means, see the Limitations section.


Overview

This crate is used to parse command-line arguments. It calls a handler function for each registered option it parses.

Quickstart

Note: If you are not familiar with command-line handling, see the terminology section.

  1. Create a struct type to represent the handler which will be used to process all your options.

    The struct can be empty if you just want to be notified when a particular option is specified, or you could add members to record or calculate particular details.

    #[derive(Clone, Debug, Default)]
    struct MyHandler {}
    
  2. Implement the [Handler] trait for the struct.

    This trait only requires you to create a single handle() method which returns a [Result] to indicate success or failure.

    impl Handler for &mut MyHandler {
        fn handle(&mut self, arg: Arg) -> Result<()> {
            // ...
    
            Ok(())
        }
    }
    
  3. Create a handler variable for your struct type.

    let mut handler = MyHandler::default();
    
  4. Create an [Args] variable to hold all the arguments you wish to support.

    let mut args = Args::default();
    
  5. Add a new [Arg] for each argument you wish to support to the [Args] variable.

    As a minimum, you must specify a "name" (single-character short option value) for the argument.

    By default, options are "flags" (see the Terminology section).

    // Support "-a <value>" option.
    args.add(Arg::new('a').needs(Need::Argument));
    
    // Support "-d" flag option.
    args.add(Arg::new('d'));
    
  6. Create an [App] variable to represent your program, specifying the [Args] and [Handler] variables:

    let mut args = App::new("my app")
        .help("some text")
        .args(args)
        .handler(Box::new(&mut handler));
    
  7. Call the parse() method on the [App] variable. The handler will be called for all [Arg] arguments added to the [Args] variable:

    // Parse the command-line
    let result = args.parse();
    

Examples

Below is a full example showing how to write a program that supports a few command-line options. It also shows how the handler can modify its state, allowing stateful and conditional option handling.

use ap::{App, Arg, Args, Handler, Need, Result};

// The type that will be used to handle all the CLI options
// for this program.
#[derive(Clone, Debug, Default)]
struct MyHandler {
    i: usize,
    v: Vec<String>,
    s: String,
}

impl Handler for &mut MyHandler {
    fn handle(&mut self, arg: Arg) -> Result<()> {
        println!(
            "option: {:?}, value: {:?}, count: {}",
            arg.option, arg.value, arg.count
        );

        // Change behaviour if user specified '-d'
        if arg.option == 'd' {
            self.i += 7;
        } else {
            self.i += 123;
        }

        self.s = "string value set by handler".into();
        self.v.push("vector modified by handler".into());

        Ok(())
    }
}

fn main() -> Result<()> {
    let mut handler = MyHandler::default();

    println!("Initial state of handler: {:?}", handler);

    let mut args = Args::default();

    // Support "-a <value>" option.
    args.add(Arg::new('a').needs(Need::Argument));

    // Support "-b <value>" option.
    args.add(Arg::new('b').needs(Need::Argument));

    // Support "-d" flag option.
    args.add(Arg::new('d'));

    let mut args = App::new("my app")
        .help("some text")
        .args(args)
        .handler(Box::new(&mut handler));

    // Parse the command-line
    let result = args.parse();

    // If you want to inspect the handler after parsing, you need to
    // end the mutable borrow of it by dropping the `App` variable
    // (named `args` here).
    drop(args);

    println!("Final state of handler: {:?}", handler);

    // return value
    result
}

For further examples, try out the programs in the examples/ directory:

$ cargo run --example simple -- -a foo -d -a bar -d -a baz
$ cargo run --example positional-args-only -- one two "hello world" three "foo bar" four "the end"
$ cargo run --example option-and-positional-args -- "posn 1" -d "posn 2" -a "hello world" -a "foo bar" "the end" -d
$ cargo run --example error-handler -- -a -e -i -o -u

Details

Terminology

Note: For further details, see getopt(3).

  • An "argument" is a value passed to a program on the command-line.

    Arguments can be "options" or "positional arguments".

    Note: A single or double quoted string counts as one argument, even if that string comprises more than one word (this magic is handled by the shell).

  • An "option" is an argument that starts with a dash character (-) and ends with a single character which is itself not -, for example, -a, -z, -A, -Z, -0, -9, etc.

    This character is the option's "name". Option names are case sensitive: upper and lower-case characters represent different options.

    This type of option is known as a "short option" since it is identified with only a single character.

  • Options which accept an argument (a value, called an "option argument" or "optarg" in getopt(3) parlance) are often referred to simply as "options" since these are the commonest form of options.

  • An "option argument" is the value that immediately follows an option. It is considered to be "bound" or paired with the option immediately preceding it. By definition, the option argument cannot start with a dash to avoid it being considered an option itself.

  • Options that do not accept an argument are called "flags" or "stand-alone options". These tend to be used to toggle some functionality on or off.

    Examples of flags:

    Most programs support a few common flags:

    • -h: Display a help/usage statement and exit.
    • -v: Display a version number and exit, or sometimes enable verbose mode.
  • A "positional argument" (also known as a "non-option argument") is an argument that is not an option: it is a word or a quoted string (which cannot start with a dash, unless it is escaped as \-).

    Example of positional arguments:

    echo(1) is a good example of a program that deals with positional arguments:

    $ echo one two three "hello, world" four five "the end"
    
  • The special option -- is reserved to mean "end of all options": it can be used by programs which need to accept a set of options followed by a set of positional arguments. Even if an argument starting with a single dash follows the double-dash, it will not be considered an option.

    This crate will stop processing command-line arguments if it finds -- on the command-line.
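
The terminology above can be illustrated with a small, std-only sketch. It does not use this crate's API: the `Kind` enum and the `classify` function are hypothetical names invented for the illustration, and lumping every other dash-prefixed argument into `Unsupported` is an assumption made to keep the example short.

```rust
/// How a single command-line argument is classified (hypothetical type).
#[derive(Debug, PartialEq)]
enum Kind {
    /// A short option such as `-a`; holds the option's name.
    Option(char),
    /// The special `--` argument.
    EndOfOptions,
    /// A word or quoted string that does not start with a dash.
    Positional(String),
    /// Any other dash-prefixed argument (for example `--verbose` or `-dva`).
    Unsupported(String),
}

fn classify(arg: &str) -> Kind {
    if arg == "--" {
        return Kind::EndOfOptions;
    }

    let mut chars = arg.chars();

    // A short option is exactly one dash followed by one non-dash character.
    match (chars.next(), chars.next(), chars.next()) {
        (Some('-'), Some(name), None) if name != '-' => Kind::Option(name),
        (Some('-'), _, _) => Kind::Unsupported(arg.to_string()),
        _ => Kind::Positional(arg.to_string()),
    }
}

fn main() {
    for arg in ["-a", "-Z", "-9", "--", "hello, world", "--verbose"] {
        println!("{:?} => {:?}", arg, classify(arg));
    }
}
```

Whether an option is a flag or takes an option argument cannot be decided from the argument alone; that depends on how the program registered the option.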

Example of argument types

Assume a program that is run as follows:

$ myprog -d 371 "hello, world" -x "awesome value" -v "the end"

The program has 7 actual CLI arguments:

1: '-d'
2: '371'
3: 'hello, world'
4: '-x'
5: 'awesome value'
6: '-v'
7: 'the end'

How these arguments are interpreted depends on whether each of the options (the arguments starting with -) are specified as taking a value.

If all the options are specified as flags, the arguments are interpreted as follows:

'-d'            # A flag option.
'371'           # A positional argument.
'hello, world'  # A positional argument.
'-x'            # A flag option.
'awesome value' # A positional argument.
'-v'            # A flag option.
'the end'       # A positional argument.

But if we assume that -d and -x are specified as taking a value, then the arguments group as follows:

'-d 371'           # An option ('d') with a numeric option argument ('371').
'hello, world'     # A positional argument ('hello, world').
'-x awesome value' # An option ('x') with a string option argument ('awesome value').
'-v'               # A flag option.
'the end'          # A positional argument.

Alternatively, if we assume that all the options take a value, then the arguments group as follows:

'-d 371'             # An option ('d') with a numeric option argument ('371').
'hello, world'       # A positional argument ('hello, world').
'-x 'awesome value'' # An option ('x') with a string option argument ('awesome value').
'-v 'the end''       # An option ('v') with a string option argument ('the end').
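
The three interpretations above can be reproduced with a short, std-only sketch. It is not how this crate is implemented: `Item`, `short_option` and `group` are hypothetical names, and the `takes_value` slice stands in for the set of options registered as needing an argument.

```rust
/// One parsed unit of the command line (hypothetical type).
#[derive(Debug, PartialEq)]
enum Item {
    Flag(char),
    OptionWithValue(char, String),
    Positional(String),
}

/// Returns the option name if `arg` looks like a short option such as `-d`.
fn short_option(arg: &str) -> Option<char> {
    let mut chars = arg.chars();
    match (chars.next(), chars.next(), chars.next()) {
        (Some('-'), Some(name), None) if name != '-' => Some(name),
        _ => None,
    }
}

/// Groups `args` in order. Options listed in `takes_value` consume the
/// argument that follows them.
fn group(args: &[&str], takes_value: &[char]) -> Vec<Item> {
    let mut items = Vec::new();
    let mut iter = args.iter();

    while let Some(arg) = iter.next() {
        match short_option(arg) {
            Some(name) if takes_value.contains(&name) => {
                // Assumption: a missing value is skipped in this sketch;
                // a real parser would report an error instead.
                if let Some(value) = iter.next() {
                    items.push(Item::OptionWithValue(name, value.to_string()));
                }
            }
            Some(name) => items.push(Item::Flag(name)),
            None => items.push(Item::Positional(arg.to_string())),
        }
    }

    items
}

fn main() {
    let args = ["-d", "371", "hello, world", "-x", "awesome value", "-v", "the end"];

    println!("all flags:         {:?}", group(&args, &[]));
    println!("-d, -x take value: {:?}", group(&args, &['d', 'x']));
    println!("all take a value:  {:?}", group(&args, &['d', 'x', 'v']));
}
```

The same seven arguments yield seven, five and four items respectively, matching the three listings above.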

Handling ambiguity

By default, getopt(3) semantics are used, which means there is no ambiguity when parsing arguments: if an [Arg] is declared that specifies Need::Argument, the next argument after the option (whatever it is!) is consumed and used as the option's argument.

Although not ambiguous, this could be surprising for users, since other (generally newer) command-line argument parsers work in a subtly different way. For example, imagine the program specified the following:

let mut args = Args::default();

args.add(Arg::new('1'));
args.add(Arg::new('2').needs(Need::Argument));

If the program was then called as follows...

$ prog -2 -1

... the value for the -2 option will be set to -1, and the -1 option will be assumed not to have been specified. This is how getopt(3) works. However, this may not be what the user envisaged, or what the programmer desires.

The alternative strategy when parsing the command-line above is to treat options as more important than arguments and to report an error in this case, since the -2 option was not provided with an argument (because the -1 option was specified before a valid option argument).

For further details on this subtlety, see the no_strict_options() method for [App] or [Settings].
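
The difference between the two strategies can be sketched without the crate. The code below is only an illustration of the idea: `Value`, `value_getopt_style` and `value_strict` are hypothetical names and say nothing about how [App] or [Settings] implement the behaviour.

```rust
/// Outcome of looking for the value of an option that needs one
/// (hypothetical type).
#[derive(Debug, PartialEq)]
enum Value {
    Found(String),
    Missing,
}

/// getopt(3) style: whatever follows the option is taken as its value.
fn value_getopt_style(next: Option<&str>) -> Value {
    match next {
        Some(v) => Value::Found(v.to_string()),
        None => Value::Missing,
    }
}

/// Stricter alternative: an argument that looks like an option is never
/// accepted as a value.
fn value_strict(next: Option<&str>) -> Value {
    match next {
        Some(v) if !v.starts_with('-') => Value::Found(v.to_string()),
        _ => Value::Missing,
    }
}

fn main() {
    // Command line: prog -2 -1   (where -2 needs an argument)
    let args = ["-2", "-1"];
    let next = args.get(1).copied();

    println!("getopt style: {:?}", value_getopt_style(next));
    println!("strict:       {:?}", value_strict(next));
}
```

With getopt(3) semantics the `-1` is consumed as the value of `-2`; with the strict variant the value is reported as missing, which a parser would turn into an error.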

Rationale

Why yet another command-line parser?

There are many Rust CLI argument parsing crates. This one was written since I couldn't find a crate that satisfied all of the following requirements:

  • Allow the intermingling of options and non-options ("positional arguments").

    This is an extremely useful feature for certain use cases and is a standard library call available in POSIX-compliant libc implementations. Quoting from getopt(3):

    If the first character of optstring is '-', then each nonoption argv-element is handled as if it were the argument of an option with character code 1.

  • Parse the command-line arguments in order.

    The modern fashion seems to be to build a hash of options to allow the program to query if an option was specified. This is useful in most circumstances, but I had a use-case in which the order and number of occurrences of each option mattered.

  • Allow a handler function to be specified for dealing with the arguments as they are encountered.

In summary, I needed a more POSIX-like (POSIXLY_CORRECT) command line argument parser, so, here it is.
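
To make the "in order, with a handler" requirement concrete, here is a minimal std-only sketch. It is not this crate's implementation: `parse_in_order` is a hypothetical function that ignores option arguments and positional arguments, and simply calls a closure for every short option as it is encountered.

```rust
use std::collections::HashMap;

/// Calls `handler` once per short option, in command-line order, passing the
/// option name and how many times that option has been seen so far.
fn parse_in_order<F: FnMut(char, usize)>(args: &[&str], mut handler: F) {
    let mut counts: HashMap<char, usize> = HashMap::new();

    for arg in args {
        let mut chars = arg.chars();

        match (chars.next(), chars.next(), chars.next()) {
            (Some('-'), Some(name), None) if name != '-' => {
                let count = counts.entry(name).or_insert(0);
                *count += 1;
                handler(name, *count);
            }
            // Anything else is skipped in this sketch.
            _ => {}
        }
    }
}

fn main() {
    let mut seen = Vec::new();

    parse_in_order(&["-a", "-d", "-a"], |name, count| seen.push((name, count)));

    // Unlike a plain "was this option given?" lookup, both the order and
    // the per-option occurrence count are preserved.
    println!("{:?}", seen);
}
```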

Summary of features and behaviour

  • Simple and intuitive ("ergonomic") API.

  • Small codebase.

  • Comprehensive set of unit tests.

  • Parses arguments in strict order.

  • Handles each argument immediately.

    As soon as a (registered and valid) argument is encountered, the handler is called.

  • Arguments are not permuted.

  • Requires a function to be specified for handling each option.

  • Option arguments are always returned as strings.

    The caller can convert them into numerics, etc. as required.

  • Allows intermingling of flag options, options-with-arguments and "positional arguments" (arguments that are not options).

  • Allows options to be specified multiple times (and records that number).

    Note: You can limit the number of occurrences if you wish by checking the Arg.count value in your handler.

  • Options can be defined as mandatory.

  • Unknown options can be configured to be ignored or passed to the handler.

    Notes:

    • By default, unknown options are not handled and an error is generated if one is found.
    • If you want to support positional arguments, either register an [Arg] for [POSITIONAL_HANDLER_OPT], or set Settings.ignore_unknown_options.
  • "Unknown" positional arguments can be configured to be ignored or passed to the handler.

    Notes:

    • By default, positional arguments are not handled and an error is generated if one is found.
    • If you want to support positional arguments, either register an [Arg] for [POSITIONAL_HANDLER_OPT], or set Settings.ignore_unknown_posn_args.
  • Automatically generates help / usage statement (-h).
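
Since option arguments are always returned as strings, a handler that needs a number has to convert the value itself. A minimal sketch using only the standard library follows; `parse_number` is a hypothetical helper, and choosing `u32` as the target type is an assumption made for the example.

```rust
use std::num::ParseIntError;

/// Converts an option argument such as "371" into a number.
fn parse_number(value: &str) -> Result<u32, ParseIntError> {
    value.trim().parse::<u32>()
}

fn main() {
    for value in ["371", "awesome value"] {
        match parse_number(value) {
            Ok(n) => println!("{:?} is the number {}", value, n),
            Err(e) => println!("{:?} is not a number: {}", value, e),
        }
    }
}
```

Inside a handler, the resulting error would typically be mapped into the handler's own error type and returned, so that parsing stops at the offending option.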

Limitations

  • Option bundling is not supported.

    Example: -d -v -a "foo bar" is valid, but -dva "foo bar" is not.

  • Non-ASCII option names are not supported.

  • Long options are not supported.

    Example: -v is valid, but --verbose is invalid.

  • Options and their arguments must be separated by whitespace.

    Example: '-d 3' is valid, but '-d3' is invalid.

  • Options with optional arguments are not supported.

    Explanation: An option has to be defined as being a flag (no argument) or a standard option (requiring a value). It cannot be both.

  • Options cannot accept multiple values.

    Example: -a "foo bar" "baz" "the end" cannot be parsed as a single -a option.

    However:

    • Options can be specified multiple times, each with different values.
    • You could parse that command-line using [POSITIONAL_HANDLER_OPT].

Dependencies

~305–770KB
~18K SLoC