1 unstable release

Uses old Rust 2015

0.1.0 Apr 11, 2019

MIT license

41KB
998 lines

nom whitespace

nom-whitespace provides space-eating nom combinators.


lib.rs:

Support for whitespace delimited formats

Many textual formats allow spaces and other kinds of separators between tokens. Handling them manually with nom means wrapping every parser like this:

named!(token, delimited!(space, tk, space));

To ease the development of such parsers, you can use the whitespace parsing facility, which works as follows:

named!(tuple<&[u8], (&[u8], &[u8]) >,
  ws!(tuple!( take!(3), tag!("de") ))
);

assert_eq!(
  tuple(&b" \t abc de fg"[..]),
  Ok((&b"fg"[..], (&b"abc"[..], &b"de"[..])))
);

The ws! combinator will modify the parser to intersperse space parsers everywhere. By default, it will consume the following characters: " \t\r\n".
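To make the effect of interspersing space parsers concrete, here is a minimal plain-Rust sketch with no nom dependency. It reproduces only the observable behaviour of the example above, not the actual macro expansion. The helper names `skip`, `take`, `tag` and `ws_tuple` are hypothetical and exist only for this illustration.

```rust
/// The default separator set described above.
const DEFAULT_WS: &[u8] = b" \t\r\n";

/// Drops leading bytes that belong to `set` (hypothetical helper).
fn skip<'a>(input: &'a [u8], set: &[u8]) -> &'a [u8] {
    let n = input.iter().take_while(|b| set.contains(*b)).count();
    &input[n..]
}

/// Takes exactly `n` bytes, in the spirit of `take!(n)`.
/// Returns (remaining input, taken bytes).
fn take(input: &[u8], n: usize) -> Option<(&[u8], &[u8])> {
    if input.len() >= n {
        Some((&input[n..], &input[..n]))
    } else {
        None
    }
}

/// Matches a literal prefix, in the spirit of `tag!(..)`.
/// Returns (remaining input, matched bytes).
fn tag<'a>(input: &'a [u8], lit: &[u8]) -> Option<(&'a [u8], &'a [u8])> {
    if input.starts_with(lit) {
        Some((&input[lit.len()..], &input[..lit.len()]))
    } else {
        None
    }
}

/// Hand-written analogue of `ws!(tuple!(take!(3), tag!("de")))`:
/// separators are skipped before, between, and after the sub-parsers.
fn ws_tuple(input: &[u8]) -> Option<(&[u8], (&[u8], &[u8]))> {
    let i = skip(input, DEFAULT_WS);
    let (i, a) = take(i, 3)?;
    let i = skip(i, DEFAULT_WS);
    let (i, b) = tag(i, b"de")?;
    let i = skip(i, DEFAULT_WS);
    Some((i, (a, b)))
}
```

Calling `ws_tuple(b" \t abc de fg")` leaves `fg` as the remaining input and yields the pair (`abc`, `de`), mirroring the assertion above.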

If you want to modify that behaviour, you can make your own whitespace wrapper. As an example, if you don't want to consume ends of lines, only spaces and tabs, you can do it like this:

named!(pub space, eat_separator!(&b" \t"[..]));

#[macro_export]
macro_rules! sp (
  ($i:expr, $($args:tt)*) => (
    {
      use nom::Convert;
      use nom::Err;

      match sep!($i, space, $($args)*) {
        Err(e) => Err(e),
        Ok((i1,o))    => {
          match space(i1) {
            Err(e) => Err(Err::convert(e)),
            Ok((i2,_))    => Ok((i2, o))
          }
        }
      }
    }
  )
);

named!(tuple<&[u8], (&[u8], &[u8]) >,
  sp!(tuple!( take!(3), tag!("de") ))
);

assert_eq!(
  tuple(&b" \t abc de fg"[..]),
  Ok((&b"fg"[..], (&b"abc"[..], &b"de"[..])))
);
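The practical difference between the two separator sets shows up once the input contains a line ending. The following std-only sketch is an illustration under assumptions, not the crate's code: `tuple_with` is a hypothetical function that takes the separator set as a parameter, so both behaviours can be compared side by side.

```rust
/// Drops leading bytes that belong to `set` (hypothetical helper).
fn skip<'a>(input: &'a [u8], set: &[u8]) -> &'a [u8] {
    let n = input.iter().take_while(|b| set.contains(*b)).count();
    &input[n..]
}

/// Takes three bytes, then the literal "de", skipping any byte in `set`
/// before, between, and after the two steps.
/// Returns (remaining input, (three bytes, matched literal)).
fn tuple_with<'a>(
    input: &'a [u8],
    set: &[u8],
) -> Option<(&'a [u8], (&'a [u8], &'a [u8]))> {
    let i = skip(input, set);
    if i.len() < 3 {
        return None;
    }
    let (a, i) = i.split_at(3);
    let i = skip(i, set);
    if !i.starts_with(b"de") {
        return None;
    }
    let (b, i) = i.split_at(2);
    Some((skip(i, set), (a, b)))
}
```

With the set `b" \t"`, the input `b"abc\nde fg"` is rejected because the newline is not consumed, while the set `b" \t\r\n"` accepts it and leaves `fg` as the remainder.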

This combinator works by replacing each nom combinator it finds with a version that wraps its sub-parsers in separator parsers. It therefore does not support combinators you wrote in your own code. You can still wrap those manually with the separator you want, or you can copy the macros defined in src/whitespace.rs and modify them to support a new combinator:

  • copy the combinator's code here, add the _sep suffix
  • add the $separator:expr as second argument
  • wrap any sub parsers with sep!($separator, $submac!($($args)*))
  • reference it in the definition of sep! as follows:
 ($i:expr,  $separator:path, my_combinator ! ($($rest:tt)*) ) => {
   wrap_sep!($i,
     $separator,
     my_combinator_sep!($separator, $($rest)*)
   )
 };
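The steps above can be pictured with ordinary functions instead of macros. This is a loose analogue under assumptions, not the contents of src/whitespace.rs: `wrap_sep` and `pair_sep` are hypothetical functions that play the roles of `wrap_sep!` and a `_sep` combinator, and `space`, `abc` and `de` are toy parsers written for the illustration.

```rust
/// A parser over bytes: returns (remaining input, output) on success.
type Parser = fn(&[u8]) -> Option<(&[u8], &[u8])>;

/// Runs the separator, then the parser, then the separator again.
fn wrap_sep<'a>(input: &'a [u8], sep: Parser, p: Parser) -> Option<(&'a [u8], &'a [u8])> {
    let (i, _) = sep(input)?;
    let (i, o) = p(i)?;
    let (i, _) = sep(i)?;
    Some((i, o))
}

/// A "_sep" version of a two-parser sequence: the separator is an extra
/// argument, and every sub-parser is wrapped with it.
fn pair_sep<'a>(
    input: &'a [u8],
    sep: Parser,
    first: Parser,
    second: Parser,
) -> Option<(&'a [u8], (&'a [u8], &'a [u8]))> {
    let (i, a) = wrap_sep(input, sep, first)?;
    let (i, b) = wrap_sep(i, sep, second)?;
    Some((i, (a, b)))
}

/// Separator that eats spaces and tabs; it always succeeds.
fn space(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let n = input
        .iter()
        .take_while(|b| **b == b' ' || **b == b'\t')
        .count();
    Some((&input[n..], &input[..n]))
}

/// Toy parser matching the literal "abc".
fn abc(input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.starts_with(b"abc") {
        Some((&input[3..], &input[..3]))
    } else {
        None
    }
}

/// Toy parser matching the literal "de".
fn de(input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.starts_with(b"de") {
        Some((&input[2..], &input[..2]))
    } else {
        None
    }
}
```

For example, `pair_sep(b" abc \t de fg", space, abc, de)` yields the pair (`abc`, `de`) with `fg` left over. The macro version follows the same shape, except that the dispatch to the `_sep` variant happens in the `sep!` rule shown in the last step.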
