#parser-combinator #string-parser #tokens #slice #iterator #free #methods

no-std yap

Yet Another Parser library. A lightweight, dependency-free, parser-combinator-inspired set of utility methods to help with parsing strings and slices.

19 releases (11 breaking)

0.12.0 Nov 18, 2023
0.11.0 Jul 14, 2023
0.10.0 Mar 8, 2023
0.8.1 Dec 9, 2022
0.7.1 Nov 27, 2021

#14 in Parser tooling

Download history (weekly): 7,155 for the week of 2024-07-20, 52,607 for the week of 2024-11-02

192,753 downloads per month
Used in 5 crates (4 directly)

MIT license

115KB
2K SLoC

Yap: Yet another (Rust) parsing library

This small, zero-dependency crate helps you to parse input strings and slices by building on the Iterator interface.

The aim of this crate is to provide the sorts of functions you'd come to expect from a parser combinator library, but without immersing you in a world of parser combinators and forcing you to use a novel return type, library-provided errors or parser-combinator based control flow. We sacrifice some conciseness in exchange for simplicity.

Some specific features/goals:

  • Great documentation, with examples for almost every function provided.
  • Prioritise simplicity at the cost of verbosity.
  • Be iterator-centric. Where applicable, combinators return things which implement Tokens/Iterator.
  • Allow user-defined errors to be returned anywhere that it might make sense. Some functions have _err variants in case you need error information when they don't otherwise hand back errors for simplicity.
  • Location information should always be available, so that you can tell users where something went wrong. See Tokens::offset and Tokens::location().
  • Backtracking by default. Coming from Haskell's Parsec, this feels like the sensible default. It means that if one of the provided parsing functions fails to parse something, it won't consume any input trying.
  • Expose all of the "low level" functions. You can save and rewind to locations as needed (see Tokens::location), and implement any of the provided functions using these primitives.
  • Aims to be "fairly quick". Avoids allocations (and allows you to do the same via the iterator-centric interface) almost everywhere. If you need "as fast as you can get", there may be quicker alternatives.
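
To make the "backtracking by default" and "save and rewind" points above concrete, here is a minimal std-only sketch of the idea. It is not yap's implementation and does not use the crate: the `Cursor` type and its `rewind_to`, `eat_str` and `remaining` methods are hypothetical names invented for illustration, and only the notion of a saved location comes from the text above.

```rust
/// Hypothetical cursor over a string slice; not a yap type.
struct Cursor<'a> {
    input: &'a str,
    /// Byte offset of the next unread character.
    offset: usize,
}

impl<'a> Cursor<'a> {
    fn new(input: &'a str) -> Self {
        Cursor { input, offset: 0 }
    }

    /// Save the current position so that we can rewind to it later.
    fn location(&self) -> usize {
        self.offset
    }

    /// Move back to a previously saved position.
    fn rewind_to(&mut self, location: usize) {
        self.offset = location;
    }

    /// The input that has not been consumed yet.
    fn remaining(&self) -> &'a str {
        &self.input[self.offset..]
    }

    /// Try to consume `expected`. On a mismatch, rewind to where we
    /// started, so that a failed attempt consumes no input.
    fn eat_str(&mut self, expected: &str) -> bool {
        let start = self.location();
        for want in expected.chars() {
            if self.next() != Some(want) {
                self.rewind_to(start);
                return false;
            }
        }
        true
    }
}

impl<'a> Iterator for Cursor<'a> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        let c = self.remaining().chars().next()?;
        self.offset += c.len_utf8();
        Some(c)
    }
}

fn main() {
    let mut cursor = Cursor::new("foobar");

    // "fox" matches "fo" and then fails, so the cursor is rewound.
    assert!(!cursor.eat_str("fox"));
    assert_eq!(cursor.location(), 0);

    // A successful match consumes exactly what it matched.
    assert!(cursor.eat_str("foo"));
    assert_eq!(cursor.location(), 3);
    assert_eq!(cursor.remaining(), "bar");
}
```

Because the saved location is just a value, the caller can also keep it around to report where parsing went wrong, which is the use the location bullet describes.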

Have a look at the Tokens trait for all of the parsing methods made available, and examples for each.

Have a look in the examples folder for more in-depth examples.

Example

use yap::{
    // This trait has all of the parsing methods on it:
    Tokens,
    // Allows you to use `.into_tokens()` on strings and slices,
    // to get an instance of the above:
    IntoTokens
};

// Step 1: convert our input into something implementing `Tokens`
// ================================================================

let mut tokens = "10 + 2 x 12-4,foobar".into_tokens();

// Step 2: Parse some things from our tokens
// =========================================

#[derive(PartialEq,Debug)]
enum Op { Plus, Minus, Multiply }
#[derive(PartialEq,Debug)]
enum OpOrDigit { Op(Op), Digit(u32) }

// The `Tokens` trait builds on `Iterator`, so we get a `next` method.
fn parse_op(t: &mut impl Tokens<Item=char>) -> Option<Op> {
    match t.next()? {
        '-' => Some(Op::Minus),
        '+' => Some(Op::Plus),
        'x' => Some(Op::Multiply),
        _ => None
    }
}

// We also get other useful functions..
fn parse_digits(t: &mut impl Tokens<Item=char>) -> Option<u32> {
    t.take_while(|c| c.is_digit(10))
     .parse::<u32, String>()
     .ok()
}

// As well as combinator functions like `sep_by_all` and `surrounded_by`..
let op_or_digit = tokens.sep_by_all(
    |t| t.surrounded_by(
        |t| parse_digits(t).map(OpOrDigit::Digit),
        |t| { t.skip_while(|c| c.is_ascii_whitespace()); }
    ),
    |t| parse_op(t).map(OpOrDigit::Op)
);

// Now we've parsed our input into OpOrDigits, let's calculate the result..
let mut current_op = Op::Plus;
let mut current_digit = 0;
for d in op_or_digit.into_iter() {
    match d {
        OpOrDigit::Op(op) => {
            current_op = op
        },
        OpOrDigit::Digit(n) => {
            match current_op {
                Op::Plus => { current_digit += n },
                Op::Minus => { current_digit -= n },
                Op::Multiply => { current_digit *= n },
            }
        },
    }
}
assert_eq!(current_digit, 140);

// Step 3: do whatever you like with the rest of the input!
// ========================================================

// This is available on the concrete type that strings
// are converted into (rather than on the `Tokens` trait):
let remaining = tokens.remaining();

assert_eq!(remaining, ",foobar");
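
The running-total loop in the example above uses plain `-=` on a `u32`, which panics in debug builds (and wraps in release builds) for input such as `2-4`. As a possible variation, the same left-to-right evaluation can be written with checked arithmetic. This is a std-only sketch: the `evaluate` function is a hypothetical helper, not part of yap, and the `Op` and `OpOrDigit` enums are repeated from the example so that the block stands alone.

```rust
#[derive(PartialEq, Debug)]
enum Op { Plus, Minus, Multiply }

#[derive(PartialEq, Debug)]
enum OpOrDigit { Op(Op), Digit(u32) }

/// Hypothetical helper: evaluate strictly left to right (no operator
/// precedence), returning `None` if any step leaves the `u32` range.
fn evaluate(items: impl IntoIterator<Item = OpOrDigit>) -> Option<u32> {
    let mut current_op = Op::Plus;
    let mut total: u32 = 0;
    for item in items {
        match item {
            OpOrDigit::Op(op) => current_op = op,
            OpOrDigit::Digit(n) => {
                total = match current_op {
                    Op::Plus => total.checked_add(n)?,
                    Op::Minus => total.checked_sub(n)?,
                    Op::Multiply => total.checked_mul(n)?,
                };
            }
        }
    }
    Some(total)
}

fn main() {
    // The items that the example parses out of "10 + 2 x 12-4".
    let items = vec![
        OpOrDigit::Digit(10),
        OpOrDigit::Op(Op::Plus),
        OpOrDigit::Digit(2),
        OpOrDigit::Op(Op::Multiply),
        OpOrDigit::Digit(12),
        OpOrDigit::Op(Op::Minus),
        OpOrDigit::Digit(4),
    ];
    assert_eq!(evaluate(items), Some(140));

    // "2-4" cannot be represented as a u32, so evaluation gives up.
    let underflow = vec![
        OpOrDigit::Digit(2),
        OpOrDigit::Op(Op::Minus),
        OpOrDigit::Digit(4),
    ];
    assert_eq!(evaluate(underflow), None);
}
```

Since `evaluate` accepts any `IntoIterator`, it should also be able to consume an iterator of parsed items directly, without collecting them into a `Vec` first.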

No runtime deps