#pest-parser #ast #facade #typed #rules

macro pestle

typed AST facade for the pest parser

1 unstable release

new 0.1.0 Dec 14, 2024

0BSD license

14KB
268 lines

pestle: typed AST facade for the pest parser

A code generator which produces ergonomic Rust structs and enums from a pest grammar. These types are meant for examining a parsed AST—not modifying it or constructing a new one.

The grammar must obey one restriction: the choice operator (|) may appear only in rules which use no other operators (these become Rust enums), and all choices must be named rules (these become the variant names).

Atomic rules become tuple structs containing only a Span.
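As a rough illustration of the two cases above, the following hand-written sketch shows the shape such types might take for a hypothetical grammar with `Ident` and `Number` as atomic rules and `Atom = { Ident | Number }` as a choice rule. The `Span` struct here is a dependency-free stand-in for `pest::Span`, and the exact form of the enum variants (holding references) is an assumption based on the description in this README, not the literal generated output.

```rust
// Stand-in for pest::Span so the sketch compiles without dependencies.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Span<'i> {
    input: &'i str,
    start: usize,
    end: usize,
}

impl<'i> Span<'i> {
    pub fn new(input: &'i str, start: usize, end: usize) -> Self {
        Span { input, start, end }
    }

    /// The slice of the input covered by this span.
    pub fn as_str(&self) -> &'i str {
        &self.input[self.start..self.end]
    }
}

// Atomic rules: tuple structs containing only a span.
pub struct Ident<'i>(pub Span<'i>);
pub struct Number<'i>(pub Span<'i>);

// Choice rule `Atom = { Ident | Number }`: one variant per named alternative.
// Holding the alternatives by reference is an assumption of this sketch.
pub enum Atom<'i> {
    Ident(&'i Ident<'i>),
    Number(&'i Number<'i>),
}

impl<'i> Atom<'i> {
    /// Span accessor, forwarding to whichever alternative matched.
    pub fn span(&self) -> Span<'i> {
        match self {
            Atom::Ident(id) => id.0,
            Atom::Number(n) => n.0,
        }
    }
}

/// Examining the AST is an ordinary `match` on the enum.
pub fn describe(atom: &Atom<'_>) -> String {
    match atom {
        Atom::Ident(id) => format!("identifier `{}`", id.0.as_str()),
        Atom::Number(n) => format!("number `{}`", n.0.as_str()),
    }
}
```

Because every alternative is a named rule, each variant has an obvious name and payload type, which is presumably why the restriction on the choice operator exists.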

Other rules become ordinary structs with fields named according to the types of their children. All children of the same type are accumulated into one field; depending on cardinality, this may be a value, an Option, or a Vec. Within a field, children appear in parse order, but the relative order of children of different types is lost. (For many rules this causes no ambiguity, but consider e.g. { A* ~ B ~ A* }: the struct does not record which A children came before B. The original order can still be recovered by comparing spans.)
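To make the cardinality mapping and the ordering caveat concrete, here is a hand-written sketch for a hypothetical rule `Row = { Tag? ~ A* ~ B ~ A* }`. The field names, the private `span` field, and the `Span` stand-in (used in place of `pest::Span` so the code has no dependencies) are all assumptions for illustration. The helper `children_in_source_order` is not part of pestle; it only shows how span offsets could be used to recover the interleaving of `A` and `B` children.

```rust
// Stand-in for pest::Span so the sketch compiles without dependencies.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Span<'i> {
    input: &'i str,
    start: usize,
    end: usize,
}

impl<'i> Span<'i> {
    pub fn new(input: &'i str, start: usize, end: usize) -> Self {
        Span { input, start, end }
    }
    pub fn start(&self) -> usize {
        self.start
    }
    pub fn as_str(&self) -> &'i str {
        &self.input[self.start..self.end]
    }
}

// Atomic rules: tuple structs containing only a span.
pub struct Tag<'i>(pub Span<'i>);
pub struct A<'i>(pub Span<'i>);
pub struct B<'i>(pub Span<'i>);

// Row = { Tag? ~ A* ~ B ~ A* }
pub struct Row<'i> {
    span: Span<'i>,
    pub tag: Option<&'i Tag<'i>>, // zero or one  -> Option
    pub a: Vec<&'i A<'i>>,        // any number   -> Vec (both A* groups merged)
    pub b: &'i B<'i>,             // exactly one  -> plain value
}

impl<'i> Row<'i> {
    pub fn span(&self) -> Span<'i> {
        self.span
    }
}

/// Hypothetical helper: recover the source order of the A and B children
/// by sorting their spans on start offset.
pub fn children_in_source_order<'i>(row: &Row<'i>) -> Vec<Span<'i>> {
    let mut spans: Vec<Span<'i>> = row.a.iter().map(|a| a.0).collect();
    spans.push(row.b.0);
    spans.sort_by_key(|s| s.start());
    spans
}
```

Sorting by start offset works here because sibling spans in a parse tree do not overlap.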

All children are stored as references; this allows representing recursive or mutually recursive rules. Since Span already infects all the types with a lifetime parameter (representing the input string), the child references are given the same lifetime, and an allocator is required during construction.
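The sketch below illustrates why references plus an allocator make mutually recursive types workable. It uses a hypothetical pair of rules, `Item = { List | Word }` and `List = { Item* }`, with a plain `&str` standing in for `Span`. The `alloc` function is a deliberately crude stand-in for an arena: it leaks each value with `Box::leak`, whereas a real arena would free all of its allocations together. The `nested` function is a hand-written substitute for a generated `build()`, not pestle's actual code.

```rust
// Crude stand-in for an arena allocator: leak the value so the returned
// reference can live as long as the input. A real arena frees everything
// at once when it is dropped; this one never frees.
fn alloc<'i, T: 'i>(value: T) -> &'i T {
    Box::leak(Box::new(value))
}

// Atomic rule; `&'i str` stands in for the span.
pub struct Word<'i>(pub &'i str);

// Item = { List | Word }: refers to List ...
pub enum Item<'i> {
    List(&'i List<'i>),
    Word(&'i Word<'i>),
}

// List = { Item* }: ... which refers back to Item. Because the children are
// references, neither type needs to contain the other by value.
pub struct List<'i> {
    pub item: Vec<&'i Item<'i>>,
}

/// Hand-written substitute for a generated build(): turns ["a", "b", "c"]
/// into the nested structure (a (b (c))). Children are created inside the
/// function, so they must be placed in the allocator to outlive the call.
pub fn nested<'i>(words: &[&'i str]) -> &'i List<'i> {
    let mut item = Vec::new();
    if let Some((first, rest)) = words.split_first() {
        item.push(alloc(Item::Word(alloc(Word(*first)))));
        if !rest.is_empty() {
            item.push(alloc(Item::List(nested(rest))));
        }
    }
    alloc(List { item })
}

/// Nesting depth of a list, computed by walking the references.
pub fn depth(list: &List<'_>) -> usize {
    let deepest_child = list
        .item
        .iter()
        .map(|item| match item {
            Item::List(inner) => depth(inner),
            Item::Word(_) => 0,
        })
        .max()
        .unwrap_or(0);
    1 + deepest_child
}
```

Every reference shares the single lifetime `'i`, which mirrors the design described above: one lifetime parameter covers both the input string and the allocated nodes.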

The generated types, in addition to their public fields, have a span() accessor, and a static function build() which can construct them from a Pair and an allocator.

Example

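use bumpalo::Bump; // arena allocator (assumed to be bumpalo, as the `Bump` below suggests)
use pest::Parser as _; // brings the `parse` trait method into scope

#[derive(pest_derive::Parser, pestle::TypedRules)]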
#[grammar = "src/expr.pest"]
#[typed_mod = "ast"]
pub struct Parser;

pub fn parse<'i>(source: &'i str, arena: &'i Bump) -> &'i ast::Expr<'i> {
    let pair = Parser::parse(Rule::Expr, source).unwrap().next().unwrap();
    ast::Expr::build(pair, arena)
}

Why

Written out of frustration after examining all the alternatives, implementing a complete system atop pest_typed_derive, and discovering it adds quadratic overhead to parsing.

Dependencies

~2–2.7MB
~54K SLoC