10 stable releases

1.0.9 Dec 19, 2022
1.0.8 Jul 1, 2021
1.0.7 Jun 16, 2020
1.0.6 Aug 18, 2019
1.0.0 Jun 18, 2018

#2274 in Rust patterns

162,583 downloads per month
Used in 317 crates (8 directly)

MIT/Apache

105KB
2.5K SLoC

Token Tree Calling Convention

This library is an attempt at seeding an ecosystem of high-quality modular interoperable tt-muncher building blocks.

Tt-munching is a powerful technique for parsing macro_rules input grammars of significant complexity. In building more and more sophisticated tt-muncher macros it becomes valuable to share code for dealing with certain common input patterns, rather than reimplementing support for those patterns in a low quality and poorly tested way each time.

The core macros provided by this library are tt_call! and tt_return!. Together these provide a flexible way to propagate input and output tokens along a recursive descent call hierarchy. One may also view them as a flexible library-only stable implementation of eager expansion for macro_rules.

[dependencies]
tt-call = "1.0"

Version requirement: tt-call requires a Rust compiler version 1.31 or newer.

Calling convention rules

  • Macros that conform to tt-call must be invoked with curly braces.

    some_macro! {
        /* ... */
    }
    

    The Rust grammar is very particular about punctuation after parenthesized and square bracketed macro invocations. In expression or type position they must not be followed by a semicolon. In item or statement position they are required to be followed by a semicolon. The inconsistency is applied transitively to any helper macros they forward to, and means that parenthesized and square bracketed macro invocations must decide whether to support expression and type position only or item and statement position only. They cannot support both, which is a problem for broadly applicable macro building blocks.

    There is no such punctuation requirement after curly brace invocations. Consistently using curly braces makes the same macro building blocks usable in any syntactic position.

  • Input and output values must be passed in the following key-value form.

    $key:ident = [{ $($value:tt)* }]
    

    This is enforced by the tt_call! and tt_return! macros. The consistency is important for composability and makes it possible to write higher-order macros that operate on the input or output of an arbitrary tt-call macro.

    Except in libraries intended specifically as tt-call building blocks, generally tt-call macros will be private #[doc(hidden)] helpers with a user-facing non-tt-call entry point. Thus the rigid key-value syntax need not be exposed to users of the public macro.

  • Before its key-value inputs, every rule must accept a $caller:tt.

    This is an opaque tt bundle used by tt_call! and tt_return! to record the call hierarchy. A tt_return! accepts a $caller to return back to.

  • Every rule must expand to exactly one macro invocation and nothing else.

    Output tokens are expected to be returned through tt_return!. Expanding to nothing, expanding to more than one macro invocation, or expanding to anything other than a macro invocation are not permitted.
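
As a rough illustration of how the four rules fit together, here is a minimal self-contained sketch that does not depend on this crate. The names mini_call!, mini_return!, is_plus!, unwrap_is! and define_const! are all hypothetical, and the `then = [{ ... }]` key is an invention of this sketch. The real tt_call! and tt_return! track a full call hierarchy in $caller, whereas this stand-in only carries the name of a single continuation macro.

```rust
// Hypothetical, heavily simplified stand-ins for tt_call! and tt_return!.
// Here the caller bundle is just the parenthesized name of one continuation
// macro; it is not how the real crate represents the call hierarchy.
macro_rules! mini_call {
    {
        macro = [{ $callee:ident }]
        then = [{ $continuation:ident }]
        $($input:tt)*
    } => {
        $callee! {
            ($continuation)
            $($input)*
        }
    };
}

macro_rules! mini_return {
    {
        ($continuation:ident)
        $($output:tt)*
    } => {
        $continuation! {
            $($output)*
        }
    };
}

// Hypothetical predicate written in the style of the convention: curly
// braces, a leading $caller:tt, key-value input, and an expansion that is
// exactly one macro invocation.
macro_rules! is_plus {
    {
        $caller:tt
        input = [{ + }]
    } => {
        mini_return! {
            $caller
            is = [{ true }]
        }
    };
    {
        $caller:tt
        input = [{ $other:tt }]
    } => {
        mini_return! {
            $caller
            is = [{ false }]
        }
    };
}

// Continuations that consume the key-value output.
macro_rules! unwrap_is {
    { is = [{ $value:tt }] } => { $value };
}

macro_rules! define_const {
    { is = [{ $value:tt }] } => { const IS_PLUS: bool = $value; };
}

// Item position: no trailing semicolon is needed after a curly brace call.
mini_call! {
    macro = [{ is_plus }]
    then = [{ define_const }]
    input = [{ + }]
}

fn main() {
    // Expression position: the same building blocks, unchanged.
    let minus_is_plus: bool = mini_call! {
        macro = [{ is_plus }]
        then = [{ unwrap_is }]
        input = [{ - }]
    };
    println!("{} {}", IS_PLUS, minus_is_plus); // prints "true false"
}
```

Because every step is a curly brace invocation that expands to exactly one further invocation, the same is_plus! works in both item and expression position; only the final continuation decides what kind of syntax is produced.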

Examples

To give a flavor of the syntax, here is one of the rules from the implementation of the built-in tt_replace! macro. The macro takes in a token stream and, for each token that matches a given predicate, replaces that token with a given replacement sequence of tokens. For example, the caller may want to replace the token self with the single token __value.

The rule shown here is responsible for performing one step of the replacement. It matches one token of input in $first:tt, uses tt_if! to invoke the predicate with $first as input, recurses with an accumulated copy of the replacement tokens if the predicate returns true, and recurses on the remaining tokens with $first preserved unchanged if the predicate returns false.

{
    $caller:tt
    condition = [{ $condition:ident }]
    replace_with = [{ $($with:tt)* }]
    tokens = [{ $($tokens:tt)* }]
    rest = [{ $first:tt $($rest:tt)* }]
} => {
    tt_if! {
        condition = [{ $condition }]
        input = [{ $first }]
        true = [{
            private_replace! {
                $caller
                condition = [{ $condition }]
                replace_with = [{ $($with)* }]
                tokens = [{ $($tokens)* $($with)* }]
                rest = [{ $($rest)* }]
            }
        }]
        false = [{
            private_replace! {
                $caller
                condition = [{ $condition }]
                replace_with = [{ $($with)* }]
                tokens = [{ $($tokens)* $first }]
                rest = [{ $($rest)* }]
            }
        }]
    }
};

Here is another macro rule selected from tt_replace!. This one matches if the tt-muncher has reached the end of its input. It returns the finished tokens back to the caller using tt_return!.

{
    $caller:tt
    condition = [{ $condition:ident }]
    replace_with = [{ $($with:tt)* }]
    tokens = [{ $($tokens:tt)* }]
    rest = [{ }]
} => {
    tt_return! {
        $caller
        tokens = [{ $($tokens)* }]
    }
};
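
For comparison, the accumulate-and-recurse shape of these two rules can be sketched in plain macro_rules with the predicate hard-coded. This is not how tt_replace! is implemented; replace_self_sketch! and its internal @munch rules are hypothetical names, and the sketch only inspects top-level tokens. The three bracketed groups play the roles of replace_with, tokens and rest.

```rust
// Hypothetical tt-muncher with a hard-coded predicate ("is the token `self`?").
// Internal rules have the form: @munch [replacement] [tokens so far] [rest].
macro_rules! replace_self_sketch {
    // End of input: emit the accumulated tokens.
    (@munch [$with:tt] [$($done:tt)*] []) => {
        $($done)*
    };
    // Next token is `self`: append the replacement instead.
    (@munch [$with:tt] [$($done:tt)*] [self $($rest:tt)*]) => {
        replace_self_sketch!(@munch [$with] [$($done)* $with] [$($rest)*])
    };
    // Any other token: keep it unchanged.
    (@munch [$with:tt] [$($done:tt)*] [$first:tt $($rest:tt)*]) => {
        replace_self_sketch!(@munch [$with] [$($done)* $first] [$($rest)*])
    };
    // Entry point. The replacement identifier is written in the same rule as
    // the closure parameter so that both refer to the same binding.
    ($($input:tt)*) => {
        |__value| replace_self_sketch!(@munch [__value] [] [$($input)*])
    };
}

fn main() {
    let add_one = replace_self_sketch!(self + 1);
    println!("{}", add_one(1)); // prints 2
}
```

The sketch works only because the predicate is baked into the second rule. Swapping in a caller-provided predicate is the part that needs a calling convention, since the muncher has to see the predicate's result before choosing which branch to take.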

One example of a caller-provided predicate for tt_replace! could be written as follows. This predicate determines whether the input token is lowercase self.

macro_rules! is_lowercase_self {
    // Input token is `self`.
    {
        $caller:tt
        input = [{ self }]
    } => {
        tt_return! {
            $caller
            is = [{ true }]
        }
    };

    // Input token is anything other than `self`.
    {
        $caller:tt
        input = [{ $other:tt }]
    } => {
        tt_return! {
            $caller
            is = [{ false }]
        }
    };
}

From here, calling tt_replace! with our is_lowercase_self! as the condition predicate can be used to implement a fanciful syntax for unary closures: closure!(self + 1) should expand to |__value| __value + 1.

Notice that this user-facing closure! macro does not follow the tt-call calling convention. Internally though it uses several tt-call helpers as building blocks.

macro_rules! closure {
    ($($expr:tt)+) => {
        |__value| tt_call! {
            macro = [{ tt_replace }]
            condition = [{ is_lowercase_self }]
            replace_with = [{ __value }]
            input = [{ $($expr)+ }]
        }
    };
}

fn main() {
    let add_one = closure!(self + 1);
    println!("{}", add_one(1));
}

Motivation

This may seem like a lot of ceremony around what should be very simple macro calls. After all, couldn't we write is_lowercase_self in a much more straightforward way as follows?

macro_rules! is_lowercase_self {
    (self) => { true };
    ($other:tt) => { false };
}

fn main() {
    println!("{}", is_lowercase_self!(self)); // true
    println!("{}", is_lowercase_self!(not_self)); // false
}

The answer is a qualified yes. As written, the simpler is_lowercase_self! behaves as it looks like it should.

But suppose we want to build tt_replace! or a similar macro that needs to invoke is_lowercase_self! as a helper. There is no way to do it with this simpler one. No matter what our macro does, there is no way for it to expand is_lowercase_self! before expanding itself. If it expands itself first, there is no way for it to use the expansion of is_lowercase_self! to decide whether the current token is supposed to be replaced.
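
The limitation can be observed directly with a small experiment. The describe! macro below is a hypothetical helper written for this illustration: it tries to branch on the result of the simple is_lowercase_self!, but what it receives is the unexpanded invocation rather than true or false.

```rust
macro_rules! is_lowercase_self {
    (self) => { true };
    ($other:tt) => { false };
}

// Hypothetical outer macro that would like to branch on a helper's result.
macro_rules! describe {
    (true) => { "matched true" };
    (false) => { "matched false" };
    // Reached when the input is neither the token `true` nor `false`.
    ($($other:tt)*) => {
        concat!("saw unexpanded tokens: ", stringify!($($other)*))
    };
}

fn main() {
    // Used directly, the helper behaves as expected.
    println!("{}", is_lowercase_self!(self));

    // Passed to another macro, the helper is not expanded first: describe!
    // receives the tokens of the invocation itself and falls through to its
    // last rule.
    println!("{}", describe!(is_lowercase_self!(self)));
}
```

The outer macro is matched against its argument tokens before anything inside them is expanded, so the first two rules of describe! never get a chance to fire here.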

The tt_call! and tt_return! abstraction along with $caller:tt tracking of the call hierarchy are critical to building composable macros that freely pass around arbitrary tokens and return in a way that can inform expansion of their caller.

A future eager expansion feature for declarative macros may render the tt-call approach unnecessary. Eager expansion is listed as an unresolved question in the tracking issue for declarative macros 2.0 but is believed to be quite a ways out, if it ever happens. And even then it is not clear whether it is desirable to allow macros expanding to arbitrary tokens. Today macros always expand to an expression, item, statement, type, or pattern. Eager expansion does not automatically mean that the restriction would be lifted to allow a macro that expands to arbitrary tokens such as ! @ #. The token tree calling convention provides working eager expansion today with support for passing and returning arbitrary token streams.
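
To make the point about arbitrary tokens concrete, here is a small sketch that uses only plain macro_rules and hypothetical names (punctuation_tokens!, count_tts!). A macro cannot usefully expand to the bare tokens ! @ # on its own, but it can hand them to a continuation macro named by the caller, which is the same idea that tt_return! generalizes.

```rust
// Hypothetical producer: passes the tokens `! @ #` to whichever macro the
// caller names, instead of trying to expand to them directly.
macro_rules! punctuation_tokens {
    { $continuation:ident } => {
        $continuation! { ! @ # }
    };
}

// Hypothetical consumer: counts the token trees it is given.
macro_rules! count_tts {
    () => { 0usize };
    ($first:tt $($rest:tt)*) => { 1usize + count_tts!($($rest)*) };
}

fn main() {
    let n = punctuation_tokens! { count_tts };
    println!("{}", n); // prints 3
}
```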

What about function-like procedural macros? Whether a procedural macro is a better choice depends on your choice of syntax for the macro input, but note that they present their own DIY parsing adventures and can be even nastier than tt-call once you get the hang of both. In addition, procedural macros must be defined in a separate crate from the rest of your library, so they are not well suited for quick one-off helper macros.

Design philosophy

As may be no surprise by this point, the calling convention design prioritizes scalability and composability over conciseness. A reader familiar with the calling convention (maybe you, six months after writing the macro) should be able to look at any individual tt-call rule by itself and comfortably read off what it does top to bottom and identify its purpose.

  • The code that implements closure!(self + 1), all of which is shown above, can be found all together in examples/replace.rs.

  • As a more elaborate example of a tt-call macro, examples/comma_separated.rs demonstrates a macro that does primitive name mangling of Rust types. It uses parse_type! which is a tt-call version of $:ty.

    static MANGLED: &[&str] = mangle_type_names! {
        std::fs::File,
        &'a mut str,
        impl Display,
        fn(s: &str) -> String,
    };
    
    fn main() {
        assert_eq!(MANGLED, [
            "_std_fs_File",
            "_ref_mut_str",
            "_impl_Display",
            "_fn_s_ref_str_to_String"
        ]);
    }
    

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

No runtime deps