4 releases

0.2.0 Sep 25, 2023
0.1.2 Sep 18, 2023
0.1.1 Sep 13, 2023
0.1.0 Sep 12, 2023

#828 in Text processing

Download history 34/week @ 2024-02-18 36/week @ 2024-02-25 104/week @ 2024-03-03 26/week @ 2024-03-10 1/week @ 2024-03-17 39/week @ 2024-03-24 63/week @ 2024-03-31

135 downloads per month

MIT/Apache

26KB
460 lines

trie_match! {}

Crates.io Documentation Rust Build Status Slack

This macro speeds up Rust's match expression for comparing strings by using a compact double-array data structure.

Usage

Simply wrap the existing match expression with the trie_match! {} macro as follows:

use trie_match::trie_match;

let x = "abd";

let result = trie_match! {
    match x {
        "a" => 0,
        "abc" => 1,
        pat @ ("abd" | "bcc") => pat.bytes()[0],
        "bc" => 3,
        _ => 4,
    }
};

assert_eq!(result, 3);

Why is it faster?

In a normal match expression, the string is compared for each pattern. It is equivalent to the following code:

if x == "a" {
    0
} else if x == "abc" {
    1
} else if x == "abd" || x == "bcc" {
    x.bytes()[0]
} else if x == "bc" {
    3
} else {
    4
}

The above code requires that string comparisons be made from the beginning of the string each time. The time complexity of the above code is O(mn), where m is the average pattern length, and n is the number of patterns.

In contrast, this macro builds the following trie structure to retrieve the index of the matched arm:

Trie

Furthermore, this implementation uses the compact double-array data structure to achieve efficient state-to-state traversal, and the time complexity becomes O(m).

cfg attribute

Only when using Nightly Rust, this macro supports conditional compilation with the cfg attribute. To use this feature, enable features = ["cfg_attribute"] in your Cargo.toml.

Example

trie_match! {
    match x {
        #[cfg(feature = "foo")]
        "a" => { .. }
        #[cfg(feature = "bar")]
        "b" => { .. }
        _ => { .. }
    }
}

Limitations

The followings are different from the normal match expression:

  • Only supports strings, byte strings, and u8 slices as patterns.
  • The wildcard is evaluated last. (The normal match expression does not match patterns after the wildcard.)
  • Guards are unavailable.

Sometimes the normal match expression is faster, depending on how optimization is performed, so it is better to choose based on your speed experiments.

Benchmark

Run the following command:

cargo bench

Experimental results are as follows [μs]:

  • AMD Ryzen 7 5700U with Radeon Graphics

    Bench name Normal match phf crate trie-match crate
    100 words random 1.94 2.02 1.09
    HTML elements random 2.32 2.43 0.55
  • 12th Gen Intel(R) Core(TM) i7-1270P

    Bench name Normal match phf crate trie-match crate
    100 words random 1.13 1.29 0.61
    HTML elements random 1.24 1.51 0.36

phf crate: Compile time static maps using perfect hash functions.

License

Licensed under either of

at your option.

Contribution

See the guidelines.

Dependencies

~335–790KB
~19K SLoC