2 unstable releases
new 0.2.0 | Apr 4, 2025 |
---|---|
0.1.0 | Apr 4, 2025 |
#667 in Text processing
26 downloads per month
46KB
994 lines
Rusty Regex
Rusty Regex is a minimalistic regex engine written in Rust. It compiles a regex pattern into an Abstract Syntax Tree (AST), then builds a Non-deterministic Finite Automaton (NFA), converts that NFA into a Deterministic Finite Automaton (DFA) via subset construction, and finally uses the DFA to match input strings.
Supported Features
- Literals & Wildcards
- Literal characters (e.g.,
"abc"
) - Wildcard
.
matching any single character
- Literal characters (e.g.,
- Character Classes
- Explicit classes: e.g.,
[abc]
- Ranges: e.g.,
[a-c]
- Explicit classes: e.g.,
- Predefined Classes
\d
for digits\w
for word characters (letters, digits, underscore)
- Quantifiers
- Zero or more (
*
) - One or more (
+
) - Zero or one (
?
) - Exact count (
{n}
) - Range (
{m,n}
or{m,}
)
- Zero or more (
- Grouping & Capturing
- Parentheses for grouping (all groups are capturing by default)
- Alternation
- The pipe operator (
|
) for alternatives
- The pipe operator (
- Anchors
- Start (
^
) and end ($
) anchors to match input positions
- Start (
Design Overview
Rusty Regex works in several phases:
- Parsing: Converts the regex string into an AST.
- NFA Construction: Transforms the AST into an NFA.
- DFA Conversion: Converts the NFA into a DFA using subset construction.
- Matching: Simulates the DFA over input strings (scanning for matches).
Example Usage
Below is a simple example demonstrating how to compile a regex and test input strings:
use rusty_regex::Regex;
fn main() {
// Compile a regex pattern.
let regex = Regex::new("^(a|b)*c$").unwrap();
// Test various inputs.
let tests = vec![
("aaabc", true),
("bbc", true),
("c", true),
("aaab", false),
];
for (input, expected) in tests {
let result = regex.is_match(input);
println!("Input: {:?}, Match: {}, Expected: {}", input, result, expected);
}
}