2 unstable releases
Uses old Rust 2015
0.2.0 | Jan 23, 2018 |
---|---|
0.1.0 | Jan 21, 2018 |
#21 in #matcher
11KB
201 lines
nanore
nanore is a tiny (≃ 0.2K SLOC) regular expression matcher for arbitrary data types written in Rust.
- O(|regexp||sequence|) time and O(|regexp|) space. No exponential blowup.
- Support online matching.
- Matching states can be copied at any time.
- Support regex weighting.
- Support path marking.
Example
Basics
[source, rust]
use nanore::*;
let re = RegExRoot::<_, ()>::new(
atom(|_, e| *e % 2 != 0) * atom(|_, e| *e % 2 == 0) + rep(atom(|_, e| *e > 0))
);
let mut m = Matcher::new(&re);
assert!(m.is_match());
m.feed(&1);
assert!(m.is_match());
m.feed(&2);
assert!(m.is_match());
m.feed(&3);
assert!(m.is_match());
m.feed(&0);
assert!(!m.is_match());
Constructs: *
(concatenation), +
(alternation), rep()
, opt()
, eps()
,
atom()
, any()
, val()
, weight()
, mark()
.
Mark & Path
[source, rust]
#[derive(Clone, Copy, PartialEq, Eq)]
enum Marker { Foo, Bar }
let re = RegExRoot::new(
rep(mark(Marker::Foo) * val('a') + mark(Marker::Bar) * val('b'))
);
let mut m = Matcher::new(&re);
m.feed(&'a');
m.feed(&'b');
m.feed(&'a');
m.feed(&'b');
assert!(m.is_match());
assert!(m.path() == [(0, Marker::Foo), (1, Marker::Bar), (2, Marker::Foo), (3, Marker::Bar)]);
Weight
[source, rust]
let re = RegExRoot::new(
rep(mark(Marker::Foo) * val('a')) * rep(weight(-1) * mark(Marker::Bar) * val('a'))
);
let mut m = Matcher::new(&re);
m.feed(&'a');
m.feed(&'a');
m.feed(&'a');
m.feed(&'a');
assert!(m.is_match());
assert!(m.path() == [(0, Marker::Bar), (1, Marker::Bar), (2, Marker::Bar), (3, Marker::Bar)]);
Application: Find longest fibonacci sequence
[source, rust]
#[derive(Clone, Copy, PartialEq, Eq)]
enum Marker { Bgn, End }
let xs = [1, 1, 2, 3, 5, 3, 2, 3, 5, 8, 13, 21, 34];
// ^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^
let re = RegExRoot::new(
rep(weight(1) * any()) * mark(Marker::Bgn) *
any() * any() * rep(atom(|i, x| *x == xs[i - 2] + xs[i - 1])) *
mark(Marker::End) * rep(weight(1) * any())
);
let mut m = Matcher::new(&re);
m.feed_iter(&xs);
assert!(m.path() == [(6, Marker::Bgn), (13, Marker::End)]);
Reference
Weighted RegExp Matching::
nanore uses (the subset of) the method in this paper. Note that the idea
is simple and elegant, but there are some non-trivial parts due to
ε-transitions in implicitly generated ε-NFA (see empty
, final
and
shift
). nanore handles normal transitions and ε-transitions separately,
which seems a bit different from this paper (see shift()
and
propagate()
in nanore).
関数型的正規表現マッチ::
The excellent article about the paper above, in Japanese.