#regex #hyperscan #streaming

hyperscan

Hyperscan bindings for Rust with Multiple Pattern and Streaming Scan

10 releases

0.2.2 Jun 8, 2021
0.2.1 Mar 23, 2021
0.2.0 Jun 10, 2020
0.1.8 May 21, 2019
0.1.1 Nov 1, 2016

#94 in Text processing


2,660 downloads per month
Used in grep-hyperscan

Apache-2.0

305KB
6K SLoC

rust-hyperscan

Hyperscan is a high-performance regular expression matching library.

Usage

To use the crate, add the following line to Cargo.toml under [dependencies]:

hyperscan = "0.2"

Examples

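use hyperscan::prelude::*;

fn main() {
    // Compile one caseless pattern that also reports where each match starts.
    let pattern = pattern! {"test"; CASELESS | SOM_LEFTMOST};
    let db: BlockDatabase = pattern.build().unwrap();
    // Scratch space is required by every scan call.
    let scratch = db.alloc_scratch().unwrap();
    let mut matches = vec![];

    // The callback runs once per match; `from..to` is a half-open byte range.
    db.scan("some test data", &scratch, |id, from, to, _flags| {
        println!("found pattern #{} @ [{}, {})", id, from, to);

        matches.push(from..to);

        // Keep scanning for further matches.
        Matching::Continue
    }).unwrap();

    assert_eq!(matches, vec![5..9]);
}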
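The offsets reported to the callback can be read without installing Hyperscan. The following is a minimal sketch using only the standard library, not the crate itself: `find_caseless` is a hypothetical helper that performs a naive ASCII case-insensitive search and returns each hit as a half-open byte range, which is how the `from` and `to` values in the example above are interpreted.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of the hyperscan crate): returns every
/// ASCII case-insensitive occurrence of `needle` in `haystack` as a
/// half-open byte range `start..end`.
fn find_caseless(haystack: &str, needle: &str) -> Vec<Range<usize>> {
    let (hay, pat) = (haystack.as_bytes(), needle.as_bytes());
    // `windows` panics on a zero length, so handle the degenerate cases first.
    if pat.is_empty() || pat.len() > hay.len() {
        return Vec::new();
    }
    hay.windows(pat.len())
        .enumerate()
        .filter(|(_, window)| window.eq_ignore_ascii_case(pat))
        .map(|(start, _)| start..start + pat.len())
        .collect()
}

fn main() {
    // "test" occupies bytes 5, 6, 7 and 8, so the half-open range is 5..9.
    assert_eq!(find_caseless("some test data", "test"), vec![5..9]);
    // Case is ignored, mirroring the intent of the CASELESS flag.
    assert_eq!(find_caseless("Test TEST", "test"), vec![0..4, 5..9]);
}
```

This naive scan only illustrates the offset convention; it says nothing about how Hyperscan matches internally.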

Features

Hyperscan v5 API

Hyperscan v5.0 introduced several new APIs and flags.

rust-hyperscan uses the latest version of the API by default, providing new features such as Literal.

If you want to work with Hyperscan v4.x, you can disable the v5 feature at compile time.

[dependencies.hyperscan]
version = "0.2"
default-features = false
features = ["full"]

Chimera API

To improve regular expression compatibility, Hyperscan v5.0 introduced the PCRE-compatible Chimera library.

To enable Chimera support, you need to manually download PCRE 8.41 or above, extract it into the source directory of Hyperscan 5.x, and then compile and install Hyperscan.

$ cd hyperscan-5.3.0
$ wget https://ftp.pcre.org/pub/pcre/pcre-8.44.tar.gz
$ mkdir pcre
$ tar xvf pcre-8.44.tar.gz --strip-components=1 --directory pcre

$ mkdir build && cd build
$ cmake .. -DCMAKE_INSTALL_PREFIX=`pwd`
$ make

Then set the HYPERSCAN_ROOT environment variable to the Hyperscan installation directory so that the chimera feature can be built.

$ HYPERSCAN_ROOT=<CMAKE_INSTALL_PREFIX> cargo build

The chimera feature must also be enabled in Cargo.toml:

[dependencies]
hyperscan = { version = "0.2", features = ["chimera"] }

Note: The Chimera library does not support dynamic linking, so the static feature is automatically enabled when chimera is enabled.
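As an illustration of how an install prefix such as HYPERSCAN_ROOT might be consumed, here is a minimal sketch of a helper that derives header and library directories from it. The `HyperscanPaths` type, the `paths_from_root` function, and the `include/hs` and `lib` subdirectories are assumptions made for this example; they are not taken from the crate's actual build script.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical description of an install prefix. The subdirectory names
/// are assumptions, not read from the crate's build script.
struct HyperscanPaths {
    include_dir: PathBuf,
    lib_dir: PathBuf,
}

/// Derives the assumed header and library directories from a prefix.
fn paths_from_root(root: &Path) -> HyperscanPaths {
    HyperscanPaths {
        include_dir: root.join("include").join("hs"),
        lib_dir: root.join("lib"),
    }
}

fn main() {
    // Read the prefix from the environment, as the cargo invocation above sets it.
    match env::var_os("HYPERSCAN_ROOT") {
        Some(root) => {
            let paths = paths_from_root(Path::new(&root));
            println!("headers:   {}", paths.include_dir.display());
            println!("libraries: {}", paths.lib_dir.display());
        }
        None => println!("HYPERSCAN_ROOT is not set"),
    }
}
```

Running this with HYPERSCAN_ROOT set to the CMake install prefix is a quick way to check that the variable reaches the build environment.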

Static Linking Mode

As of version 0.2, rust-hyperscan links dynamically by default. If you need to link against a static library, you can use the static feature.

[dependencies]
hyperscan = { version = "0.2", features = ["static"] }

Hyperscan Runtime

Hyperscan provides a standalone runtime library, which can be used separately. If you don't need to compile regular expressions at runtime, runtime mode reduces the size of the executable and removes the C++ dependencies.

[dependencies.hyperscan]
version = "0.2"
default-features = false
features = ["runtime"]

Dependencies

~0.6–1MB
~24K SLoC