#regex #api #api-bindings #safe-bindings #tre

tre-regex

Rust safe bindings to the TRE regex module

6 releases (3 breaking)

0.4.1 Mar 15, 2024
0.4.0 Mar 15, 2024
0.3.0 Jun 17, 2023
0.2.1 Jun 16, 2023
0.1.1 Jun 15, 2023


BSD-2-Clause

79KB
938 lines

tre-regex

Safe API bindings to the TRE regex engine.

Documentation is available at docs.rs.

The crate is expected to build on Rust 1.70.0 and later. Please report any version at or above that where it does not.

Features

  • wchar: enable wide-character (wchar) support. The safe bindings do not expose it yet, but the feature is passed through to tre-regex-sys. Enabled by default.
  • approx: enable approximate matching support. Enabled by default.
  • vendored: use the vendored copy of TRE with tre-regex-sys; otherwise use the system TRE. Enabled by default.

lib.rs:

These are safe bindings to the tre_regex_sys module.

These bindings aim to provide as idiomatic a Rust API to the TRE library as possible. Most of the TRE API is supported. The exception is TRE's reguexec, which is tricky to implement in a safe wrapper, although it should be fairly simple to use yourself.

This library returns matches as Rust std::borrow::Cow strings, which allows zero-copy access to regex matches.
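As a rough illustration of what zero-copy means with Cow, the sketch below uses only the standard library and does not call tre-regex at all. The helper names decode and is_borrowed are hypothetical, and String::from_utf8_lossy is used only as a familiar example of a function that returns a Cow; how tre-regex builds its Cow values internally is not shown here.

```rust
use std::borrow::Cow;

/// Hypothetical helper: decode bytes as UTF-8, borrowing when possible.
/// Valid UTF-8 input is returned as Cow::Borrowed (no allocation);
/// invalid input is repaired into a newly allocated Cow::Owned string.
fn decode(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

/// Hypothetical helper: true when the Cow still points at the original data.
fn is_borrowed(s: &Cow<'_, str>) -> bool {
    matches!(s, Cow::Borrowed(_))
}
```

A caller can treat either variant as a &str through deref, and only pays for an allocation in the case where one is actually needed.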

Examples

Two APIs are presented: the function API and the object API. Which one you use is up to you; the function API is implemented as a thin wrapper around the object API.
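The relationship between the two styles can be pictured with a toy sketch. Everything in it is hypothetical: Pattern, pattern_compile and pattern_find are stand-ins invented for illustration, the matching is a plain substring search, and none of it is taken from the tre-regex source. It only shows the general shape of free functions that forward to methods on an object.

```rust
/// Hypothetical stand-in for a compiled pattern (not a tre-regex type).
struct Pattern {
    needle: String,
}

impl Pattern {
    /// Object-style constructor.
    fn new(needle: &str) -> Self {
        Pattern { needle: needle.to_string() }
    }

    /// Object-style search: returns the matched slice of the haystack.
    fn find<'a>(&self, haystack: &'a str) -> Option<&'a str> {
        haystack
            .find(self.needle.as_str())
            .map(|start| &haystack[start..start + self.needle.len()])
    }
}

/// Function-style constructor: forwards to Pattern::new.
fn pattern_compile(needle: &str) -> Pattern {
    Pattern::new(needle)
}

/// Function-style search: forwards to Pattern::find.
fn pattern_find<'a>(pattern: &Pattern, haystack: &'a str) -> Option<&'a str> {
    pattern.find(haystack)
}
```

With this layout both styles share one implementation, so choosing between them is a matter of taste rather than behavior.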

Object API

use tre_regex::{RegcompFlags, RegexecFlags, Regex};

// Compile with extended regex syntax; use no special flags when matching.
let regcomp_flags = RegcompFlags::new().add(RegcompFlags::EXTENDED);
let regexec_flags = RegexecFlags::new().add(RegexecFlags::NONE);

let compiled_reg = Regex::new("^([[:alnum:]]+)[[:space:]]*([[:alnum:]]+)$", regcomp_flags)?;
// The second argument (2) is the number of matches requested.
let matches = compiled_reg.regexec("hello world", 2, regexec_flags)?;

// Each entry is either None or Some holding a Result for that match.
for (i, matched) in matches.into_iter().enumerate() {
    match matched {
        Some(Ok(substr)) => println!("Match {i}: '{}'", substr),
        Some(Err(e)) => println!("Match {i}: <Error: {e}>"),
        None => println!("Match {i}: <None>"),
    }
}

Function API

use tre_regex::{RegcompFlags, RegexecFlags, regcomp, regexec};

// Compile with extended regex syntax; use no special flags when matching.
let regcomp_flags = RegcompFlags::new().add(RegcompFlags::EXTENDED);
let regexec_flags = RegexecFlags::new().add(RegexecFlags::NONE);

let compiled_reg = regcomp("^([[:alnum:]]+)[[:space:]]*([[:alnum:]]+)$", regcomp_flags)?;
// The compiled regex is passed explicitly; 2 is the number of matches requested.
let matches = regexec(&compiled_reg, "hello world", 2, regexec_flags)?;

// Each entry is either None or Some holding a Result for that match.
for (i, matched) in matches.into_iter().enumerate() {
    match matched {
        Some(Ok(substr)) => println!("Match {i}: '{}'", substr),
        Some(Err(e)) => println!("Match {i}: <Error: {e}>"),
        None => println!("Match {i}: <None>"),
    }
}
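Both examples end with the same reporting loop, so it may be convenient to factor it out. The following is a minimal sketch using only the standard library. The function name render_matches is hypothetical, and the item type Option<Result<Cow<str>, E>> is an assumption inferred from the shape of the loops above, not a signature taken from the tre-regex documentation.

```rust
use std::borrow::Cow;
use std::fmt::Display;

/// Hypothetical helper: turn a sequence of match entries into report lines.
/// Assumed entry shape: None, Some(Ok(text)), or Some(Err(error)).
fn render_matches<'a, E, I>(matches: I) -> Vec<String>
where
    E: Display,
    I: IntoIterator<Item = Option<Result<Cow<'a, str>, E>>>,
{
    matches
        .into_iter()
        .enumerate()
        .map(|(i, entry)| match entry {
            Some(Ok(substr)) => format!("Match {i}: '{substr}'"),
            Some(Err(e)) => format!("Match {i}: <Error: {e}>"),
            None => format!("Match {i}: <None>"),
        })
        .collect()
}
```

Returning the lines instead of printing them keeps the helper easy to test and lets the caller decide where the output goes.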

Dependencies

~0.5–2.4MB
~51K SLoC