fnmatch-regex

Convert a glob-style pattern to a regular expression

3 unstable releases

0.2.1 Oct 12, 2024
0.2.0 Jun 11, 2022
0.1.0 Jun 22, 2021

#263 in Encoding

2,173 downloads per month
Used in 4 crates (3 directly)

BSD-2-Clause

34KB
722 lines

fnmatch-regex - build regular expressions to match glob-style patterns

[Home | GitLab | crates.io | ReadTheDocs]

Overview

This crate currently provides a single function, glob_to_regex, that converts a glob-style pattern with some shell extensions to a regular expression. Note that it only handles text pattern matching; it makes no attempt to verify or construct any filesystem paths.

The glob-style pattern features currently supported are:

  • any character except ?, *, [, \, or { is matched literally

  • ? matches any single character except a slash (/)

  • * matches any sequence of zero or more characters that does not contain a slash (/)

  • a backslash allows the next character to be matched literally, except for the \a, \b, \e, \n, \r, and \v sequences

  • a [...] character class supports ranges, backslash-escaping, and negation if its very first character is !; a ] is matched literally if it is the first character of the class, or the first one after the !, so []] matches only a single ] character

  • an {a,bbb,cc} alternation supports backslash-escaping, but not nested alternations or character classes yet

Note that the * and ? wildcard patterns, as well as the character classes, will never match a slash.
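To make the wildcard rules above concrete, here is a minimal standard-library sketch of how a pattern that uses only literals, ? and * could be turned into anchored regular expression source text. The function name sketch_glob_to_regex_source is hypothetical and this is not the crate's implementation: it ignores character classes, alternations, and backslash escapes, and simply escapes those characters instead.

```rust
/// Hypothetical helper, NOT part of fnmatch-regex: translate a glob that
/// uses only literals, `?`, and `*` into anchored regex source text.
/// Character classes, alternations, and backslash escapes are out of scope;
/// their introducing characters are escaped and treated as literals here.
fn sketch_glob_to_regex_source(glob: &str) -> String {
    let mut out = String::from("^");
    for ch in glob.chars() {
        match ch {
            // `?` matches exactly one character that is not a slash.
            '?' => out.push_str("[^/]"),
            // `*` matches zero or more characters, none of them a slash.
            '*' => out.push_str("[^/]*"),
            // Escape regex metacharacters so that they match literally.
            '.' | '+' | '(' | ')' | '|' | '^' | '$' | '[' | ']' | '{' | '}' | '\\' => {
                out.push('\\');
                out.push(ch);
            }
            // Everything else is copied through unchanged.
            _ => out.push(ch),
        }
    }
    out.push('$');
    out
}
```

With this sketch, foo/test?.txt becomes ^foo/test[^/]\.txt$ and *.conf becomes ^[^/]*\.conf$, which shows why neither wildcard can ever match across a slash.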

Examples

  • abc.txt would only match abc.txt

  • foo/test?.txt would match e.g. foo/test1.txt or foo/test".txt, but not foo/test/.txt

  • /etc/c[--9].conf would match e.g. /etc/c-.conf, /etc/c..conf, or /etc/c7.conf, but not /etc/c/.conf

  • linux-[0-9]*-{generic,aws} would match linux-5.2.27b1-generic and linux-4.0.12-aws, but not linux-unsigned-5.2.27b1-generic

Note that the negation modifier for character classes is !, not ^.
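The character class rules (ranges, ! negation, and never matching a slash) can be illustrated with a small standard-library sketch. The function sketch_class_matches is hypothetical and is not how the crate works internally; it takes the text between the brackets and a candidate character, and it assumes no backslash escapes appear in the class.

```rust
/// Hypothetical helper, NOT part of fnmatch-regex: decide whether `c` is
/// matched by a glob character class whose body (the text between `[` and
/// `]`) is `body`. Backslash escapes are not handled in this sketch.
fn sketch_class_matches(body: &str, c: char) -> bool {
    // Per the notes above, a character class never matches a slash.
    if c == '/' {
        return false;
    }
    let chars: Vec<char> = body.chars().collect();
    // A leading `!` (not `^`) negates the class.
    let (negated, items) = match chars.split_first() {
        Some((&'!', rest)) => (true, rest),
        _ => (false, &chars[..]),
    };
    let mut found = false;
    let mut i = 0;
    while i < items.len() {
        if i + 2 < items.len() && items[i + 1] == '-' {
            // `lo-hi` is a range when a `-` sits between two characters.
            found |= items[i] <= c && c <= items[i + 2];
            i += 3;
        } else {
            // Otherwise the character stands for itself.
            found |= items[i] == c;
            i += 1;
        }
    }
    found != negated
}
```

For the /etc/c[--9].conf example above, the body --9 is a range from - to 9, so the sketch accepts -, ., and 7 but rejects /, even though the slash falls inside that range. A body of !a-c rejects b, accepts z, and still rejects /.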

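use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    // Compile the glob pattern into a regular expression once, then reuse it.
    let re_name = fnmatch_regex::glob_to_regex("linux-[0-9]*-{generic,aws}")?;
    for name in &[
        "linux-5.2.27b1-generic",
        "linux-4.0.12-aws",
        "linux-unsigned-5.2.27b1-generic",
    ] {
        let okay = re_name.is_match(name);
        println!("{}: {}", name, if okay { "yes" } else { "no" });
        // Only the name containing "unsigned" should fail to match.
        assert_eq!(okay, !name.contains("unsigned"));
    }
    Ok(())
}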

Contact

The fnmatch-regex library was written by Peter Pentchev. It is developed in a GitLab repository. This documentation is hosted at Ringlet with a copy at ReadTheDocs.

Dependencies

~3–4.5MB
~81K SLoC