shwild
SHell-compatible WILDcards, for Rust.
Introduction
shwild is a small, standalone library, implemented in C++ with a C and a C++ API, that provides shell-compatible wildcard matching.
shwild.Rust is a Rust port, with minimal API differences. The design emphasis is on simplicity-of-use, modularity, and performance.
let pattern = r"Where are the* [🐼🐻]s\?";
assert_eq!(Ok(false), shwild_matches!(pattern, ""));
assert_eq!(Ok(false), shwild_matches!(pattern, "Where are the bears?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are the 🐻s?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are the 🐼s?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are their 🐻s?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are the big brown 🐻s?"));
assert_eq!(Ok(false), shwild_matches!(pattern, "Where are the teddy-🐻s?"));
(See Examples section for more examples.)
Pattern Elements
The library (and other shwild variants) supports the following pattern elements:
- Literal - a non-empty string fragment, as in "Where are the", which matches the exact same string fragment in the input;
- Wild-1 - represented by the single character '?' in the pattern, which matches exactly one character of any kind. In the above example r"Where are the* [🐼🐻]s\?" the '?' is not interpreted as a wild-1 because it is escaped by the '\' character, and is instead part of the literal fragment "s?";
- Wild-N - represented by the single character '*' in the pattern, which matches any number of characters (including none);
- Range - represented by a sequence of characters within '[' and ']', as in the "[🐼🐻]" fragment in the above example, which matches any one of the range's characters in the input. As well as an unordered sequence of literal characters, ranges may also contain contiguous sequences, as in "[zc-aja]" (any of characters 'a', 'b', 'c', 'j', 'z') or in "[abm-PrZ]" (any of characters 'a', 'b', 'm', 'M', 'n', 'N', 'o', 'O', 'p', 'P', 'r', 'Z');
- Not-range - represented in the same form as a Range but where the first range character is '^' and the remaining characters represent a set of characters that cannot appear (at the requisite position) in the input.
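The two range examples in the list can be read as an expansion of the range body into a set of characters. The following is a minimal sketch of one rule that reproduces both expansions. It is not the library's implementation: expand_range is a hypothetical helper, and the treatment of a mixed-case pair such as m-P (both cases of every letter between the endpoints) is an assumption inferred from the example.

```rust
use std::collections::BTreeSet;

/// Expands the body of a range (the text between '[' and ']') into the set
/// of characters it accepts. Hypothetical helper, for illustration only.
fn expand_range(body: &str) -> BTreeSet<char> {
    let chars: Vec<char> = body.chars().collect();
    let mut set = BTreeSet::new();
    let mut i = 0;
    while i < chars.len() {
        if i + 2 < chars.len() && chars[i + 1] == '-' {
            let (a, b) = (chars[i], chars[i + 2]);
            // Assumption: endpoints of differing case select both cases.
            let cross_case = a.is_ascii_alphabetic()
                && b.is_ascii_alphabetic()
                && a.is_ascii_lowercase() != b.is_ascii_lowercase();
            let (a, b) = if cross_case {
                (a.to_ascii_lowercase(), b.to_ascii_lowercase())
            } else {
                (a, b)
            };
            // A reversed pair such as c-a is treated like a-c.
            let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
            for c in lo..=hi {
                set.insert(c);
                if cross_case {
                    set.insert(c.to_ascii_uppercase());
                }
            }
            i += 3;
        } else {
            // Any other character stands for itself.
            set.insert(chars[i]);
            i += 1;
        }
    }
    set
}
```

Calling expand_range("zc-aja") under these assumptions yields the five characters listed for that range, and expand_range("abm-PrZ") yields the twelve listed for the second example.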
Installation
Reference in Cargo.toml in the usual way:
shwild = { version = "~0.1" }
Components
Constants
The constant IGNORE_CASE, when passed in the flags parameter, causes matching to ignore case.
Enumerations
The shwild::Error enum is used to represent a failed parse, and is defined as:
pub enum Error {
    /// Parse error encountered.
    ParseError {
        line : usize,
        column : usize,
        message : String,
    },
}
The shwild::Result type is a specialized alias of std::result::Result for shwild, defined as:
pub type Result<T> = std_result::Result<T, shwild::Error>;
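A caller will usually want to distinguish the three outcomes: matched, not matched, and a malformed pattern. The sketch below shows one way to do that with a match expression. So that it compiles on its own, it works against a local mirror of the two definitions shown above (the mirror module and the describe function are hypothetical names, not part of the crate).

```rust
// Local mirror of the definitions shown above, so that this sketch is
// self-contained; in real code these would come from the shwild crate.
mod mirror {
    #[derive(Debug)]
    pub enum Error {
        ParseError {
            line: usize,
            column: usize,
            message: String,
        },
    }

    pub type Result<T> = std::result::Result<T, Error>;
}

/// Turns a match result into a human-readable description.
fn describe(result: &mirror::Result<bool>) -> String {
    match result {
        Ok(true) => "matched".to_string(),
        Ok(false) => "did not match".to_string(),
        Err(mirror::Error::ParseError { line, column, message }) => {
            format!("pattern error at {line}:{column}: {message}")
        }
    }
}
```

Destructuring the ParseError variant by field name keeps the handler readable and makes the compiler flag the match if further variants are ever added to the mirrored enum.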
Features
The following crate features are defined:
Name | Effect | Is "default"? | Dependent feature(s) |
---|---|---|---|
"lookup-ranges" | Causes match/non-match ranges to be implemented in terms of UnicodePointMap (from the collect-rs crate), resulting in significant performance improvements in parsing and matching | Yes | |
"test-regex" | Introduces a dependency on the regex crate to support benchmark/example program(s) | No | |
Functions
The shwild::matches() function attempts to parse a pattern according to flags and then match the string input against it.
pub mod shwild {
    pub fn matches(
        pattern : &str,
        input : &str,
        flags : i64,
    ) -> Result<bool>;
}
Macros
The shwild::shwild_matches!() macro is a shorthand for the shwild::matches() function, providing 2-parameter and 3-parameter forms. The 2-parameter form passes 0 for the flags parameter.
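A macro with a 2-parameter and a 3-parameter form can be written with macro_rules!, with the shorter arm delegating to the longer one and supplying 0 for the flags. The sketch below shows that shape only. toy_matches! and stand_in_matches are hypothetical names, the stand-in function merely compares for equality, and the flag value 0x1 is made up for the illustration; none of this is taken from the library's source.

```rust
/// Stand-in for a matching function, so that this sketch compiles on its own.
/// It only tests for equality; the (made-up) flag bit 0x1 ignores ASCII case.
fn stand_in_matches(pattern: &str, input: &str, flags: i64) -> Result<bool, String> {
    if flags & 0x1 != 0 {
        Ok(pattern.eq_ignore_ascii_case(input))
    } else {
        Ok(pattern == input)
    }
}

/// Hypothetical macro with two arities, forwarding to the function above.
macro_rules! toy_matches {
    // 2-parameter form: delegate to the 3-parameter form with flags of 0.
    ($pattern:expr, $input:expr) => {
        toy_matches!($pattern, $input, 0)
    };
    // 3-parameter form: call the function directly.
    ($pattern:expr, $input:expr, $flags:expr) => {
        stand_in_matches($pattern, $input, $flags)
    };
}
```

With this shape, toy_matches!("abc", "ABC") and toy_matches!("abc", "ABC", 0) are equivalent, and only an explicit third argument changes the behaviour.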
Structures
The shwild::CompiledMatcher structure is the data structure that is used to parse the pattern and then test the input string. There is a small, but non-zero, cost to parsing patterns (and complex patterns more so, of course), so if matching is to be repeated in a context where performance costs matter then you may prefer to create an instance of CompiledMatcher and then use it to test against, as in:
let pattern = r"Where are the* [🐼🐻]s\?";
let flags = 0;
let matcher = shwild::CompiledMatcher::from_pattern_and_flags(pattern, flags).unwrap();
assert!(!matcher.matches(""));
assert!(!matcher.matches("Where are the bears?"));
assert!( matcher.matches("Where are the 🐻s?"));
assert!( matcher.matches("Where are the 🐼s?"));
assert!( matcher.matches("Where are their 🐻s?"));
assert!( matcher.matches("Where are the big brown 🐻s?"));
assert!(!matcher.matches("Where are the teddy-🐻s?"));
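To make the parse-once, match-many idea concrete, here is a deliberately small, self-contained sketch of a matcher that stores its parsed elements and reuses them for every input. It is an illustration of the general technique, not the library's CompiledMatcher: ToyMatcher and Element are hypothetical names, ranges and flags are left out for brevity, and the recursive matching strategy is an assumption chosen for clarity rather than speed.

```rust
/// One parsed pattern element (hypothetical; ranges omitted for brevity).
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Literal(char),
    Wild1,
    WildN,
}

/// A toy matcher: the pattern is parsed once, when the value is created.
#[derive(Debug)]
struct ToyMatcher {
    elements: Vec<Element>,
}

impl ToyMatcher {
    /// Parses the pattern into elements; fails on a trailing backslash.
    fn from_pattern(pattern: &str) -> Result<Self, String> {
        let mut elements = Vec::new();
        let mut chars = pattern.chars();
        while let Some(c) = chars.next() {
            elements.push(match c {
                '?' => Element::Wild1,
                '*' => Element::WildN,
                // A backslash makes the next character a literal.
                '\\' => Element::Literal(chars.next().ok_or("pattern ends with an escape")?),
                other => Element::Literal(other),
            });
        }
        Ok(Self { elements })
    }

    /// Tests an input against the already-parsed elements.
    fn matches(&self, input: &str) -> bool {
        let input: Vec<char> = input.chars().collect();
        Self::matches_from(&self.elements, &input)
    }

    fn matches_from(elements: &[Element], input: &[char]) -> bool {
        match elements.split_first() {
            // Pattern exhausted: succeed only if the input is exhausted too.
            None => input.is_empty(),
            // Wild-N: try consuming 0, 1, 2, ... characters.
            Some((Element::WildN, rest)) => {
                (0..=input.len()).any(|skip| Self::matches_from(rest, &input[skip..]))
            }
            // Wild-1 or literal: consume exactly one character.
            Some((element, rest)) => match input.split_first() {
                Some((c, tail)) => {
                    (*element == Element::Wild1 || *element == Element::Literal(*c))
                        && Self::matches_from(rest, tail)
                }
                None => false,
            },
        }
    }
}
```

A value created once with ToyMatcher::from_pattern can then be applied to any number of inputs through matches without re-parsing the pattern, and its derived Debug output lists the parsed elements.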
If you ever need to understand the parsed state, you can use the Debug implementation of CompiledMatcher, as in:
// a pattern for rudimentary Windows path names
let pattern = r"[A-Z]\?*\?*.[ce][ox][em]";
let flags = 0;
let matcher = shwild::CompiledMatcher::from_pattern_and_flags(pattern, flags).unwrap();
eprintln!("matcher={matcher:?}");
Traits
No public traits are defined at this time.
Examples
T.B.C.
Project Information
Where to get help
Contribution guidelines
Defect reports, feature requests, and pull requests are welcome on https://github.com/synesissoftware/shwild.Rust.
Dependencies
shwild.Rust has two dependencies, both optional:
- collect-rs - required, for more efficient range matching, if feature "lookup-ranges" is specified;
- regex - required, by some benchmark/example programs only, if feature "test-regex" is specified.
Dev Dependencies
Crates upon which shwild has development dependencies:
Related projects
None at this time.
License
shwild is released under the 3-clause BSD license. See LICENSE for details.