#brackets #unicode #matching #txt #btree-map #generate #open-close

unicode-matching

Rust library crate to match Unicode open/close brackets

10 releases (4 breaking)

new 0.5.4 Mar 10, 2025
0.5.3 Mar 10, 2025
0.4.0 Mar 10, 2025
0.3.0 Mar 9, 2025
0.1.1 Feb 20, 2025

#579 in Text processing

Download history: 251/week @ 2025-02-19, 13/week @ 2025-02-26, 623/week @ 2025-03-05

887 downloads per month

MIT license

47KB
1.5K SLoC


The source is generated by a Perl script (bin/matching.pl) that downloads and parses Unicode database files.

The original idea comes from a StackOverflow thread and a comment.
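
The generation step described above can be illustrated with a small sketch. It assumes input lines in the semicolon-separated, `#`-commented layout used by Unicode Character Database files, with three fields: a code point, its paired code point, and an `o`/`c` type. Which files `bin/matching.pl` actually reads is not stated here, and `parse_pair`, `build_maps`, and `SAMPLE` are hypothetical names for illustration, not part of the crate or its script (which is written in Perl, not Rust).

```rust
use std::collections::BTreeMap;

// Illustrative sample in an assumed `code; paired code; type # name` layout.
const SAMPLE: &str = "\
# illustrative sample, not real downloaded data
0028; 0029; o # LEFT PARENTHESIS
0029; 0028; c # RIGHT PARENTHESIS
300C; 300D; o # LEFT CORNER BRACKET
300D; 300C; c # RIGHT CORNER BRACKET
";

/// Parse one line into an `(open, close)` pair of characters.
/// Returns `None` for blank lines, comment-only lines, and malformed input.
fn parse_pair(line: &str) -> Option<(char, char)> {
    // Drop the trailing comment, if any.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }
    let mut fields = data.split(';').map(str::trim);
    let first = u32::from_str_radix(fields.next()?, 16).ok()?;
    let second = u32::from_str_radix(fields.next()?, 16).ok()?;
    let kind = fields.next()?;
    let (a, b) = (char::from_u32(first)?, char::from_u32(second)?);
    match kind {
        "o" => Some((a, b)), // line describes an opening bracket
        "c" => Some((b, a)), // line describes a closing bracket
        _ => None,
    }
}

/// Build two lookup maps: opening -> closing and closing -> opening.
fn build_maps(input: &str) -> (BTreeMap<char, char>, BTreeMap<char, char>) {
    let mut open_to_close = BTreeMap::new();
    let mut close_to_open = BTreeMap::new();
    for (open, close) in input.lines().filter_map(parse_pair) {
        open_to_close.insert(open, close);
        close_to_open.insert(close, open);
    }
    (open_to_close, close_to_open)
}
```

Calling `build_maps(SAMPLE)` yields two maps with two entries each, one pair for the parentheses and one for the corner brackets. A generated source file would presumably bake such tables in at build time rather than parse text at run time, but that is a guess about the design.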

Example

// Use the `FindMatching` trait
use unicode_matching::FindMatching;

// Generate the close/open `BTreeMap`s
let close = unicode_matching::close();
let open = unicode_matching::open();

let s = "fn main() {\n    println!(\"Hello!\");\n}";
//       000000000011 11111111222222 2222333 333 33
//       012345678901 23456789012345 6789012 345 67

// Match the open/close parentheses in `main()`
assert_eq!(s.find_matching(7, &close, &open), 8);
assert_eq!(s.find_matching(8, &close, &open), 7);

// Match the open/close curly braces in `main() {...}`
assert_eq!(s.find_matching(10, &close, &open), 36);
assert_eq!(s.find_matching(36, &close, &open), 10);

// Match the open/close parentheses in `println!("...")`
assert_eq!(s.find_matching(24, &close, &open), 33);
assert_eq!(s.find_matching(33, &close, &open), 24);

let length = s.len();
let more = length + 1;

// Any other index (whether valid or invalid) returns itself
assert_eq!(s.find_matching(0, &close, &open), 0);
assert_eq!(s.find_matching(length, &close, &open), length);
assert_eq!(s.find_matching(more, &close, &open), more);

// Note that regular/straight single or double quotes do not match because they aren't valid
// matching graphemes according to Unicode
assert_eq!(s.find_matching(25, &close, &open), 25);
assert_eq!(s.find_matching(32, &close, &open), 32);

// ... but they can be added manually and then it works
use std::collections::BTreeMap;
let close = unicode_matching::close()
    .into_iter()
    .chain([("'", "'"), ("\"", "\"")])
    .collect::<BTreeMap<_, _>>();
let open = unicode_matching::open()
    .into_iter()
    .chain([("'", "'"), ("\"", "\"")])
    .collect::<BTreeMap<_, _>>();
assert_eq!(s.find_matching(25, &close, &open), 32);
assert_eq!(s.find_matching(32, &close, &open), 25);
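
For readers curious how this kind of lookup can work, here is a minimal, dependency-free sketch of bracket matching with a depth counter. It is not the crate's implementation: `find_match` and `ascii_maps` are hypothetical names, the sketch indexes a slice of `char`s rather than a `&str`, it only counts brackets of the same kind as the one at the index, and it does not handle delimiters whose open and close characters are identical (such as straight quotes).

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: opening -> closing and closing -> opening maps
/// for the three ASCII bracket pairs.
fn ascii_maps() -> (BTreeMap<char, char>, BTreeMap<char, char>) {
    let pairs = [('(', ')'), ('[', ']'), ('{', '}')];
    let open_to_close = pairs.iter().copied().collect();
    let close_to_open = pairs.iter().map(|&(o, c)| (c, o)).collect();
    (open_to_close, close_to_open)
}

/// Return the index of the bracket matching the one at `index`, or `index`
/// itself when there is no bracket there or no partner is found.
fn find_match(
    chars: &[char],
    index: usize,
    open_to_close: &BTreeMap<char, char>,
    close_to_open: &BTreeMap<char, char>,
) -> usize {
    let here = match chars.get(index) {
        Some(&c) => c,
        None => return index, // out of range: return the index unchanged
    };
    if let Some(&close) = open_to_close.get(&here) {
        // Opening bracket: scan forward, tracking nesting depth.
        let mut depth = 0usize;
        for (i, &c) in chars.iter().enumerate().skip(index) {
            if c == here {
                depth += 1;
            } else if c == close {
                depth -= 1;
                if depth == 0 {
                    return i;
                }
            }
        }
        return index; // unbalanced: no partner found
    }
    if let Some(&open) = close_to_open.get(&here) {
        // Closing bracket: scan backward, tracking nesting depth.
        let mut depth = 0usize;
        for i in (0..=index).rev() {
            let c = chars[i];
            if c == here {
                depth += 1;
            } else if c == open {
                depth -= 1;
                if depth == 0 {
                    return i;
                }
            }
        }
    }
    index
}
```

Applied to the characters of the example string above with the maps from `ascii_maps()`, this sketch returns the same indices for the parentheses and curly braces, and returns the input index for non-bracket or out-of-range positions.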

Dependencies

~0.6–1.3MB
~25K SLoC