#unicode #category #member #unicode-characters #chars #membership #query

unicode_categories_zheyuye

Query Unicode category membership for chars

1 unstable release

Uses old Rust 2015

0.1.2 Aug 14, 2023

#981 in Text processing

MIT/Apache

460KB
4.5K SLoC

unicode-categories

unicode-categories is a simple crate that adds many extensions to the char type that allow for the querying of whether or not a character is a member of a certain category of unicode characters.

use unicode_categories::UnicodeCategories;

assert!('a'.is_letter_lowercase());
assert!('A'.is_letter_uppercase());
assert!('\n'.is_other_control());

This crate can be obtained from crates.io and used as a dependency like any other crate.

Regenerating the tables

The script in scripts/unicode.py reads a file named UnicodeData.txt, which is assumed to be in the current working directory. This file can be obtained from ftp://ftp.unicode.org . The script outputs a rust file to standard out, which can be piped to the desired location.

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

No runtime deps