5 releases (breaking)

| Version | Date |
|---|---|
| 0.9.0 | Mar 3, 2019 |
| 0.8.0 | Jan 2, 2019 |
| 0.7.0 | Feb 7, 2018 |
| 0.6.0 | Sep 22, 2017 |
| 0.5.0 | Aug 5, 2017 |
UNIC — UCD — Category
A component of unic: Unicode and Internationalization Crates for Rust.

Unicode General_Category.
The General_Category property of a code point provides for the most general classification of that code point. It is usually determined based on the primary characteristic of the assigned character for that code point. For example, is the character a letter, a mark, a number, punctuation, or a symbol, and if so, of what type? Other General_Category values define the classification of code points which are not assigned to regular graphic characters, including such statuses as private-use, control, surrogate code point, and reserved unassigned.

Many characters have multiple uses, and not all such cases can be captured entirely by the General_Category value. For example, the General_Category value of Latin, Greek, or Hebrew letters does not attempt to cover (or preclude) the numerical use of such letters as Roman numerals or in other numerary systems. Conversely, the General_Category of ASCII digits 0..9 as Nd (decimal digit) does not attempt to cover (or preclude) the occasional use of these digits as letters in various orthographies. The General_Category is simply the first-order, most usual categorization of a character.

For more information about the General_Category property, see Chapter 4, Character Properties in the Unicode Standard.
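
To make the idea of a first-order classification concrete, here is a minimal sketch that uses only Rust standard library `char` predicates, not this crate's API. The `MajorClass` enum and the `major_class` function are hypothetical names introduced for illustration, and the standard library predicates only approximate the real General_Category groups (for example, `is_alphabetic` follows the Alphabetic property, and ASCII punctuation and symbols are lumped together here).

```rust
/// Coarse character classes, loosely modelled on the major groups of
/// General_Category. Hypothetical type for illustration only; it is not
/// part of this crate and does not reproduce the real property values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MajorClass {
    Letter,
    Number,
    Separator,
    PunctuationOrSymbol,
    Control,
    Other,
}

/// Assigns exactly one class to a character, based only on the character
/// itself and never on how it is used in surrounding text.
/// Assumption: std `char` predicates are an acceptable approximation.
pub fn major_class(c: char) -> MajorClass {
    if c.is_control() {
        MajorClass::Control
    } else if c.is_alphabetic() {
        MajorClass::Letter
    } else if c.is_numeric() {
        MajorClass::Number
    } else if c.is_whitespace() {
        MajorClass::Separator
    } else if c.is_ascii_punctuation() {
        // ASCII-only check; covers both punctuation and symbols such as '+'.
        MajorClass::PunctuationOrSymbol
    } else {
        MajorClass::Other
    }
}
```

With this sketch, `major_class('V')` yields `MajorClass::Letter` even when the letter appears in the Roman numeral "XV", and `major_class('7')` yields `MajorClass::Number` even in an orthography that uses the digit as a letter. This mirrors how General_Category records only the most usual categorization of a character.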