5 releases (breaking)
|0.9.0||Mar 3, 2019|
|0.8.0||Jan 2, 2019|
|0.7.0||Feb 7, 2018|
|0.6.0||Sep 22, 2017|
|0.5.0||Aug 5, 2017|
UNIC — UCD — Category
A component of unic: Unicode and Internationalization Crates for Rust.
The General_Category property of a code point provides for the most general classification of that code point. It is usually determined based on the primary characteristic of the assigned character for that code point. For example, is the character a letter, a mark, a number, punctuation, or a symbol, and if so, of what type? Other General_Category values define the classification of code points which are not assigned to regular graphic characters, including such statuses as private-use, control, surrogate code point, and reserved unassigned.
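To make the idea of a first-order classification concrete, here is a minimal sketch that maps ASCII characters to their major General_Category class (the first letter of the two-letter abbreviations such as Lu, Nd, or Po). It uses only the Rust standard library and is not this crate's API: the names `MajorClass` and `ascii_major_class` are hypothetical, and the hand-written table covers ASCII only, whereas a real implementation is driven by Unicode Character Database data.

```rust
/// Major General_Category classes. Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MajorClass {
    Letter,      // L* (in ASCII: Lu, Ll)
    Number,      // N* (in ASCII: Nd)
    Punctuation, // P* (in ASCII: Pc, Pd, Ps, Pe, Po)
    Symbol,      // S* (in ASCII: Sc, Sm, Sk)
    Separator,   // Z* (in ASCII: Zs, the space character)
    Other,       // C* (in ASCII: Cc, the control characters)
}

/// Classify an ASCII character by its major class.
/// Returns `None` for non-ASCII input, which this toy table does not cover.
pub fn ascii_major_class(c: char) -> Option<MajorClass> {
    if !c.is_ascii() {
        return None;
    }
    Some(match c {
        'A'..='Z' | 'a'..='z' => MajorClass::Letter,
        '0'..='9' => MajorClass::Number,
        ' ' => MajorClass::Separator,
        // Currency, math, and modifier symbols.
        '$' | '+' | '<' | '=' | '>' | '^' | '`' | '|' | '~' => MajorClass::Symbol,
        // U+0000..U+001F and U+007F.
        c if c.is_ascii_control() => MajorClass::Other,
        // Every remaining ASCII character is some kind of punctuation.
        _ => MajorClass::Punctuation,
    })
}
```

With this sketch, `ascii_major_class('A')` yields `Some(MajorClass::Letter)`, `ascii_major_class('7')` yields `Some(MajorClass::Number)`, and a line feed falls under `MajorClass::Other` because it is a control character.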
Many characters have multiple uses, and not all such cases can be captured entirely by the General_Category value. For example, the General_Category value of Latin, Greek, or Hebrew letters does not attempt to cover (or preclude) the numerical use of such letters as Roman numerals or in other numerary systems. Conversely, the General_Category of ASCII digits 0..9 as Nd (decimal digit) does not attempt to cover (or preclude) the occasional use of these digits as letters in various orthographies. The General_Category is simply the first-order, most usual categorization of a character.
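The Roman numeral case can be illustrated with a small standard-library sketch. The string "XIV" consists entirely of ASCII uppercase letters (General_Category Lu), yet an application can still read it as the number 14. The function names `is_all_ascii_uppercase` and `roman_value` are hypothetical, and the parser is deliberately naive: it applies the subtractive rule without checking that the numeral is well formed.

```rust
/// True if every character is an ASCII uppercase letter
/// (for ASCII, exactly the characters whose General_Category is Lu).
pub fn is_all_ascii_uppercase(s: &str) -> bool {
    s.chars().all(|c| c.is_ascii_uppercase())
}

/// Numeric value of a single Roman numeral letter, if it has one.
fn roman_digit(c: char) -> Option<i32> {
    match c {
        'I' => Some(1),
        'V' => Some(5),
        'X' => Some(10),
        'L' => Some(50),
        'C' => Some(100),
        'D' => Some(500),
        'M' => Some(1000),
        _ => None,
    }
}

/// Naive Roman numeral reader: returns `None` for empty input or if any
/// character is not a Roman numeral letter. Well-formedness is not checked.
pub fn roman_value(s: &str) -> Option<i32> {
    let digits: Vec<i32> = s.chars().map(roman_digit).collect::<Option<Vec<i32>>>()?;
    if digits.is_empty() {
        return None;
    }
    let mut total = 0;
    for (i, &d) in digits.iter().enumerate() {
        // A smaller value directly before a larger one is subtracted (IV = 4).
        let next_is_larger = digits.get(i + 1).map_or(false, |&next| next > d);
        if next_is_larger {
            total -= d;
        } else {
            total += d;
        }
    }
    Some(total)
}
```

Here `is_all_ascii_uppercase("XIV")` is true while `roman_value("XIV")` is `Some(14)`: the category describes what the characters usually are, and the numeric reading is a separate, application-level interpretation.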
For more information about the General_Category property, see Chapter 4, Character Properties in the Unicode Standard.