13 releases (6 stable)

Version | Released |
---|---|
1.3.0 | Sep 12, 2024 |
1.2.2 | Mar 10, 2024 |
1.2.1 | Dec 14, 2023 |
1.2.0 | Oct 14, 2023 |
0.2.2 | Jun 17, 2018 |
unicode_names2
Time- and memory-efficient mapping of characters to and from their Unicode 16.0 names, at runtime and compile-time.
fn main() {
    // Both functions return an Option, so the values are printed with {:?}.
    println!("☃ is called {:?}", unicode_names2::name('☃')); // SNOWMAN
    println!("{:?} is happy", unicode_names2::character("white smiling face")); // ☺
    // (NB. case insensitivity)
}
The maps are compressed using similar tricks to Python's unicodedata
module, although those here are about 70KB (12%) smaller.
lib.rs:
Convert between characters and their standard names.
This crate provides two functions for mapping between a char and the name given by the Unicode standard (16.0): name, from a character to its name, and character, from a name to its character. There are no runtime requirements, so it is usable with only core (this requires specifying the no_std cargo feature). The tables are heavily compressed but still large (500KB), and look-ups in both directions remain efficient: effectively O(1), or more precisely O(length of name).
println!("☃ is called {:?}", unicode_names2::name('☃')); // SNOWMAN
println!("{:?} is happy", unicode_names2::character("white smiling face")); // ☺
// (NB. case insensitivity)
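To illustrate why a name look-up can cost time proportional to the length of the name rather than the number of names, here is a minimal sketch using a byte-wise trie. This is not the crate's actual table layout (which is compressed and not described here); NameTrie, Node and the three sample entries are hypothetical and exist only for this example.

```rust
use std::collections::HashMap;

/// One node of a toy name trie. Illustration only: this is NOT how the
/// crate stores its compressed tables.
#[derive(Default)]
struct Node {
    children: HashMap<u8, Node>,
    value: Option<char>,
}

/// Hypothetical name-to-character map with case-insensitive look-up.
#[derive(Default)]
struct NameTrie {
    root: Node,
}

impl NameTrie {
    /// Store `c` under `name`, normalising ASCII letters to upper case.
    fn insert(&mut self, name: &str, c: char) {
        let mut node = &mut self.root;
        for b in name.bytes() {
            node = node.children.entry(b.to_ascii_uppercase()).or_default();
        }
        node.value = Some(c);
    }

    /// Walk one node per byte of `name`, so the work done depends on the
    /// length of the name, not on how many names are stored.
    fn character(&self, name: &str) -> Option<char> {
        let mut node = &self.root;
        for b in name.bytes() {
            node = node.children.get(&b.to_ascii_uppercase())?;
        }
        node.value
    }
}

fn main() {
    let mut trie = NameTrie::default();
    trie.insert("SNOWMAN", '☃');
    trie.insert("WHITE SMILING FACE", '☺');
    trie.insert("BLACK STAR", '★');

    assert_eq!(trie.character("white smiling face"), Some('☺'));
    assert_eq!(trie.character("Snowman"), Some('☃'));
    // A prefix of a stored name is not itself a name.
    assert_eq!(trie.character("snow"), None);
    assert_eq!(trie.character("no such name"), None);
}
```

A trie trades memory for simplicity, which is the opposite of the trade-off this crate makes; the sketch is only meant to show the shape of the complexity claim.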
Macros
The associated unicode_names2_macros crate provides two macros for converting at compile-time, giving named literals similar to Python's "\N{...}".
- named_char!(name) takes a single string name and creates a char literal.
- named!(string) takes a string and replaces any \\N{name} sequences with the character with that name. NB. String escape sequences cannot be customised, so the extra backslash is required unless you use a raw string.
#![feature(proc_macro_hygiene)]

#[macro_use]
extern crate unicode_names2_macros;

fn main() {
    let x: char = named_char!("snowman");
    assert_eq!(x, '☃');

    let y: &str = named!("foo bar \\N{BLACK STAR} baz qux");
    assert_eq!(y, "foo bar ★ baz qux");

    let y: &str = named!(r"foo bar \N{BLACK STAR} baz qux");
    assert_eq!(y, "foo bar ★ baz qux");
}
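To make the \N{name} substitution concrete, the following runtime sketch performs the same kind of replacement on an ordinary string. It is not the macro's implementation: expand_named and lookup are hypothetical helpers, and lookup uses a tiny hard-coded table as a stand-in where a real program could call unicode_names2::character instead.

```rust
/// Hypothetical stand-in for a name look-up, backed by a three-entry table.
/// A real program could call `unicode_names2::character` here instead.
fn lookup(name: &str) -> Option<char> {
    const TABLE: &[(&str, char)] = &[
        ("SNOWMAN", '☃'),
        ("BLACK STAR", '★'),
        ("WHITE SMILING FACE", '☺'),
    ];
    TABLE
        .iter()
        .find(|(n, _)| n.eq_ignore_ascii_case(name))
        .map(|&(_, c)| c)
}

/// Replace every `\N{name}` sequence in `input` with the named character.
/// In this sketch, an unknown name or an unterminated sequence yields `None`.
fn expand_named(input: &str) -> Option<String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("\\N{") {
        // Copy the text before the sequence unchanged.
        out.push_str(&rest[..start]);
        // Skip the three bytes of `\N{` and locate the closing brace.
        let after = &rest[start + 3..];
        let end = after.find('}')?;
        out.push(lookup(&after[..end])?);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Some(out)
}

fn main() {
    // An escaped backslash and a raw string spell exactly the same input.
    assert_eq!("\\N{BLACK STAR}", r"\N{BLACK STAR}");

    assert_eq!(
        expand_named(r"foo bar \N{BLACK STAR} baz qux").as_deref(),
        Some("foo bar ★ baz qux")
    );
    assert_eq!(expand_named(r"\N{no such name}"), None);
}
```

The first assertion shows why either spelling works with named!: in a non-raw Rust string literal, a bare \N is not a valid escape, so the backslash has to be doubled or the literal made raw.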
Cargo-enabled
This package is on crates.io, so add either (or both!) of the following to your Cargo.toml.
[dependencies]
unicode_names2 = "0.2.1"
unicode_names2_macros = "0.2"