#text #unicode #normalization #decomposition #recomposition

unicode-normalization

This crate provides functions for normalization of Unicode strings, including Canonical and Compatibility Decomposition and Recomposition, as described in Unicode Standard Annex #15.

17 releases

0.1.13 Jun 16, 2020
0.1.12 Jan 21, 2020
0.1.11 Nov 22, 2019
0.1.8 Jan 21, 2019
0.1.1 Jul 9, 2015

#14 in Text processing

879,979 downloads per month
Used in 6,811 crates (77 directly)

MIT/Apache

495KB
24K SLoC

Unicode character composition and decomposition utilities as described in Unicode Standard Annex #15.

This crate requires Rust 1.36+.

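extern crate unicode_normalization;

use unicode_normalization::char::compose;
use unicode_normalization::UnicodeNormalization;

fn main() {
    // Compose a base character and a combining mark into one precomposed character.
    assert_eq!(compose('A', '\u{30a}'), Some('Å'));

    // Normalize a string to NFC (canonical decomposition, then canonical composition).
    // The input and the expected result can look identical yet differ in code points.
    let s = "ÅΩ";
    let c = s.nfc().collect::<String>();
    assert_eq!(c, "ÅΩ");
}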

crates.io

You can use this package in your project by adding the following to your Cargo.toml:

[dependencies]
unicode-normalization = "0.1.13"

no_std + alloc support

This crate is fully compatible with no_std + alloc. To use it that way, disable the std feature by specifying default-features = false for this crate in your Cargo.toml.
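
For example, a dependency entry along these lines should disable the default std feature (a sketch that reuses the version number shown above):

```toml
[dependencies]
unicode-normalization = { version = "0.1.13", default-features = false }
```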
