#text #unicode #normalization #decomposition #recomposition

no-std unicode-normalization

This crate provides functions for normalization of Unicode strings, including Canonical and Compatibility Decomposition and Recomposition, as described in Unicode Standard Annex #15.

27 releases

new 0.1.23 Feb 20, 2024
0.1.22 Sep 16, 2022
0.1.21 Jul 1, 2022
0.1.19 Jun 2, 2021
0.1.1 Jul 9, 2015

#92 in Text processing


6,266,114 downloads per month
Used in 25,263 crates (178 directly)

MIT/Apache

700KB
39K SLoC

unicode-normalization


Unicode character composition and decomposition utilities as described in Unicode Standard Annex #15.

This crate requires Rust 1.36+.

extern crate unicode_normalization;

use unicode_normalization::char::compose;
use unicode_normalization::UnicodeNormalization;

fn main() {
    assert_eq!(compose('A','\u{30a}'), Some('Å'));

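    // U+212B ANGSTROM SIGN and U+2126 OHM SIGN look like Å and Ω, but NFC
    // maps them to U+00C5 (Å) and U+03A9 (Ω) respectively.
    let s = "\u{212b}\u{2126}";
    let c = s.nfc().collect::<String>();
    assert_eq!(c, "\u{c5}\u{3a9}");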
}

crates.io

You can use this package in your project by adding the following to your Cargo.toml:

[dependencies]
unicode-normalization = "0.1.23"

no_std + alloc support

This crate is fully compatible with no_std + alloc. To enable this, disable the std feature by specifying default-features = false for this crate in your Cargo.toml.
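
As a minimal sketch, a dependency entry with the default features turned off might look like the following. The version shown is the one used earlier in this README; adjust it to whichever release you depend on.

```toml
[dependencies]
# Turning off default features drops the `std` feature, leaving no_std + alloc.
unicode-normalization = { version = "0.1.23", default-features = false }
```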

Dependencies