5 releases (breaking)

0.6.0 May 17, 2022
0.5.0 Jan 31, 2022
0.4.0 Nov 2, 2021
0.3.0 Aug 2, 2021
0.2.0 Apr 29, 2021

Used in icu_provider_cldr

Custom license

icu_locale_canonicalizer

icu_locale_canonicalizer is one of the ICU4X components.

This API provides functionality to canonicalize locale identifiers based upon CLDR data.

It currently supports locale canonicalization based on the algorithm from UTS #35: Unicode LDML 3. LocaleId Canonicalization, as well as the minimize and maximize likely subtags algorithms described in UTS #35: Unicode LDML 3. Likely Subtags.

The maximize method potentially updates a passed-in locale in place, depending on the result of running the 'Add Likely Subtags' algorithm from UTS #35: Unicode LDML 3. Likely Subtags. It returns a CanonicalizationResult indicating whether the locale was modified.

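The minimize method potentially updates a passed-in locale in place, depending on the result of running the 'Remove Likely Subtags' algorithm from UTS #35: Unicode LDML 3. Likely Subtags. It returns a CanonicalizationResult indicating whether the locale was modified.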
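To make the two likely-subtags algorithms concrete, here is a minimal, standard-library-only sketch of their core steps. It is not the crate's implementation: the `Loc` type, the `toy_table` function, and the three-entry table are hypothetical stand-ins for `Locale` and the CLDR likely-subtags data, and variants, extensions, and other edge cases handled by UTS #35 are left out.

```rust
use std::collections::HashMap;

/// Simplified locale (hypothetical): an empty string stands for an absent subtag.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Loc {
    lang: String,
    script: String,
    region: String,
}

impl Loc {
    fn new(lang: &str, script: &str, region: &str) -> Self {
        Loc { lang: lang.into(), script: script.into(), region: region.into() }
    }

    /// Render as a tag such as "zh-Hans-CN"; an absent language prints as "und".
    fn tag(&self) -> String {
        let lang = if self.lang.is_empty() { "und" } else { self.lang.as_str() };
        let parts = [lang, self.script.as_str(), self.region.as_str()];
        let present: Vec<&str> = parts.iter().copied().filter(|s| !s.is_empty()).collect();
        present.join("-")
    }
}

/// Toy three-entry stand-in for the CLDR likely-subtags data.
fn toy_table() -> HashMap<String, Loc> {
    let mut t = HashMap::new();
    t.insert("zh".to_string(), Loc::new("zh", "Hans", "CN"));
    t.insert("zh-TW".to_string(), Loc::new("zh", "Hant", "TW"));
    t.insert("zh-Hant".to_string(), Loc::new("zh", "Hant", "TW"));
    t
}

/// Sketch of "Add Likely Subtags": fill absent subtags from the first
/// matching table entry. Returns true if the locale changed.
fn maximize(table: &HashMap<String, Loc>, loc: &mut Loc) -> bool {
    // Lookup order: language_script_region, language_region,
    // language_script, language, und_script.
    let keys = [
        loc.tag(),
        Loc::new(&loc.lang, "", &loc.region).tag(),
        Loc::new(&loc.lang, &loc.script, "").tag(),
        Loc::new(&loc.lang, "", "").tag(),
        Loc::new("", &loc.script, "").tag(),
    ];
    let hit = match keys.iter().find_map(|k| table.get(k)) {
        Some(hit) => hit,
        None => return false,
    };
    let before = loc.clone();
    // Subtags already present are kept; only absent ones are filled in.
    if loc.lang.is_empty() { loc.lang = hit.lang.clone(); }
    if loc.script.is_empty() { loc.script = hit.script.clone(); }
    if loc.region.is_empty() { loc.region = hit.region.clone(); }
    *loc != before
}

/// Sketch of "Remove Likely Subtags": pick the shortest form that
/// maximizes back to the same result. Returns true if the locale changed.
fn minimize(table: &HashMap<String, Loc>, loc: &mut Loc) -> bool {
    let mut max = loc.clone();
    maximize(table, &mut max);
    // Trial order: language, language_region, language_script.
    let trials = [
        Loc::new(&max.lang, "", ""),
        Loc::new(&max.lang, "", &max.region),
        Loc::new(&max.lang, &max.script, ""),
    ];
    for trial in trials.iter() {
        let mut expanded = trial.clone();
        maximize(table, &mut expanded);
        if expanded == max {
            let changed = *loc != *trial;
            *loc = trial.clone();
            return changed;
        }
    }
    // No shorter form round-trips, so fall back to the maximized locale.
    let changed = *loc != max;
    *loc = max;
    changed
}
```

With this toy table, calling `maximize` on `Loc::new("zh", "", "CN")` fills in the script to give zh-Hans-CN, and calling `minimize` on `Loc::new("zh", "Hans", "CN")` reduces it to zh. The boolean return value plays the role that `CanonicalizationResult::Modified` and `CanonicalizationResult::Unmodified` play in the crate.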

Examples

// Canonicalize: rewrite deprecated or aliased subtags in place.
use icu_locale_canonicalizer::{CanonicalizationResult, LocaleCanonicalizer};
use icu_locid::Locale;

// Load the test data and build a canonicalizer from it.
let provider = icu_testdata::get_provider();
let lc = LocaleCanonicalizer::new(&provider)
    .expect("create failed");

let mut locale: Locale = "ja-Latn-fonipa-hepburn-heploc".parse()
    .expect("parse failed");
assert_eq!(lc.canonicalize(&mut locale), CanonicalizationResult::Modified);
assert_eq!(locale.to_string(), "ja-Latn-alalc97-fonipa");

// Maximize: add likely subtags in place.
use icu_locale_canonicalizer::{CanonicalizationResult, LocaleCanonicalizer};
use icu_locid::Locale;

let provider = icu_testdata::get_provider();
let lc = LocaleCanonicalizer::new(&provider)
    .expect("create failed");

// The script subtag is missing, so it is filled in.
let mut locale: Locale = "zh-CN".parse()
    .expect("parse failed");
assert_eq!(lc.maximize(&mut locale), CanonicalizationResult::Modified);
assert_eq!(locale.to_string(), "zh-Hans-CN");

// Already fully specified, so the locale is left unchanged.
let mut locale: Locale = "zh-Hant-TW".parse()
    .expect("parse failed");
assert_eq!(lc.maximize(&mut locale), CanonicalizationResult::Unmodified);
assert_eq!(locale.to_string(), "zh-Hant-TW");

// Minimize: remove likely subtags in place.
use icu_locale_canonicalizer::{CanonicalizationResult, LocaleCanonicalizer};
use icu_locid::Locale;

let provider = icu_testdata::get_provider();
let lc = LocaleCanonicalizer::new(&provider)
    .expect("create failed");

// Script and region are the likely values for "zh", so both are removed.
let mut locale: Locale = "zh-Hans-CN".parse()
    .expect("parse failed");
assert_eq!(lc.minimize(&mut locale), CanonicalizationResult::Modified);
assert_eq!(locale.to_string(), "zh");

// Already minimal, so the locale is left unchanged.
let mut locale: Locale = "zh".parse()
    .expect("parse failed");
assert_eq!(lc.minimize(&mut locale), CanonicalizationResult::Unmodified);
assert_eq!(locale.to_string(), "zh");

More Information

For more information on development, authorship, contributing, etc., please visit the ICU4X home page.
