2 releases

0.1.1 Aug 7, 2021
0.1.0 Aug 7, 2021

#1041 in Text processing

MIT/Apache

16KB
261 lines

rune-based PanCJKV IVD Collection support

The PanCJKV IVD Collection is an unregistered IVD (Ideographic Variation Database) collection that uses Unicode variation selectors to distinguish CJK ideograph glyphs on a per-region basis.
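
To make the role of variation selectors concrete, the sketch below uses only the standard library to recognize code points in the Variation Selectors Supplement block (U+E0100 to U+E01EF, named VS17 to VS256 by Unicode), which is the block ideographic variation sequences draw from. The function names `ivs_number` and `variation_sequences` are hypothetical helpers for illustration and are not part of this crate.

```rust
/// Returns the variation selector number (17..=256) if `c` lies in the
/// Variation Selectors Supplement block (U+E0100..=U+E01EF).
/// Hypothetical helper, not part of runestr-pancjkv.
fn ivs_number(c: char) -> Option<u32> {
    let cp = c as u32;
    if (0xE0100..=0xE01EF).contains(&cp) {
        // U+E0100 is VS17, so offset the distance from the block start by 17.
        Some(cp - 0xE0100 + 17)
    } else {
        None
    }
}

/// Collects every (base character, selector number) pair in `s`, where the
/// base is any non-selector character immediately followed by a selector.
/// Hypothetical helper, not part of runestr-pancjkv.
fn variation_sequences(s: &str) -> Vec<(char, u32)> {
    let mut out = Vec::new();
    let mut prev: Option<char> = None;
    for c in s.chars() {
        if let (Some(base), Some(n)) = (prev, ivs_number(c)) {
            // Skip the degenerate case of a selector following a selector.
            if ivs_number(base).is_none() {
                out.push((base, n));
            }
        }
        prev = Some(c);
    }
    out
}
```

Under this numbering, the selectors that appear in the example further down, U+E01EE and U+E01EF, are VS255 and VS256 respectively.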

This crate adds PanCJKV IVD Collection support to rune-based iterators by allowing unannotated CJK ideograph abstract characters to be explicitly transformed into their annotated form.
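
The annotation idea can be sketched on plain `char` sequences without the crate. This is a simplified illustration, not the crate's implementation: it assumes that only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) counts as an ideograph, that "already annotated" means the next code point is in U+E0100 to U+E01EF, and it borrows the selector U+E01EF from the example in this README. The names `annotate`, `is_basic_cjk_ideograph`, and `is_supplementary_selector` are hypothetical.

```rust
/// Selector used for illustration; taken from the README example (region XK).
const EXAMPLE_SELECTOR: char = '\u{E01EF}';

/// Assumption: only the basic CJK Unified Ideographs block is considered.
fn is_basic_cjk_ideograph(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// True for code points in the Variation Selectors Supplement block.
fn is_supplementary_selector(c: char) -> bool {
    ('\u{E0100}'..='\u{E01EF}').contains(&c)
}

/// Inserts `selector` after every ideograph that is not already followed
/// by a variation selector; all other characters pass through unchanged.
fn annotate(input: &str, selector: char) -> String {
    let mut out = String::with_capacity(input.len() + selector.len_utf8());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        out.push(c);
        let already_annotated = chars
            .peek()
            .map_or(false, |&next| is_supplementary_selector(next));
        if is_basic_cjk_ideograph(c) && !already_annotated {
            out.push(selector);
        }
    }
    out
}
```

Calling `annotate` with `EXAMPLE_SELECTOR` on the input string from the example below inserts U+E01EF directly after U+6211, before its combining mark, and leaves the already annotated U+4EEC untouched. The crate itself works on runes rather than individual `char` values, which is why its rune count stays the same after annotation.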

Example

use runestr::{rune, RuneString};
use runestr_pancjkv::{PanCJKVAnnotate, PanCJKVRegion};

fn main() {
    let test = RuneString::from_str_lossy("\u{6211}\u{030C}\u{4EEC}\u{E01EE}\u{0301}");
    assert_eq!(2, test.runes().count());
    let result = test
        .runes()
        .annotate_with_pan_cjkv_region(PanCJKVRegion::XK) // annotate with a pseudo-region called KangXi
        .collect::<RuneString>();
    assert_eq!(
        &result.chars().collect::<Vec<_>>()[..],
        &[
            '\u{6211}',
            '\u{E01EF}', // this variation selector is inserted
            '\u{030C}',
            '\u{4EEC}',
            '\u{E01EE}', // this is untouched
            '\u{0301}'
        ]
    );
    assert_eq!(2, result.runes().count()); // rune count does not change
}

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Dependencies

~1.5MB
~52K SLoC