16 releases

| Version | Released |
|---|---|
| 0.1.0 (new) | Mar 29, 2023 |
| 0.0.15 | Oct 2, 2021 |
| 0.0.14 | Aug 24, 2021 |
| 0.0.13 | Jun 19, 2021 |
| 0.0.10 | Jul 27, 2020 |
Seshat 𓋇𓏏𓁐
A Unicode Library for Rust.
Demo
Introduction
Seshat (pronounced Sehs-hat) is a Unicode library written in Rust. It provides much of the Unicode character data and many of the standard algorithms. The goal of this project is to provide an ICU-like library in Rust.
Version
Seshat follows the latest version of Unicode. It currently uses Unicode 15.0.0.
Usage
[dependencies]
seshat-unicode = "0.1.0"
use seshat::unicode::Ucd;

fn main() {
    println!("🦀 is {}!", '🦀'.na());
}
Check the Unicode Version
use seshat::unicode::UNICODE_VERSION;

fn main() {
    println!("{}", UNICODE_VERSION.to_string());
}
Features
Grapheme cluster break
use seshat::unicode::Segmentation;

fn main() {
    let s = "Hi, 👨🏾‍🤝‍👨🏿";
    for seg in s.break_graphemes() {
        println!("{}", seg);
    }
}
This prints:
$ cargo run
H
i
,
👨🏾‍🤝‍👨🏿
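To illustrate why grapheme segmentation matters, here is a minimal sketch that uses only the Rust standard library, not Seshat. The helper `byte_and_scalar_counts` is a hypothetical name introduced for this example. It counts a string of the same shape as the one above (written with escapes) by bytes and by Unicode scalar values, and neither count matches what a reader perceives as individual characters.

```rust
// Standard library only; Seshat is not used in this sketch.

/// Hypothetical helper: returns (UTF-8 byte count, Unicode scalar value count).
fn byte_and_scalar_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // "Hi, " followed by a single emoji ZWJ sequence:
    // man + skin tone, ZWJ, handshake, ZWJ, man + skin tone.
    let s = "Hi, \u{1F468}\u{1F3FE}\u{200D}\u{1F91D}\u{200D}\u{1F468}\u{1F3FF}";

    let (bytes, scalars) = byte_and_scalar_counts(s);
    println!("bytes: {}", bytes);           // bytes: 30
    println!("scalar values: {}", scalars); // scalar values: 11
}
```

The emoji sequence alone accounts for 7 scalar values and 26 bytes, yet it is displayed as one user-perceived character, which is the unit that grapheme cluster breaking works with.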
Normalization
use seshat::unicode::Normalization;

fn main() {
    let s1 = "Å";
    println!("{:?}", s1.to_nfd()); // Prints "A\u{30a}"
    let s2 = "㌀";
    println!("{}", s2.to_nfkd()); // Prints アパート (as ア, ハ, U+309A, ー, ト)
    let s3 = "e\u{0301}";
    println!("{}", s3.to_nfc()); // Prints é
    let s4 = "ｱｲｳｴｵ"; // Halfwidth katakana
    assert_eq!("アイウエオ", s4.to_nfkc());
}
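As background for the example above, the following sketch uses only the Rust standard library, not Seshat, to show the problem normalization addresses. The helper `code_points` is a hypothetical name introduced here. Two strings that render identically can consist of different scalar values, so they compare as unequal until both are brought to the same normalization form.

```rust
// Standard library only; Seshat is not used in this sketch.

/// Hypothetical helper: lists the scalar values of a string as "U+XXXX" labels.
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    let composed = "\u{00E9}";    // é as a single scalar value
    let decomposed = "e\u{0301}"; // e followed by COMBINING ACUTE ACCENT

    // Both render as é, but a plain comparison treats them as different.
    assert_ne!(composed, decomposed);

    println!("{:?}", code_points(composed));   // ["U+00E9"]
    println!("{:?}", code_points(decomposed)); // ["U+0065", "U+0301"]
}
```

Converting both strings to NFC (or both to NFD) before comparing removes this difference.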
Contribute
To be added later.
License
All logo images are copyright Frybits Inc. and should not be used outside this project without permission.
Seshat is licensed under the MIT License. For details, see the LICENSE file.