2 releases
Uses old Rust 2015
| Version | Released |
|---|---|
| 0.1.1 | Oct 8, 2017 |
| 0.1.0 | Oct 8, 2017 |
#1843 in Text processing
Used in 2 crates
160KB
105 lines
Unicode character "confusable" detection and "skeleton" computation, as specified by Unicode Technical Standard #39 (Unicode Security Mechanisms). These functions are for working with strings that look nearly identical when rendered but do not compare as equal.
    extern crate unicode_skeleton;

    use unicode_skeleton::{UnicodeSkeleton, confusable};

    fn main() {
        assert_eq!("𝔭𝒶ỿ𝕡𝕒ℓ".skeleton_chars().collect::<String>(), "paypal");
        assert!(confusable("ℝ𝓊𝓈𝓉", "Rust"));
    }
crates.io
Add the following to your Cargo.toml to use the crate:
[dependencies]
unicode_skeleton = "0.1.0"
lib.rs:
Transforms a Unicode string by replacing unusual characters with similar-looking common characters, as specified by Unicode Technical Standard #39. For example, "ℝ𝓊𝓈𝓉" will be transformed to "Rust". This simplified string is called the "skeleton".
    use unicode_skeleton::UnicodeSkeleton;
    "ℝ𝓊𝓈𝓉".skeleton_chars().collect::<String>() // "Rust"
Strings are considered "confusable" if they have the same skeleton. For example, "ℝ𝓊𝓈𝓉" and "Rust" are confusable.
    use unicode_skeleton::confusable;
    confusable("ℝ𝓊𝓈𝓉", "Rust") // true
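The "same skeleton" rule can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `confusable_by` and `toy_skeleton` are hypothetical names, and the three-entry Cyrillic mapping is a hand-picked stand-in for the full Unicode confusables data.

```rust
/// Two strings count as confusable when their skeletons are equal.
/// `skeleton` is a caller-supplied function (a hypothetical stand-in
/// for a real skeleton implementation).
fn confusable_by<F>(a: &str, b: &str, skeleton: F) -> bool
where
    F: Fn(&str) -> String,
{
    skeleton(a) == skeleton(b)
}

/// Toy skeleton (assumption, for illustration only): maps three
/// Cyrillic look-alikes to Latin letters and leaves everything else alone.
fn toy_skeleton(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            '\u{0430}' => 'a', // CYRILLIC SMALL LETTER A
            '\u{0435}' => 'e', // CYRILLIC SMALL LETTER IE
            '\u{043E}' => 'o', // CYRILLIC SMALL LETTER O
            other => other,
        })
        .collect()
}

fn main() {
    // "paypal" with both Latin a's swapped for Cyrillic U+0430.
    let spoof = "p\u{0430}yp\u{0430}l";

    // The raw strings differ byte for byte...
    assert_ne!(spoof, "paypal");
    // ...but their toy skeletons match, so they are reported as confusable.
    assert!(confusable_by(spoof, "paypal", toy_skeleton));
    // Unrelated strings keep distinct skeletons.
    assert!(!confusable_by("paypal", "stripe", toy_skeleton));
}
```

Comparing skeletons rather than raw strings is what lets a check like this catch look-alike names that a plain `==` would miss.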
The translation to skeletons is based on the Unicode Security Mechanisms data for UTS #39, version 10.0.0.
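Conceptually, the translation is a per-character table lookup. The sketch below assumes a tiny hand-written table (`toy_prototypes`, a hypothetical name) in place of the real data, which has thousands of entries, and it omits the NFD normalization steps that UTS #39 applies before and after the lookup. It is not how the crate itself is implemented.

```rust
use std::collections::HashMap;

/// Builds a tiny, hand-picked prototype table (illustrative only).
fn toy_prototypes() -> HashMap<char, &'static str> {
    let mut table = HashMap::new();
    table.insert('\u{211D}', "R");  // DOUBLE-STRUCK CAPITAL R
    table.insert('\u{1D4CA}', "u"); // MATHEMATICAL SCRIPT SMALL U
    table.insert('\u{1D4C8}', "s"); // MATHEMATICAL SCRIPT SMALL S
    table.insert('\u{1D4C9}', "t"); // MATHEMATICAL SCRIPT SMALL T
    table
}

/// Replaces every character that has a table entry with its prototype
/// and copies all other characters through unchanged.
/// Assumption: normalization (NFD) is skipped in this sketch.
fn toy_skeleton(input: &str, table: &HashMap<char, &'static str>) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match table.get(&c) {
            Some(prototype) => out.push_str(prototype),
            None => out.push(c),
        }
    }
    out
}

fn main() {
    let table = toy_prototypes();
    // The styled "Rust" from the examples above, written with escapes.
    let fancy = "\u{211D}\u{1D4CA}\u{1D4C8}\u{1D4C9}";

    assert_eq!(toy_skeleton(fancy, &table), "Rust");
    // Plain ASCII has no table entries here and passes through untouched.
    assert_eq!(toy_skeleton("Rust", &table), "Rust");
}
```

Prototypes are stored as strings rather than single characters because a confusable character can map to a sequence of several characters in the Unicode data.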
Dependencies
~790KB
~40K SLoC