7 releases

| Version | Released |
|---|---|
| 0.0.7 (new) | Dec 8, 2024 |
| 0.0.6 | Dec 8, 2024 |
| 0.0.5 | Jun 2, 2023 |
| 0.0.4 | Oct 1, 2022 |
| 0.0.2 | Feb 15, 2021 |
roe
Implements Unicode case mapping for conventionally UTF-8 binary strings.
Case mapping or case conversion is a process whereby strings are converted to a particular form—uppercase, lowercase, or titlecase—possibly for display to the user.
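To make the three forms concrete, here is a minimal sketch that uses only the Rust standard library, not roe. The helper `capitalize_first` is a hypothetical name introduced for illustration. The standard library exposes no titlecase mapping, so the helper uppercases the first character instead, which only approximates titlecase.

```rust
/// Hypothetical helper: uppercase the first character and lowercase the rest.
/// This approximates titlecase; std has no true titlecase mapping.
fn capitalize_first(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => first
            .to_uppercase()
            .chain(chars.flat_map(char::to_lowercase))
            .collect(),
        None => String::new(),
    }
}

fn main() {
    // Case mapping can change a string's length: U+00DF uppercases to "SS".
    assert_eq!("stra\u{DF}e".to_uppercase(), "STRASSE");
    assert_eq!("ARTICHOKE".to_lowercase(), "artichoke");
    assert_eq!(capitalize_first("rUBY"), "Ruby");
}
```

Because a single character can map to several, case mapping is naturally expressed as an iterator over output units rather than an in-place edit.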
roe can convert conventionally UTF-8 binary strings to capitalized, lowercase, and uppercase forms. This crate is used to implement String#capitalize, Symbol#capitalize, String#downcase, Symbol#downcase, String#upcase, and Symbol#upcase in Artichoke Ruby.
This crate depends on bstr.
Implementation
Roe generates conversion tables from Unicode Data Files. Roe implements case mapping as defined in the Unicode standard (see PropList.txt, SpecialCasing.txt, UnicodeData.txt).
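As a rough illustration of what table generation from these files can involve, the sketch below parses one UnicodeData.txt record into its simple case mappings. It is not Roe's generator: the names `SimpleMapping`, `parse_hex`, and `parse_record` are hypothetical, and the code assumes the standard 15-field, semicolon-delimited layout in which fields 12, 13, and 14 hold the simple uppercase, lowercase, and titlecase mappings.

```rust
/// Simple case mappings for one code point, as listed in UnicodeData.txt.
#[derive(Debug, PartialEq)]
struct SimpleMapping {
    code_point: u32,
    upper: Option<u32>,
    lower: Option<u32>,
    title: Option<u32>,
}

/// Parse a hexadecimal code point field; an empty field means "no mapping".
fn parse_hex(field: &str) -> Option<u32> {
    if field.is_empty() {
        None
    } else {
        u32::from_str_radix(field, 16).ok()
    }
}

/// Hypothetical helper: parse one semicolon-delimited UnicodeData.txt record.
fn parse_record(line: &str) -> Option<SimpleMapping> {
    let fields: Vec<&str> = line.split(';').collect();
    if fields.len() != 15 {
        return None;
    }
    Some(SimpleMapping {
        code_point: parse_hex(fields[0])?,
        upper: parse_hex(fields[12]),
        lower: parse_hex(fields[13]),
        title: parse_hex(fields[14]),
    })
}

fn main() {
    // The UnicodeData.txt record for U+0061 LATIN SMALL LETTER A.
    let line = "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041";
    let mapping = parse_record(line).expect("well-formed record");
    assert_eq!(mapping.code_point, 0x61);
    assert_eq!(mapping.upper, Some(0x41));
    assert_eq!(mapping.lower, None);
    assert_eq!(mapping.title, Some(0x41));
}
```

UnicodeData.txt covers only one-to-one mappings. Expansions and language-specific rules live in SpecialCasing.txt, which this sketch does not read.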
Status
This crate is currently a work in progress. When the API is complete, Roe will support lowercase, uppercase, titlecase, and case folding iterators for conventionally UTF-8 byte slices.
Roe will implement support for full, Turkic, ASCII, and case folding transforms.
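The difference between these modes can be sketched with the standard library alone. The functions `ascii_lowercase` and `turkic_lowercase_char` below are hypothetical illustrations, not Roe's API, and the Turkic rule is simplified to the two dotted and dotless i mappings without any context handling.

```rust
/// ASCII mode sketch: only bytes A-Z change. Every other byte passes through,
/// so non-ASCII text and invalid UTF-8 are preserved untouched.
fn ascii_lowercase(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().map(u8::to_ascii_lowercase).collect()
}

/// Turkic mode sketch (simplified): dotted and dotless i are distinct letters.
fn turkic_lowercase_char(c: char) -> String {
    match c {
        'I' => "\u{131}".to_string(),  // U+0049 -> U+0131 (dotless i)
        '\u{130}' => "i".to_string(),  // U+0130 -> U+0069
        other => other.to_lowercase().collect(),
    }
}

fn main() {
    // ASCII mode leaves the invalid byte 0xFF and all non-ASCII bytes alone.
    assert_eq!(ascii_lowercase(b"Artichoke \xFF"), b"artichoke \xFF");

    // Full mode, shown here through std, maps non-ASCII letters as well.
    assert_eq!("\u{391}\u{392}\u{393}".to_lowercase(), "\u{3B1}\u{3B2}\u{3B3}");

    // The default full mapping of U+0130 keeps the dot as a combining mark,
    // while the Turkic rule yields a plain "i".
    assert_eq!('\u{130}'.to_lowercase().to_string(), "i\u{307}");
    assert_eq!(turkic_lowercase_char('\u{130}'), "i");
    assert_eq!(turkic_lowercase_char('I'), "\u{131}");
}
```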
Usage
Add this to your Cargo.toml:
[dependencies]
roe = "0.0.7"
Then convert case like:
use roe::{LowercaseMode, UppercaseMode, TitlecaseMode};

assert_eq!(
    roe::lowercase(b"Artichoke Ruby", LowercaseMode::Ascii).collect::<Vec<_>>(),
    b"artichoke ruby"
);
assert_eq!(
    roe::uppercase("Αύριο".as_bytes(), UppercaseMode::Full).collect::<Vec<_>>(),
    "ΑΎΡΙΟ".as_bytes()
);
assert_eq!(
    roe::titlecase("ffi".as_bytes(), TitlecaseMode::Full).collect::<Vec<_>>(),
    "Ffi".as_bytes()
);
Crate Features
roe is no_std compatible with an optional dependency on the alloc crate.
roe has several Cargo features, all of which are enabled by default:
- std - Adds a dependency on std, the Rust Standard Library. This feature enables std::error::Error implementations on error types in this crate. Enabling the std feature also enables the alloc feature.
- alloc - Adds a dependency on alloc, the Rust allocation and collections library. This feature enables APIs that allocate String or Vec.
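Since both features are on by default, a no_std build has to opt out of them explicitly. Assuming standard Cargo feature syntax, a dependency entry that drops std but keeps the allocating APIs might look like this:

```toml
[dependencies]
# Disable the default features (std and alloc), then re-enable only alloc.
roe = { version = "0.0.7", default-features = false, features = ["alloc"] }
```

Leaving out the features list entirely would drop the alloc dependency too, which removes the APIs that allocate String or Vec.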
Unicode Version
Roe implements Unicode case mapping with the Unicode 16.0.0 case mapping ruleset.
Each new release of Unicode may bring updates to the Data Files, which are the source for the case mappings in this crate. Updates to the case mapping rules will be accompanied by a minor version bump.
License
roe is licensed under the MIT License (c) Ryan Lopopolo.
roe includes Unicode Data Files which are subject to the Unicode Terms of Use and Unicode License v3 (c) 1991-2024 Unicode, Inc.