3 releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.0.3 | Aug 25, 2025 |
| 0.0.2 | Aug 25, 2025 |
| 0.0.1 | Aug 23, 2025 |
#1464 in Text processing
58 downloads per month
9KB
161 lines
RFC9839-rs
A Rust implementation of RFC9839 that tests for problematic Unicode code points
Inspired by the Go implementation https://github.com/timbray/RFC9839/tree/main
What is RFC9839
RFC9839 defines three character classes, each a subset of the Unicode code points:
- Unicode Scalars: Any Unicode code point except high-surrogate and low-surrogate code points.
- XML Characters: Unicode code points excluding surrogates, legacy C0 controls, and the noncharacters U+FFFE and U+FFFF.
- Unicode Assignables: Unicode code points that are not problematic. This class, a proper subset of each of the others, comprises all code points that are currently assigned, excluding legacy control codes, or that might be assigned in the future.
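The three classes can be sketched as range checks on a `u32` code point. The function names below are hypothetical and are not taken from this crate's API. The ranges follow my reading of the ABNF in RFC9839, where the useful controls tab, line feed, and carriage return stay allowed; treat that as an assumption and check it against the RFC text.

```rust
// Illustrative sketch only: these names are assumptions, not this crate's API.

/// Unicode Scalars: every code point up to U+10FFFF except the surrogates.
pub const fn is_unicode_scalar(cp: u32) -> bool {
    cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF)
}

/// XML Characters: no surrogates, no C0 controls other than tab, LF, and CR,
/// and no U+FFFE or U+FFFF.
pub const fn is_xml_char(cp: u32) -> bool {
    matches!(
        cp,
        0x9 | 0xA | 0xD | 0x20..=0xD7FF | 0xE000..=0xFFFD | 0x10000..=0x10FFFF
    )
}

/// Unicode Assignables: additionally drops DEL, the C1 controls, and all
/// noncharacters (U+FDD0..U+FDEF and the last two code points of each plane).
pub const fn is_unicode_assignable(cp: u32) -> bool {
    match cp {
        0x9 | 0xA | 0xD => true,
        0x20..=0x7E | 0xA0..=0xD7FF => true,
        0xE000..=0xFDCF | 0xFDF0..=0xFFFD => true,
        // Planes 1 through 16: reject the two trailing noncharacters per plane.
        0x10000..=0x10FFFF => (cp & 0xFFFF) < 0xFFFE,
        _ => false,
    }
}
```

Under this sketch, every value accepted by `is_unicode_assignable` is also accepted by `is_xml_char`, and every value accepted by `is_xml_char` is also accepted by `is_unicode_scalar`, which matches the subset relationship described above.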
Why this crate
- `no_std`: This crate does not make any allocations and thus can be used on embedded systems.
- `const fn`: Functions checking individual characters can all be called from a `const` context.
- Well tested: Every character class is checked against the full `u32` range of possible values.
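As a rough illustration of the last two points, the sketch below defines a hypothetical `const fn` checker (the name is an assumption, not this crate's API), evaluates it at compile time, and sweeps a range of `u32` values against the standard library's `char::from_u32`, which accepts exactly the Unicode scalar values. It uses only `core`, so it needs no allocator.

```rust
use core::ops::RangeInclusive;

/// Hypothetical checker; the name and signature are assumptions for illustration.
pub const fn is_scalar_value(cp: u32) -> bool {
    cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF)
}

// Because the checker is a `const fn`, these are computed at compile time.
pub const LETTER_A_OK: bool = is_scalar_value('A' as u32);
pub const SURROGATE_OK: bool = is_scalar_value(0xD800);

/// Counts the values in `range` on which the checker disagrees with
/// `char::from_u32`. A correct checker yields zero.
pub fn count_mismatches(range: RangeInclusive<u32>) -> usize {
    range
        .filter(|&cp| is_scalar_value(cp) != char::from_u32(cp).is_some())
        .count()
}
```

Calling `count_mismatches(0..=u32::MAX)` covers every possible input. That sweep visits about 4.3 billion values, so it is best run in a release build.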