1 stable release

1.0.0 May 21, 2019

#1164 in Text processing

Download history (weekly, 2023-12-05 to 2024-03-19): between 2,261 and 5,553 downloads per week

20,117 downloads per month
Used in 2 crates

MIT/Apache

14KB
270 lines

detone

docs.rs Apache 2 / MIT dual-licensed

An iterator adapter that takes an iterator over char whose output is in Normalization Form C (this precondition is not checked!) and yields chars in one of two modes: either only the tone marks that wouldn't otherwise fit into windows-1258 are decomposed, or the text is decomposed into orthographic units.
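
To make the two modes concrete, here is a minimal, self-contained sketch of the idea using only the standard library. It is not this crate's implementation or API: the names TABLE, DecomposeTones and decompose_tones are hypothetical, and the three-entry table is a stand-in for the full Vietnamese mapping. It relies on the fact that windows-1258 has a byte for á (U+00E1) but none for ộ (U+1ED9) or ế (U+1EBF), while it does have bytes for ô, ê and the combining tone marks.

```rust
/// Hypothetical, deliberately tiny table (an assumption for illustration):
/// (precomposed char, base char, combining tone mark,
///  whether the precomposed char has its own windows-1258 byte).
const TABLE: &[(char, char, char, bool)] = &[
    ('\u{00E1}', 'a', '\u{0301}', true),         // á = a + acute
    ('\u{1ED9}', '\u{00F4}', '\u{0323}', false), // ộ = ô + dot below
    ('\u{1EBF}', '\u{00EA}', '\u{0301}', false), // ế = ê + acute
];

/// Iterator adapter that splits tone marks off precomposed characters.
struct DecomposeTones<I: Iterator<Item = char>> {
    inner: I,
    /// Tone mark waiting to be yielded after its base character.
    pending: Option<char>,
    /// true: decompose every table entry; false: only those that
    /// do not fit into windows-1258 as a single character.
    orthographic: bool,
}

impl<I: Iterator<Item = char>> Iterator for DecomposeTones<I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Emit a buffered tone mark before pulling more input.
        if let Some(mark) = self.pending.take() {
            return Some(mark);
        }
        let c = self.inner.next()?;
        for &(precomposed, base, mark, fits) in TABLE {
            if c == precomposed && (self.orthographic || !fits) {
                self.pending = Some(mark);
                return Some(base);
            }
        }
        // Anything not in the table passes through unchanged.
        Some(c)
    }
}

/// Wraps an iterator over char (assumed to be in NFC, unchecked).
fn decompose_tones<I: Iterator<Item = char>>(inner: I, orthographic: bool) -> DecomposeTones<I> {
    DecomposeTones { inner, pending: None, orthographic }
}
```

With this sketch, calling decompose_tones("\u{00E1} \u{1ED9}".chars(), false) leaves á intact and splits ộ into ô followed by the combining dot below, while passing true also splits á into a followed by the combining acute. Buffering a single pending mark is enough here because each table entry expands to exactly two chars.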

Use cases include preprocessing before encoding Vietnamese text into windows-1258 or converting precomposed Vietnamese text into a form that looks like it was written with the (non-IME) Vietnamese keyboard layout (e.g. for machine learning training or benchmarking purposes).

Licensing

Please see the file named COPYRIGHT.

Documentation

Generated API documentation is available online.

Release Notes

1.0.0

  • Initial release.

No runtime dependencies