1 stable release

1.0.0 May 21, 2019

#1052 in Text processing


21,618 downloads per month
Used in 2 crates

MIT/Apache

14KB
270 lines

detone


An iterator adapter that wraps an iterator over char. The input must yield a sequence of chars in Normalization Form C (this precondition is not checked!). Depending on the mode, the adapter yields chars such that either only the tone marks that wouldn't otherwise fit into windows-1258 are decomposed, or the text is decomposed into orthographic units.

Use cases include preprocessing before encoding Vietnamese text into windows-1258 or converting precomposed Vietnamese text into a form that looks like it was written with the (non-IME) Vietnamese keyboard layout (e.g. for machine learning training or benchmarking purposes).
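
To make the idea of such an adapter concrete, here is a minimal, self-contained sketch of an iterator adapter that splits a precomposed letter into a base letter plus a combining tone mark. It is not this crate's implementation or API: the names split_tone, DecomposeTones and decompose_tones are hypothetical, and the mapping table covers only four letters chosen for illustration. A real table would have to cover all of Vietnamese.

```rust
/// Hypothetical helper: maps a few precomposed Vietnamese letters to a
/// base letter plus a combining tone mark. The table is illustrative
/// only and far from complete.
fn split_tone(c: char) -> Option<(char, char)> {
    match c {
        '\u{1EA3}' => Some(('a', '\u{0309}')),        // a with hook above
        '\u{1EBF}' => Some(('\u{00EA}', '\u{0301}')), // e-circumflex with acute
        '\u{1EC7}' => Some(('\u{00EA}', '\u{0323}')), // e-circumflex with dot below
        '\u{1EDB}' => Some(('\u{01A1}', '\u{0301}')), // o-horn with acute
        _ => None,
    }
}

/// Hypothetical adapter: wraps any iterator over char and buffers at
/// most one pending combining mark between calls to next().
struct DecomposeTones<I: Iterator<Item = char>> {
    inner: I,
    pending: Option<char>,
}

impl<I: Iterator<Item = char>> Iterator for DecomposeTones<I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Emit the tone mark left over from the previous letter first.
        if let Some(mark) = self.pending.take() {
            return Some(mark);
        }
        let c = self.inner.next()?;
        match split_tone(c) {
            Some((base, mark)) => {
                self.pending = Some(mark);
                Some(base)
            }
            // Anything not in the table passes through unchanged.
            None => Some(c),
        }
    }
}

fn decompose_tones<I: Iterator<Item = char>>(inner: I) -> DecomposeTones<I> {
    DecomposeTones { inner, pending: None }
}

fn main() {
    // "Việt" with a precomposed U+1EC7 as the third char.
    let out: String = decompose_tones("Vi\u{1EC7}t".chars()).collect();
    assert_eq!(out, "Vi\u{00EA}\u{0323}t");
    println!("{}", out.escape_unicode());
}
```

Note that the split used here (U+00EA followed by U+0323) differs from the canonical Unicode decomposition of U+1EC7, which places the dot below on the plain letter before the circumflex. This is one reason a dedicated adapter is useful rather than generic normalization.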

Licensing

Please see the file named COPYRIGHT.

Documentation

Generated API documentation is available online.

Release Notes

1.0.0

  • Initial release.

No runtime deps