2 stable releases

1.0.1 Jul 4, 2024
1.0.0 May 21, 2019

#290 in Text processing


20,754 downloads per month
Used in 2 crates

Apache-2.0 OR MIT

18KB
269 lines

detone


An iterator adapter that wraps an iterator over char. The input must be a sequence of chars in Normalization Form C (this precondition is not checked!). The adapter yields chars in one of two modes: either only the tone marks that wouldn't otherwise fit into windows-1258 are decomposed, or the text is decomposed into orthographic units.

Use cases include preprocessing before encoding Vietnamese text into windows-1258 or converting precomposed Vietnamese text into a form that looks like it was written with the (non-IME) Vietnamese keyboard layout (e.g. for machine learning training or benchmarking purposes).
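
To make the adapter idea concrete, here is a minimal standard-library-only sketch of an iterator adapter over char that splits a tone mark off a precomposed character. It is not this crate's implementation or API: the names split_tone, SplitTones, and split_tones are hypothetical, and the three-entry table is only a sample of characters that have no single-byte form in windows-1258. See the generated API documentation for the crate's actual interface.

```rust
// Hypothetical illustration only: NOT the detone crate's code or API.
// The table below is a tiny sample, not a complete mapping.

/// Returns (base, combining tone mark) for a few precomposed characters
/// that windows-1258 cannot represent as a single byte.
fn split_tone(c: char) -> Option<(char, char)> {
    match c {
        '\u{1EA3}' => Some(('a', '\u{0309}')),        // a with hook above -> a + combining hook above
        '\u{1EBF}' => Some(('\u{00EA}', '\u{0301}')), // e with circumflex and acute -> e-circumflex + combining acute
        '\u{1ED9}' => Some(('\u{00F4}', '\u{0323}')), // o with circumflex and dot below -> o-circumflex + combining dot below
        _ => None,
    }
}

/// Adapter that wraps any iterator over char and yields chars.
struct SplitTones<I: Iterator<Item = char>> {
    inner: I,
    // Tone mark waiting to be yielded after its base character.
    pending: Option<char>,
}

impl<I: Iterator<Item = char>> Iterator for SplitTones<I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Emit a buffered tone mark before pulling more input.
        if let Some(mark) = self.pending.take() {
            return Some(mark);
        }
        let c = self.inner.next()?;
        match split_tone(c) {
            Some((base, mark)) => {
                self.pending = Some(mark);
                Some(base)
            }
            // Characters outside the sample table pass through unchanged.
            None => Some(c),
        }
    }
}

/// Convenience constructor for the hypothetical adapter.
fn split_tones<I: Iterator<Item = char>>(inner: I) -> SplitTones<I> {
    SplitTones { inner, pending: None }
}
```

Under these assumptions, split_tones("t\u{1EBF}t".chars()) yields t, U+00EA, U+0301, t. Buffering a single pending char keeps the adapter lazy and allocation-free, which is the usual reason to write this kind of transformation as an iterator adapter instead of building a new String.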

Licensing

Please see the file named COPYRIGHT.

Documentation

Generated API documentation is available online.

MSRV

Rust 1.60 to use the crate, 1.67 to run its tests. If you need an even lower MSRV, pin version 1.0.0 of this crate; version 1.0.1 contains no non-test changes.

Release Notes

1.0.1

  • Updated metadata, internal documentation, and the dev dependency.
  • No non-test code changes.

1.0.0

  • Initial release.

No runtime deps