#encoding #utf-8 #unicode #iterator

utf8_iter

Iterator by char over potentially-invalid UTF-8 in &[u8]

4 stable releases

Uses the Rust 2021 edition

1.0.3 Sep 9, 2022
1.0.1 Jul 19, 2022
1.0.0 Apr 19, 2022

#171 in Text processing

5,659 downloads per month
Used in 10 crates (3 directly)

Apache-2.0 OR MIT

17KB
263 lines

utf8_iter

utf8_iter provides iteration by char over potentially-invalid UTF-8 &[u8] such that UTF-8 errors are handled according to the WHATWG Encoding Standard.
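
As an illustration of what this error handling produces, the sketch below uses only the Rust standard library, not the utf8_iter API. It relies on String::from_utf8_lossy, which is understood to follow the same policy of emitting one U+FFFD REPLACEMENT CHARACTER per maximal invalid byte sequence. The helper name lossy_chars is hypothetical, and unlike this crate's iterator the helper allocates an intermediate string.

```rust
/// Decode potentially-invalid UTF-8 into a Vec<char>, replacing each
/// invalid byte sequence with U+FFFD.
///
/// Hypothetical helper for illustration only: it uses the standard
/// library, NOT utf8_iter, and it allocates, so it is not no_std.
fn lossy_chars(bytes: &[u8]) -> Vec<char> {
    // from_utf8_lossy borrows when the input is valid and builds a new
    // String with U+FFFD substitutions when it is not.
    String::from_utf8_lossy(bytes).chars().collect()
}
```

With this helper, lossy_chars(b"a\xFFb") yields 'a', U+FFFD, 'b', because 0xFF can never appear in UTF-8. A truncated three-byte sequence such as b"b\xE2\x98c" yields 'b', a single U+FFFD, 'c', because the two bytes 0xE2 0x98 form one incomplete sequence and are replaced together.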

Key parts of the code are copied from the UTF-8 to UTF-16 conversion code in encoding_rs, which was optimized for speed in the case of valid input. The implementation here keeps the structure that was found to be fast in the encoding_rs context, but that structure hasn't been benchmarked in this context.

This is a no_std crate.

Licensing

TL;DR: Apache-2.0 OR MIT

Please see the file named COPYRIGHT.

Documentation

Generated API documentation is available online.

Release Notes

1.0.3

  • Fix an error in documentation.

1.0.2

  • char_indices() implementation.

1.0.1

  • as_slice() method.
  • Implement DoubleEndedIterator.

1.0.0

The initial release.

No runtime deps