
strip_bom

Add a simple BOM stripping feature for str and String

1 stable release

1.0.0 Sep 10, 2020

#1998 in Encoding


26,296 downloads per month
Used in 2 crates

MIT license

4KB

Usage

use strip_bom::*;
// Or std::fs::read_to_string, surf::get, ...
let my_string: Vec<u8> = vec![ 0xefu8, 0xbb, 0xbf, 0xf0, 0x9f, 0x8d, 0xa3 ];
let my_string: String  = String::from_utf8( my_string ).unwrap();

// At this point, my_string still has the BOM => true 🍣
println!( "{} {}", my_string.starts_with("\u{feff}"), &my_string );

// Strip BOM
let my_string: &str = my_string.strip_bom();

// my_string (now a slice) no longer has the BOM => false 🍣
println!( "{} {}", my_string.starts_with("\u{feff}"), &my_string );
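For readers curious how such a method could work, here is a minimal sketch using only the standard library. It is not the crate's actual source: the trait name `StripBomSketch` and the method name `strip_bom_sketch` are hypothetical, chosen so they do not clash with the real API. It relies on `str::strip_prefix`, which returns `None` when the prefix is absent.

```rust
/// Hypothetical stand-in for the crate's trait; the names are assumptions.
trait StripBomSketch {
    /// Returns the text without a leading U+FEFF, or the text unchanged.
    fn strip_bom_sketch(&self) -> &str;
}

impl StripBomSketch for str {
    fn strip_bom_sketch(&self) -> &str {
        // U+FEFF is encoded as the bytes EF BB BF in UTF-8.
        self.strip_prefix('\u{feff}').unwrap_or(self)
    }
}

fn main() {
    // Same bytes as in the usage example: a UTF-8 BOM followed by one emoji.
    let with_bom =
        String::from_utf8(vec![0xef, 0xbb, 0xbf, 0xf0, 0x9f, 0x8d, 0xa3]).unwrap();

    // String derefs to str, so the method is available on both types.
    let stripped: &str = with_bom.strip_bom_sketch();

    assert!(with_bom.starts_with('\u{feff}'));
    assert!(!stripped.starts_with('\u{feff}'));

    // Text without a BOM is returned unchanged.
    assert_eq!("plain".strip_bom_sketch(), "plain");
}
```

Returning a borrowed `&str` rather than a new `String` avoids an allocation, which matches the slice-returning style of the usage example above.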

Motivation

  1. The author wanted a simple, lightweight BOM stripper for str and String only, not for byte streams or for encodings other than UTF-8, such as UTF-16 or UTF-32.
  2. For example, serde and serde_json do not handle a BOM, so parsing fails when the source begins with a UTF-8 BOM.
  3. The Rust standard library's str and String will not support BOM stripping; see also https://github.com/rust-lang/rfcs/issues/2428.
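The failure described in point 2 can be illustrated without any external crates. The sketch below uses the standard library's integer parser as a stand-in for a strict parser such as serde_json; it does not call serde_json itself. The helper `parse_after_bom` is hypothetical. The example also shows that `trim_start` does not help, because U+FEFF is not classified as whitespace.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parse an integer from text that may begin with a BOM.
fn parse_after_bom(input: &str) -> Result<i32, ParseIntError> {
    // Drop a leading U+FEFF if present, then parse the remainder.
    input
        .strip_prefix('\u{feff}')
        .unwrap_or(input)
        .trim()
        .parse::<i32>()
}

fn main() {
    let source = "\u{feff}42";

    // The BOM is not whitespace, so trimming leaves it in place.
    assert!(source.trim_start().starts_with('\u{feff}'));

    // A strict parser rejects the unexpected leading character.
    assert!(source.parse::<i32>().is_err());

    // After stripping the BOM, the same input parses.
    assert_eq!(parse_after_bom(source), Ok(42));
}
```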

Reference

License

Author

No runtime deps