1 stable release: 1.0.0 (Sep 10, 2020)
strip_bom
Adds a simple BOM-stripping feature to str and String.
Usage
use str_strip_bom::*;
// Or std::fs::read_to_string, surf::get, ...
let my_string: Vec<u8> = vec![ 0xefu8, 0xbb, 0xbf, 0xf0, 0x9f, 0x8d, 0xa3 ];
let my_string: String = String::from_utf8( my_string ).unwrap();
// At this point, my_string has the BOM => true 🍣
println!( "{} {}", my_string.starts_with("\u{feff}"), &my_string );
// Strip BOM
let my_string: &str = my_string.strip_bom();
// my_string (now a slice) no longer has the BOM => false 🍣
println!( "{} {}", my_string.starts_with("\u{feff}"), &my_string );
Motivation
- The author wanted a simple, lightweight BOM stripper for str and String only, not for byte streams or for encodings other than UTF-8, such as UTF-16 or UTF-32.
- For example, serde and serde_json have no BOM support, so parsing fails when the input is a UTF-8 source that starts with a BOM.
- The Rust standard library's str and String will not support a BOM-stripping feature; see also https://github.com/rust-lang/rfcs/issues/2428.
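To illustrate the idea of stripping a BOM from str and String using only the standard library, here is a minimal sketch. The trait name StripBomSketch and the method strip_bom_sketch are hypothetical names chosen for this example. They are not this crate's API, and the crate's actual implementation may differ.

```rust
/// Hypothetical extension trait for illustration only (not this crate's API).
trait StripBomSketch {
    /// Returns the text without a leading U+FEFF, or the text unchanged.
    fn strip_bom_sketch(&self) -> &str;
}

impl StripBomSketch for str {
    fn strip_bom_sketch(&self) -> &str {
        // In UTF-8, U+FEFF is encoded as the three bytes EF BB BF.
        // strip_prefix returns None when the prefix is absent,
        // so fall back to the original slice in that case.
        self.strip_prefix('\u{feff}').unwrap_or(self)
    }
}

fn main() {
    // A String is usable too, because method calls auto-deref to str.
    let with_bom: String = String::from("\u{feff}{\"key\": 1}");
    let without_bom: &str = with_bom.strip_bom_sketch();

    assert!(with_bom.starts_with('\u{feff}'));
    assert!(!without_bom.starts_with('\u{feff}'));
    println!("{}", without_bom);
}
```

Returning a borrowed slice avoids copying the text, so the result can be handed straight to a parser that rejects a leading BOM.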
Reference
- https://tools.ietf.org/html/rfc3629; RFC 3629 "UTF-8, a transformation format of ISO 10646" §6. Byte order mark (BOM)
License
Author
- USAGI.NETWORK / Usagi Ito https://usagi.network