13 releases (stable)

2.0.3 Nov 13, 2023
2.0.2 Feb 11, 2023
1.1.4 Feb 4, 2019
1.1.3 Oct 21, 2018
0.1.2 Oct 12, 2018

unicode-bom

Unicode byte-order mark detection for Rust projects.

What does it do?

unicode-bom will read the first few bytes from an array or a file on disk, then determine whether a byte-order mark is present.
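To make the idea concrete, here is a minimal standard-library sketch of prefix-based BOM detection for the UTF-8, UTF-16 and UTF-32 marks. It is an illustration only, not the crate's implementation: `SketchBom` and `sketch_detect_bom` are hypothetical names invented for this example, and the sketch covers fewer encodings than the crate does.

```rust
/// Hypothetical stand-in used only for this illustration;
/// it is not the crate's `Bom` type.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SketchBom {
    None,
    Utf8,
    Utf16Be,
    Utf16Le,
    Utf32Be,
    Utf32Le,
}

/// Returns the detected mark together with its length in bytes.
fn sketch_detect_bom(bytes: &[u8]) -> (SketchBom, usize) {
    // The four-byte signatures are tested first because the UTF-32 LE
    // mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
    if bytes.starts_with(&[0x00, 0x00, 0xFE, 0xFF]) {
        (SketchBom::Utf32Be, 4)
    } else if bytes.starts_with(&[0xFF, 0xFE, 0x00, 0x00]) {
        (SketchBom::Utf32Le, 4)
    } else if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        (SketchBom::Utf8, 3)
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        (SketchBom::Utf16Be, 2)
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        (SketchBom::Utf16Le, 2)
    } else {
        (SketchBom::None, 0)
    }
}
```

Only the leading bytes are inspected, so the order of the checks matters: testing the two-byte UTF-16 LE mark first would misreport UTF-32 LE input.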

What doesn't it do?

It won't check the rest of the data to determine whether it's actually valid according to the indicated encoding.
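Validation is therefore left to the caller. As a rough sketch of what that can look like for UTF-8 input, the following uses only the standard library: it drops a leading UTF-8 BOM if one is present, then validates the remaining bytes. `sketch_decode_utf8` and `UTF8_BOM` are hypothetical names for this example and are not part of the crate.

```rust
use std::str::{self, Utf8Error};

/// The three-byte UTF-8 byte-order mark.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper: skips a leading UTF-8 BOM, if any, and then
/// checks that the rest of the input is well-formed UTF-8.
fn sketch_decode_utf8(bytes: &[u8]) -> Result<&str, Utf8Error> {
    // `strip_prefix` returns `None` when the BOM is absent, in which
    // case the input is validated unchanged.
    let body = bytes.strip_prefix(&UTF8_BOM[..]).unwrap_or(bytes);
    str::from_utf8(body)
}
```

A BOM followed by malformed bytes still yields an error here, which is exactly the case that BOM detection alone cannot catch.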

How do I install it?

Add it to your dependencies in Cargo.toml:

[dependencies]
unicode-bom = "2"

How do I use it?

See the API docs for details; the gist is as follows:

use unicode_bom::Bom;

// The BOM can be parsed from a file on disk via the `FromStr` trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}

// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
let bom = Bom::from(&bytes[0..]);
assert_eq!(bom, Bom::Utf32Be);
assert_eq!(bom.len(), 4);

How do I set up the build environment?

If you don't already have Rust installed, get that first using rustup:

curl https://sh.rustup.rs -sSf | sh

Then you can build the project:

cargo b

And run the tests:

cargo t

Is there API documentation?

Yes.

Is there a change log?

Yes.

What license is it published under?

Apache-2.0.
