13 releases (stable)

2.0.3 Nov 13, 2023
2.0.2 Feb 11, 2023
1.1.4 Feb 4, 2019
1.1.3 Oct 21, 2018
0.1.2 Oct 12, 2018

#126 in Parser implementations

468,840 downloads per month
Used in 282 crates (9 directly)

Apache-2.0

23KB
369 lines

unicode-bom

Unicode byte-order mark detection for Rust projects.

What does it do?

unicode-bom reads the first few bytes from a byte array or a file on disk, then determines whether a byte-order mark is present and, if so, which encoding it indicates.
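
As a rough illustration of what prefix-based detection involves, the sketch below compares the leading bytes of a slice against a few well-known BOM signatures using only the standard library. It is not the crate's implementation: the `SimpleBom` enum and the `detect` function are hypothetical names, and only the UTF-8, UTF-16 and UTF-32 marks are covered.

```rust
/// Hypothetical, simplified stand-in for the crate's `Bom` type.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SimpleBom {
    Null,
    Utf8,
    Utf16Be,
    Utf16Le,
    Utf32Be,
    Utf32Le,
}

/// Compare the leading bytes against known BOM signatures.
/// The UTF-32 (little-endian) mark FF FE 00 00 begins with the
/// UTF-16 (little-endian) mark FF FE, so the longer one is tested first.
fn detect(bytes: &[u8]) -> SimpleBom {
    if bytes.starts_with(&[0x00, 0x00, 0xFE, 0xFF]) {
        SimpleBom::Utf32Be
    } else if bytes.starts_with(&[0xFF, 0xFE, 0x00, 0x00]) {
        SimpleBom::Utf32Le
    } else if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        SimpleBom::Utf8
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        SimpleBom::Utf16Be
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        SimpleBom::Utf16Le
    } else {
        SimpleBom::Null
    }
}
```

The order of the checks matters whenever one signature is a prefix of another, which is why the four-byte marks are tested before the two-byte ones.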

What doesn't it do?

It won't check the rest of the data to determine whether it's actually valid according to the indicated encoding.
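
To make that limitation concrete, the sketch below uses only the standard library (not this crate) to show that data can begin with a valid UTF-8 byte-order mark and still fail UTF-8 validation. Both helper functions are hypothetical names introduced for illustration.

```rust
/// The three-byte UTF-8 byte-order mark.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper: true if the data starts with the UTF-8 BOM.
fn has_utf8_bom(bytes: &[u8]) -> bool {
    bytes.starts_with(&UTF8_BOM)
}

/// Hypothetical helper: true if the data, with any leading UTF-8 BOM
/// removed, is valid UTF-8.
fn body_is_valid_utf8(bytes: &[u8]) -> bool {
    let body = bytes.strip_prefix(&UTF8_BOM[..]).unwrap_or(bytes);
    std::str::from_utf8(body).is_ok()
}
```

For the input `[0xEF, 0xBB, 0xBF, 0xFF]`, `has_utf8_bom` returns true while `body_is_valid_utf8` returns false, because a lone 0xFF byte is never valid UTF-8. If validity matters, a separate validation step is needed after BOM detection.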

How do I install it?

Add it to your dependencies in Cargo.toml:

[dependencies]
unicode-bom = "2"

How do I use it?

For more detailed information see the API docs, but the general gist is as follows:

use unicode_bom::Bom;

// The BOM can be parsed from a file on disk via the `FromStr` trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}

// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
let bom = Bom::from(&bytes[0..]);
assert_eq!(bom, Bom::Utf32Be);
assert_eq!(bom.len(), 4);

How do I set up the build environment?

If you don't already have Rust installed, get that first using rustup:

curl https://sh.rustup.rs -sSf | sh

Then you can build the project:

cargo b

And run the tests:

cargo t

Is there API documentation?

Yes.

Is there a change log?

Yes.

What license is it published under?

Apache-2.0.

No runtime deps