11 releases

0.3.2 Jul 21, 2022
0.3.1 Jul 21, 2022
0.3.0 Apr 29, 2022
0.2.0 Apr 27, 2022
0.1.9 Sep 26, 2021

#501 in Filesystem

324 downloads per month
Used in passionfruit

MIT/Apache

7.5MB
1.5K SLoC

Contains (DOS exe, 2.5MB) files/2mib.exe, (DOS exe, 325KB) files/hello-world.exe, (JAR file, 60KB) files/hello.jar, (ELF lib, 15KB) files/test-so.so

bindet (binary file type detection)

Fast file type detection. See the crate documentation for details.

Supported file types

  • Zip
  • Rar (Rar 4 and 5)
  • Tar
  • Png
  • Jpg
  • 7-zip
  • Opus
  • Vorbis
  • Mp3
  • Webp
  • Flac
  • Matroska (mkv, mka, mks, mk3d, webm)
  • Wasm
  • Java Class
  • Mach-O
  • Elf (Executable and Linkable Format)
  • Wav
  • Avi
  • Aiff
  • Tiff
  • Sqlite3 (.db)
  • Ico
  • Dalvik
  • Pdf
  • Gif
  • Xcf
  • Scala Tasty
  • Bmp
  • more are on the way

Example:

use std::fs::OpenOptions;
use std::io::{BufReader, ErrorKind};

use bindet::types::FileType;
use bindet::{FileTypeMatch, FileTypeMatches};

fn example() {
    // Open the sample file read-only and wrap it in a buffered reader.
    let file = OpenOptions::new().read(true).open("files/test.tar").unwrap();
    let buf = BufReader::new(file);

    // Map the I/O error to its ErrorKind so the result can be compared with assert_eq!.
    let detect = bindet::detect(buf).map_err(|e| e.kind());

    // A Tar file is expected to be reported as a single, fully matched type.
    let expected: Result<Option<FileTypeMatches>, ErrorKind> = Ok(Some(FileTypeMatches::new(
        vec![FileType::Tar],
        vec![FileTypeMatch::new(FileType::Tar, true)],
    )));

    assert_eq!(detect, expected);
}

False Positives

The magic numbers of some file types consist of human-readable characters. For example, FLAC uses fLaC (0x66 0x4C 0x61 0x43) and PDF uses %PDF- (0x25 0x50 0x44 0x46 0x2D). Because of this, a text file that starts with one of these sequences can be detected as a binary file.
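
To make the problem concrete, here is a minimal std-only sketch of a prefix check against the two signatures quoted above. The helper naive_magic_match is hypothetical and is not part of bindet; it only illustrates why a signature test alone cannot tell a real FLAC or PDF file from a text file that happens to begin with the same characters.

```rust
// Signatures quoted in the text above.
const FLAC_MAGIC: &[u8] = b"fLaC"; // 0x66 0x4C 0x61 0x43
const PDF_MAGIC: &[u8] = b"%PDF-"; // 0x25 0x50 0x44 0x46 0x2D

/// Hypothetical helper (not bindet's API): returns the name of the
/// signature that `data` starts with, if any.
fn naive_magic_match(data: &[u8]) -> Option<&'static str> {
    if data.starts_with(FLAC_MAGIC) {
        Some("flac")
    } else if data.starts_with(PDF_MAGIC) {
        Some("pdf")
    } else {
        None
    }
}
```

Calling naive_magic_match(b"fLaC is a lossless audio codec.\n") yields Some("flac") even though the input is plain text, which is exactly the false positive described here.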

bindet reports such file types with FileTypeMatch::full_match = false. A second step can then take these candidates and validate the prediction against a stricter check of the format specification. At the moment, this second step is only implemented for Zip files.
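
As an illustration of what such a two-step check could look like, the sketch below first tests the Zip local file header signature (PK\x03\x04) and then looks for the End of Central Directory signature (PK\x05\x06) near the end of the data. This is an assumption made for the example, not a description of how bindet validates Zip files; the Candidate struct and detect_zip function are hypothetical names that only mirror the idea of a full_match flag.

```rust
/// Hypothetical result type that mirrors the idea of a `full_match` flag.
#[derive(Debug, PartialEq)]
struct Candidate {
    kind: &'static str,
    full_match: bool,
}

const ZIP_LOCAL_HEADER: &[u8] = b"PK\x03\x04";
const ZIP_EOCD: &[u8] = b"PK\x05\x06";

/// Hypothetical two-step check (not bindet's implementation).
fn detect_zip(data: &[u8]) -> Option<Candidate> {
    // Step 1: cheap signature test on the first bytes.
    if !data.starts_with(ZIP_LOCAL_HEADER) {
        return None;
    }
    // Step 2: the End of Central Directory record is at least 22 bytes long
    // and may be followed by a comment of up to 65535 bytes, so only the
    // tail of the data needs to be searched for its signature.
    let tail_start = data.len().saturating_sub(22 + 65_535);
    let full_match = data[tail_start..]
        .windows(ZIP_EOCD.len())
        .any(|window| window == ZIP_EOCD);
    Some(Candidate { kind: "zip", full_match })
}
```

With this sketch, data that merely begins with PK\x03\x04 is still reported as a Zip candidate, but with full_match set to false, so the caller can decide how much to trust the result. An empty archive, which consists of the End of Central Directory record alone, is deliberately not handled here.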

You can use crates like encoding_rs to determine whether a file is really binary or text.
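
For a rough idea of what such a check involves, here is a std-only heuristic. It is a simplified stand-in and not the encoding_rs API: it only recognises UTF-8 (and therefore ASCII) text, whereas encoding_rs can decode many more encodings. The function name looks_like_text is hypothetical.

```rust
/// Hypothetical heuristic: treats a sample as text when it contains no NUL
/// bytes and is valid UTF-8, allowing for a multi-byte character that was
/// cut off at the end of the sample.
fn looks_like_text(sample: &[u8]) -> bool {
    // NUL bytes are common in binary formats and rare in text files.
    if sample.contains(&0) {
        return false;
    }
    match std::str::from_utf8(sample) {
        Ok(_) => true,
        // error_len() is None when the input ended in the middle of a
        // character, which happens when only a prefix of the file is read.
        Err(err) => err.error_len().is_none(),
    }
}
```

A candidate reported with full_match = false could be passed through a check like this: a sample for which looks_like_text returns true is more likely a text file that happens to start with a magic number than a real binary file.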

Dependencies