11 releases

Version | Released |
---|---|
0.3.2 | Jul 21, 2022 |
0.3.1 | Jul 21, 2022 |
0.3.0 | Apr 29, 2022 |
0.2.0 | Apr 27, 2022 |
0.1.9 | Sep 26, 2021 |
bindet (binary file type detection)
Fast file type detection. Read more in the documentation.
Supported file types
- Zip
- Rar (Rar 4 and 5)
- Tar
- Png
- Jpg
- 7-zip
- Opus
- Vorbis
- Mp3
- Webp
- Flac
- Matroska (mkv, mka, mks, mk3d, webm)
- Wasm
- Java Class
- Mach-O
- Elf (Executable and Linkable Format)
- Wav
- Avi
- Aiff
- Tiff
- Sqlite3 (.db)
- Ico
- Dalvik
- Gif
- Xcf
- Scala Tasty
- Bmp
- more are on the way
Example:
use std::fs::OpenOptions;
use std::io::{BufReader, ErrorKind};

use bindet::types::FileType;
use bindet::{FileTypeMatch, FileTypeMatches};

fn example() {
    // Open the sample archive read-only and wrap it in a buffered reader.
    let file = OpenOptions::new().read(true).open("files/test.tar").unwrap();
    let buf = BufReader::new(file);

    // Map the I/O error to its ErrorKind so the result can be compared with assert_eq!.
    let detect = bindet::detect(buf).map_err(|e| e.kind());

    let expected: Result<Option<FileTypeMatches>, ErrorKind> = Ok(Some(FileTypeMatches::new(
        vec![FileType::Tar],
        vec![FileTypeMatch::new(FileType::Tar, true)],
    )));
    assert_eq!(detect, expected);
}
False Positives
The magic numbers of some file types consist of human-readable characters. For example, FLAC uses fLaC (0x66 0x4C 0x61 0x43) and PDF uses %PDF- (0x25 0x50 0x44 0x46 0x2D). Because of this, a text file that starts with one of these sequences can be detected as a binary file.
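To make the problem concrete, here is a minimal sketch of a bare magic-number prefix check. It uses only the standard library; the helper `starts_with_magic` and the two constants are hypothetical names chosen for illustration and are not part of bindet's API, nor a description of how bindet matches internally.

```rust
/// Magic number for FLAC: the ASCII bytes "fLaC".
pub const FLAC_MAGIC: &[u8] = &[0x66, 0x4C, 0x61, 0x43];
/// Magic number for PDF: the ASCII bytes "%PDF-".
pub const PDF_MAGIC: &[u8] = &[0x25, 0x50, 0x44, 0x46, 0x2D];

/// Hypothetical helper: returns true when `data` begins with `magic`.
/// This is a plain prefix comparison and knows nothing about the
/// structure that a real FLAC or PDF file has after the magic number.
pub fn starts_with_magic(data: &[u8], magic: &[u8]) -> bool {
    data.starts_with(magic)
}
```

With this check, a plain text note such as `b"fLaC is a lossless audio codec.\n"` matches `FLAC_MAGIC` just as a real FLAC stream would, which is exactly the kind of false positive described above.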
bindet reports those file types with FileTypeMatch::full_match = false. A second step can take these types and validate
the prediction by applying a stricter specification match; at the moment, however, this only happens for Zip files.
You can use crates like encoding_rs to determine whether a file is really binary or text.
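As a rough illustration of such a text-versus-binary check, the sketch below uses only the standard library rather than encoding_rs, so it is a simpler stand-in and not the approach that crate takes. The function name `looks_like_text` is hypothetical, and the heuristic assumes text means UTF-8 without NUL bytes, which will misclassify other encodings such as UTF-16.

```rust
/// Hypothetical heuristic: treat `sample` as text when it contains no NUL
/// bytes and is valid UTF-8. `sample` is expected to be a prefix of the file.
pub fn looks_like_text(sample: &[u8]) -> bool {
    // NUL bytes are very rare in text but common in binary formats.
    if sample.contains(&0) {
        return false;
    }
    match std::str::from_utf8(sample) {
        Ok(_) => true,
        // `error_len() == None` means the input ended in the middle of a
        // multi-byte character, which is expected when only a prefix was read.
        Err(e) => e.error_len().is_none(),
    }
}
```

A text note that begins with `fLaC` passes this check, while a byte sequence like `fLaC` followed by NUL bytes does not, so the result could be used to discard a match that was reported with `full_match = false`.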