#binary-data #string #extract

bin+lib rust-strings

rust-strings is a library to extract ASCII strings from binary data

5 releases (breaking)

0.6.0 Feb 4, 2024
0.4.0 Jan 7, 2023
0.3.0 May 28, 2022
0.2.0 Feb 25, 2022
0.1.0 Jan 30, 2022

#1354 in Encoding


1,020 downloads per month
Used in 3 crates

MIT license

34KB
741 lines

rust-strings


rust-strings is a Rust library for extracting strings from binary data.
It also has Python bindings.
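
To make "extracting strings" concrete, here is a minimal, std-only sketch of the general technique: scan the bytes and keep every run of printable ASCII that reaches a minimum length, together with its starting offset. This is not the crate's implementation. The function name `ascii_strings`, the `usize` offset type, and the choice of 0x20..=0x7E as "printable" are assumptions made for illustration.

```rust
/// Hypothetical helper (not part of rust-strings): returns every run of
/// printable ASCII bytes that is at least `min_length` long, paired with
/// the byte offset where the run starts.
fn ascii_strings(data: &[u8], min_length: usize) -> Vec<(String, usize)> {
    let mut found = Vec::new();
    let mut start = 0usize;
    let mut current = String::new();

    for (i, &byte) in data.iter().enumerate() {
        // Assumption: "printable" means space (0x20) through tilde (0x7E).
        if (0x20..=0x7e).contains(&byte) {
            if current.is_empty() {
                start = i; // remember where this run begins
            }
            current.push(byte as char);
        } else {
            // A non-printable byte ends the run; keep it only if long enough.
            if current.len() >= min_length {
                found.push((std::mem::take(&mut current), start));
            }
            current.clear();
        }
    }
    // Flush a run that extends to the end of the input.
    if current.len() >= min_length {
        found.push((current, start));
    }
    found
}

fn main() {
    let data = b"\x7fELF\x02\x01\x01\x00/lib64/ld-linux-x86-64.so.2\x00ab\x00";
    for (s, offset) in ascii_strings(data, 3) {
        println!("{offset}: {s}");
    }
}
```

With this input the sketch reports `ELF` at offset 1 (byte 0 is the non-printable 0x7F) and the loader path at offset 8; `ab` is dropped because it is shorter than the minimum length.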

Installation

Python

Use the package manager pip to install rust-strings.

pip install rust-strings

Rust

rust-strings is available on crates.io and can be included in your Cargo-enabled project like this:

[dependencies]
rust-strings = "0.6.0"

Usage

Python

import rust_strings

# Get all ASCII strings from a file with a minimum string length
rust_strings.strings(file_path="/bin/ls", min_length=3)
# [('ELF', 1),
#  ('/lib64/ld-linux-x86-64.so.2', 680),
#  ('GNU', 720),
#  ('.<O', 725),
#  ('GNU', 756),
# ...]

# You can also set the buffer size when reading from a file (default is 1 MB)
rust_strings.strings(file_path="/bin/ls", min_length=5, buffer_size=1024)

# You can set the encoding if needed (default is 'ascii'; other options are 'utf-16le' and 'utf-16be')
rust_strings.strings(file_path=r"C:\Windows\notepad.exe", min_length=5, encodings=["utf-16le"])

# You can set multiple encodings
rust_strings.strings(file_path=r"C:\Windows\notepad.exe", min_length=5, encodings=["ascii", "utf-16le"])

# You can also pass bytes instead of file_path
rust_strings.strings(bytes=b"test\x00\x00", min_length=4, encodings=["ascii"])
# [("test", 0)]

# You can also dump to a JSON file
rust_strings.dump_strings("strings.json", bytes=b"test\x00\x00", min_length=4, encodings=["ascii"])
# `strings.json` content:
# [["test", 0]]

Rust

Full documentation is available on docs.rs.

use rust_strings::{FileConfig, BytesConfig, strings, dump_strings, Encoding};
use std::path::{Path, PathBuf};

// Extract strings of at least 5 characters from a file
let config = FileConfig::new(Path::new("/bin/ls")).with_min_length(5);
let extracted_strings = strings(&config);

// Extract UTF-16LE strings
let config = FileConfig::new(Path::new("C:\\Windows\\notepad.exe"))
    .with_min_length(15)
    .with_encoding(Encoding::UTF16LE);
let extracted_strings = strings(&config);

// Extract ASCII and UTF-16LE strings
let config = FileConfig::new(Path::new("C:\\Windows\\notepad.exe"))
    .with_min_length(15)
    .with_encoding(Encoding::ASCII)
    .with_encoding(Encoding::UTF16LE);
let extracted_strings = strings(&config);

// Extract strings from an in-memory byte buffer
let config = BytesConfig::new(b"test\x00".to_vec());
let extracted_strings = strings(&config);
assert_eq!(vec![(String::from("test"), 0)], extracted_strings.unwrap());

// Dump strings into `strings.json` file.
let config = BytesConfig::new(b"test\x00".to_vec());
dump_strings(&config, PathBuf::from("strings.json"));
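
For readers wondering what the UTF-16LE option looks for, the following std-only sketch illustrates the idea under simplifying assumptions: a UTF-16LE "character" is treated as a printable ASCII byte followed by a zero byte, and runs of such pairs are collected at any byte offset. It is not the crate's implementation, the function name `utf16le_strings` is hypothetical, and the actual library may accept a wider range of code units.

```rust
/// Hypothetical helper (not part of rust-strings): finds runs of
/// ASCII-range UTF-16LE code units (printable byte followed by 0x00)
/// that contain at least `min_length` characters.
fn utf16le_strings(data: &[u8], min_length: usize) -> Vec<(String, usize)> {
    let mut found = Vec::new();
    let mut i = 0;

    while i + 1 < data.len() {
        let start = i;
        let mut current = String::new();

        // Consume consecutive two-byte units of the form [printable, 0x00].
        while i + 1 < data.len()
            && (0x20..=0x7e).contains(&data[i])
            && data[i + 1] == 0
        {
            current.push(data[i] as char);
            i += 2;
        }

        if current.len() >= min_length {
            found.push((current, start));
        }
        if i == start {
            // Nothing matched here; step one byte so that strings
            // starting at odd offsets are also considered.
            i += 1;
        }
    }
    found
}

fn main() {
    // "test" encoded as UTF-16LE, preceded by one unrelated byte.
    let data = b"\x01t\x00e\x00s\x00t\x00\x00\x00";
    for (s, offset) in utf16le_strings(data, 4) {
        println!("{offset}: {s}");
    }
}
```

Stepping one byte at a time after a failed match is a deliberate choice in this sketch: wide strings embedded in binaries are not guaranteed to start on an even offset.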

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Dependencies

~0–5.5MB
~20K SLoC