nightly app updatehashdb

Update an index of the hashes of all files

1 unstable release

0.1.0 Jun 27, 2023

GPL-3.0-or-later

hashfindutils

The hashfindutils provide a hash-oriented version of findutils' locate(1) and updatedb(1) (which are themselves a cached version of findutils' find(1)). updatehashdb creates a database of the hashes of all files (currently SHA-256, but other hash functions will be added), and hashfind then converts a hash to its path. libhashfind provides a library interface.

Installation

To install it you need Rust. The usual recommended way to install Rust is with Rustup, but you can also use the version from your distribution's package manager (for Debian: apt install rustc cargo).

The rest of the installation process depends on whether you want to install it manually or with cargo.

Manual installation

First you have to download hashfindutils' source code:

cd /your/place/for/downloaded/software/source/code
git clone https://codeberg.org/zvavybir/hashfindutils.git
cd hashfindutils

After this, compile it:

cargo build
# cargo build --release # If you want to activate optimizations use this instead,
                        # but I do not recommend it currently.

Now you have to decide whether you want it to run globally or only for your user.

To install it globally, copy the programs to the global directory (this usually requires root; replace "debug" with "release" if you built with optimizations):

cp target/debug/updatehashdb /usr/bin/updatehashdb
cp target/debug/hashfind /usr/bin/hashfind

Otherwise, copy them to your user's bin directory:

cp target/debug/updatehashdb ~/.local/bin/updatehashdb
cp target/debug/hashfind ~/.local/bin/hashfind

Installation with cargo

To install with cargo, run the following commands as the user you want to install it as (root for global, your user for local):

cargo install updatehashdb
cargo install hashfind

Configuring

The configuration file is /etc/hashfindutils (global) or ~/.zvavybir/hashfindutils/config (local). An example configuration file (warning: read the security notices and recommended exclude paths before using it):

db_path=/usr/share/hashfindutils/db
search_path=/
exclude_path=/dev
exclude_path=/proc
exclude_path=/sys

This writes the database to /usr/share/hashfindutils/db (this option could be omitted, since it is the default for root anyway) and indexes all files except those in /dev, /proc and /sys.

The possible options are:

  • db_path: Path to the database (may occur at most once; the default value is /usr/share/hashfindutils/db for root and ~/.zvavybir/hashfindutils/db for non-root users).
  • search_path: A directory to index (can occur multiple times).
  • exclude_path: A directory or file that is not indexed (can, and should, occur multiple times; see the security notices and recommended exclude paths).
  • no_global: When this option is true, hashfind does not use the global database if run as a non-root user. This is not a security feature. The default value is false.
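
As an illustration of the file format described above, the following is a minimal Rust sketch of a parser for it. It is not hashfindutils' actual implementation: the Config struct, the parse_config function, the skipping of blank lines, and the acceptance of only true/false for no_global are all assumptions made for this example.

```rust
use std::path::PathBuf;

/// Hypothetical in-memory form of the configuration. The fields mirror the
/// documented option names, but the struct itself is an assumption.
#[derive(Debug, Default, PartialEq)]
pub struct Config {
    pub db_path: Option<PathBuf>,
    pub search_paths: Vec<PathBuf>,
    pub exclude_paths: Vec<PathBuf>,
    pub no_global: bool,
}

/// Parse `key=value` lines into a `Config`.
/// Blank lines are skipped (an assumption). Unknown keys, a repeated
/// `db_path`, and a non-boolean `no_global` are reported as errors.
pub fn parse_config(text: &str) -> Result<Config, String> {
    let mut cfg = Config::default();
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected key=value", idx + 1))?;
        let value = value.trim();
        match key.trim() {
            "db_path" => {
                // The documentation says db_path may occur at most once.
                if cfg.db_path.is_some() {
                    return Err(format!("line {}: db_path given more than once", idx + 1));
                }
                cfg.db_path = Some(PathBuf::from(value));
            }
            // These two options may be repeated, so they accumulate.
            "search_path" => cfg.search_paths.push(PathBuf::from(value)),
            "exclude_path" => cfg.exclude_paths.push(PathBuf::from(value)),
            "no_global" => {
                cfg.no_global = value
                    .parse::<bool>()
                    .map_err(|_| format!("line {}: no_global must be true or false", idx + 1))?;
            }
            other => return Err(format!("line {}: unknown option {:?}", idx + 1, other)),
        }
    }
    Ok(cfg)
}
```

Given the example configuration above, parse_config would return a Config with one search path (/), three exclude paths, and no_global left at its default of false.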

Generating the database

To use hashfind you first need to generate the database, which acts as a cache (without it the program would be unusably slow). You can run updatehashdb manually or set up a cron job so that the database is updated automatically (warning: read the security notices and recommended exclude paths before running this):

updatehashdb

or

crontab -e

and then append the following line, which runs updatehashdb once at every boot:

@reboot updatehashdb

Usage

Now you can use hashfind [HASH] to search for a hash, e.g.:

hashfind 8663bab6d124806b9727f89bb4ab9db4cbcc3862f6bbf22024dfa7212aa4ab7d
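
Conceptually, such a lookup is a map from a hash to the paths of the files that have it. The sketch below models this in Rust with an in-memory HashMap. It says nothing about the real on-disk database format, and HashIndex, insert and lookup are hypothetical names that are not part of libhashfind.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical index: hex digest -> every path whose content has that digest.
/// Files with identical content share a digest, hence the Vec.
#[derive(Default)]
pub struct HashIndex {
    entries: HashMap<String, Vec<PathBuf>>,
}

impl HashIndex {
    /// Record that the file at `path` has the given digest.
    pub fn insert(&mut self, digest_hex: &str, path: PathBuf) {
        self.entries
            .entry(digest_hex.to_string())
            .or_default()
            .push(path);
    }

    /// Return all known paths for a digest, or an empty slice if there are none.
    pub fn lookup(&self, digest_hex: &str) -> &[PathBuf] {
        self.entries
            .get(digest_hex)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

In this model, the work done by updatehashdb corresponds to the insert calls and a hashfind query corresponds to a single lookup, which is why the search itself does not need to read any file contents.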

Security notices (read before running as root)

There is no mechanism that restricts hash lookups to users who can read the file, so malicious users can find out whether a file with known content exists, and where. Because hash functions are one-way this is not always a problem, but a secret that is part of the path will be exposed, and if the unknown part of a file's content is small, it can be brute-forced.

A workaround is to run updatehashdb not as root but as an unprivileged user that has no access to secret files. ("Secret" here means secret from some user on your system; if there are no other users on your system and you trust all software to be non-malicious and bug-free, this is not a concern.) The drawback is that then even privileged users cannot look up the hashes of the secret files.

To be able to run updatehashdb as a non-root user, you have to create its global data directory as root (replace unprivilegeduser with the user you want to use):

mkdir /usr/share/hashfindutils
chown unprivilegeduser /usr/share/hashfindutils

The default recommended behavior for root is to index all files under /. This is a problem, because indexing every file is neither a good idea nor actually possible. This applies especially to files under /dev (which contains file representations of all your hardware; these can be very big and sometimes, as in the case of input devices, effectively infinite) and under /proc and /sys (virtual file systems that expose kernel functionality as files, some of which, like /proc/kmsg, are also infinite).

To fix this, use the exclude_path option of the configuration file (see the Configuring section for more information).
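
To make the effect of exclude_path concrete, here is a minimal Rust sketch of how an indexer could decide whether to skip a path. It assumes that an excluded path also covers everything below it and that matching is done on whole path components; whether updatehashdb matches in exactly this way is not stated here, and is_excluded is a hypothetical helper.

```rust
use std::path::{Path, PathBuf};

/// Returns true if `candidate` is one of the excluded paths or lies below one.
/// `Path::starts_with` compares whole components, so excluding `/dev`
/// does not exclude an unrelated directory such as `/devices`.
pub fn is_excluded(candidate: &Path, exclude_paths: &[PathBuf]) -> bool {
    exclude_paths.iter().any(|ex| candidate.starts_with(ex))
}
```

With /dev, /proc and /sys in the exclude list, a path such as /dev/input/event0 would be skipped under these assumptions, while /home/user/file would still be indexed.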

Contributing

Any kind of contribution is very welcome! If you have an idea, found a bug or typo, or anything else, feel free to file an issue or make a PR (but please remember that all code has to be licensed under GPLv3+ to be included).
