#sha-1 #orc #dataset #graph #swh #digestmap #swhid

bin+lib swh-digestmap

A tool to quickly convert between content hashes (e.g. SWHID <-> SHA1)

1 unstable release

new 0.1.0 May 15, 2025

GPL-3.0-or-later


swh-digestmap

A tool to create a map from SWHIDs to SHA1, and a Python module to access this map.

Designed after the hash conversion service.
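As background on why a lookup map is needed at all: a content SWHID embeds the git-style blob hash (sha1_git), which is the SHA1 of the bytes prefixed with a "blob <length>\0" header, while the plain SHA1 is computed over the raw bytes alone. Neither digest can be derived from the other without the content itself. The sketch below illustrates the two computations with the Python standard library only; the function names are hypothetical and are not part of swh-digestmap.

```python
import hashlib


def sha1_of(content: bytes) -> str:
    """Plain SHA1 of the raw bytes (the value side of the map)."""
    return hashlib.sha1(content).hexdigest()


def content_swhid_of(content: bytes) -> str:
    """Content SWHID: SHA1 over a git blob header followed by the bytes."""
    header = b"blob %d\0" % len(content)
    return "swh:1:cnt:" + hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    data = b""  # empty content, whose digests are well known
    print(content_swhid_of(data))
    print(sha1_of(data))
```

Because the two digests are computed over different inputs, converting one into the other requires either the original content or a precomputed table such as the one this tool builds.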

Build a digestmap

Build the map from an ORC-exported dataset. Only the content subfolder of the dataset is read.

CONTENT_ORC=$HOME/swh-environment/swh-graph/swh/graph/example_dataset/orc
cargo run --bin swh-digestmap-build --features=build -- --orc $CONTENT_ORC --dir-out dest_folder

Find a SHA1 from a SWHID

cargo run --bin swh-digestmap-map -- dest_folder --swhid swh:1:cnt:0000000000000000000000000000000000000004

Python binding

pip install setuptools-rust build
cd pyo3/
pip install .

Alternatively, build a wheel with pip wheel --no-deps . --wheel-dir dist

Python usage:

from swh.digestmap import DigestMap
digestmap = DigestMap("dest_folder")
digestmap.sha1_from_swhid("swh:1:cnt:0000000000000000000000000000000000000004")
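When SWHIDs come from user input, it can help to check their shape before querying the map. The sketch below is a minimal validator for core content SWHIDs (swh:1:cnt: followed by 40 lowercase hex digits). The helper name is hypothetical and not part of swh.digestmap, and whether the map accepts other object types or qualified SWHIDs is not stated here, so this check assumes it does not.

```python
import re

# Core content SWHID: scheme version 1, object type "cnt", 40 lowercase hex digits.
_CONTENT_SWHID = re.compile(r"swh:1:cnt:[0-9a-f]{40}")


def is_content_swhid(value: str) -> bool:
    """Return True if value looks like a core content SWHID (hypothetical helper)."""
    return _CONTENT_SWHID.fullmatch(value) is not None


# Possible use, assuming a DigestMap built as shown above:
#   if is_content_swhid(user_input):
#       digestmap.sha1_from_swhid(user_input)
```

Validating up front keeps malformed strings away from the lookup call, whose behavior on invalid input is not described in this README.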
