#storage #metadata #back-end #simulation #command #simuldb

app simuldb-utils

Utility to extract information from simuldb databases

25 releases (9 breaking)

0.12.3 Dec 7, 2023
0.11.3 Dec 1, 2023
0.11.0 Nov 30, 2023

#620 in Database interfaces

579 downloads per month

MIT/Apache

110KB
2.5K SLoC

simuldb

This repository contains the simuldb crate as well as the command line tool sdutil, which is provided for convenience.

Library

This library provides backend- and format-agnostic data storage for simulation results, coupled with metadata about the Software used and the simulation Run.

The database does not store the data itself, only the associated metadata.

Currently two backends are included:

  • Json, which saves everything in JSON files
  • Neo4j, which uses a Neo4j database as backend (requires the neo4j feature to be enabled)

Custom backends can be implemented via the Database and DatabaseSession traits. Sessions are meant to associate Datasets with a specific Run of a Software. Datasets are references to data stored in a file of any format.

Minimal Example

This creates a Json-based Database and writes some arbitrary data to it. Note that the vergen_session macro usually suffices to create a session.

use std::io::Write;
use serde::Serialize;
use simuldb::prelude::*;

// Define a metadata type
#[derive(Debug, Serialize)]
struct Metadata {
    a: usize,
    b: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create or open database
    let mut json = Json::new("output/json");

    // Start new session which will contain references to the datasets
    let software = Software::new("example", "1.0", "now");
    let run = Run::new("now");
    let session = Session::new(software, run);
    let mut json_session = json.add_session(session)?;

    // Create a directory for the result data
    std::fs::create_dir_all("output/data")?;

    // Generate some data and add it to the database
    for a in 0_usize..10 {
        // A DatasetWriter can be used to automatically calculate
        // the hash of a file and create a Dataset from it
        let mut writer = DatasetWriter::new("output/data")?;

        // Write some data to the output file
        writeln!(writer, "a^2 = {}", a.pow(2))?;

        // Generate metadata to associate with it
        let metadata = Metadata {
            a,
            b: "squaring".to_string(),
        };

        // Add the corresponding dataset to the database
        let dataset = writer.finalize(metadata)?;
        json_session.add_dataset(&dataset)?;
    }

    Ok(())
}
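
The example above relies on the writer computing a hash while the data is written. As a rough illustration of that idea only, the following sketch wraps any writer and feeds every written byte to a hasher. HashingWriter and hash_bytes are hypothetical names, and std's DefaultHasher is merely a stand-in; this README does not state which hash algorithm simuldb uses or how DatasetWriter is implemented.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, Write};

/// Hypothetical wrapper (not part of simuldb): forwards all writes to
/// `inner` and feeds the same bytes to a hasher.
struct HashingWriter<W: Write> {
    inner: W,
    hasher: DefaultHasher, // stand-in; simuldb's real algorithm is not assumed
}

impl<W: Write> HashingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, hasher: DefaultHasher::new() }
    }

    /// Flush the inner writer and return it together with the digest.
    fn finalize(mut self) -> io::Result<(W, u64)> {
        self.inner.flush()?;
        let digest = self.hasher.finish();
        Ok((self.inner, digest))
    }
}

impl<W: Write> Write for HashingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Hash only the bytes the inner writer actually accepted
        self.hasher.write(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Hash a complete byte slice the same way, e.g. for a later verification.
fn hash_bytes(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

fn main() -> io::Result<()> {
    let mut writer = HashingWriter::new(Vec::new());
    writeln!(writer, "a^2 = {}", 3_usize.pow(2))?;
    let (data, digest) = writer.finalize()?;

    // Re-hashing the stored bytes must reproduce the digest
    assert_eq!(digest, hash_bytes(&data));
    println!("{} bytes written, digest {:016x}", data.len(), digest);
    Ok(())
}
```

Hashing during the write avoids reading the file a second time. A later verification step only needs to re-hash the stored file and compare the result with the recorded digest.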

Python bindings

Python bindings can be generated by using maturin and are also published to PyPI. The documentation can be found here. For more detail see the Python Readme.

Command line utilities

Additionally some command line utilities are provided.

Installation

The utilities can be (un)installed using the provided Makefile. Install them with

make
sudo make install

or optionally into a prefix

make PREFIX=~/.local install

The same applies to uninstalling with make uninstall.

sdutil

Convenience utility to access database information. sdutil itself provides multiple subcommands, but is also implemented as a multicall binary, so symlinking it under a subcommand's short name (sddb or sdtransfer) makes that command available separately.

Convenience utility to access database information

Usage: sdutil [OPTIONS] <COMMAND>

Commands:
  database  Connect to database
  transfer  Transfer data from one backend to another
  help      Print this message or the help of the given subcommand(s)

Options:
  -v, --verbose...
          Output verbosity
          
          Specify multiple times to increase verbosity

  -q, --quiet
          Disable all logging output

  -h, --help
          Print help (see a summary with '-h')
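
A multicall binary, as mentioned above, chooses its behaviour from the name it was invoked under. The following is a minimal sketch of that dispatch, assuming the short names sddb and sdtransfer from the sections below. Applet and applet_for are hypothetical names, and this is not sdutil's actual implementation.

```rust
use std::env;
use std::path::Path;

/// Hypothetical enum describing what the invocation name selects.
#[derive(Debug, PartialEq)]
enum Applet {
    /// Invoked as `sddb`: behave like `sdutil database`
    Database,
    /// Invoked as `sdtransfer`: behave like `sdutil transfer`
    Transfer,
    /// Any other name: expect the subcommand in the arguments
    Multi,
}

/// Map the program name (argv[0]) to an applet.
fn applet_for(argv0: &str) -> Applet {
    // Strip directories and a possible extension such as `.exe`
    let name = Path::new(argv0)
        .file_stem()
        .and_then(|n| n.to_str())
        .unwrap_or(argv0);

    match name {
        "sddb" => Applet::Database,
        "sdtransfer" => Applet::Transfer,
        _ => Applet::Multi,
    }
}

fn main() {
    // The first argument is the path the binary was called with, so a
    // symlink named `sddb` pointing at this binary selects Applet::Database.
    let argv0 = env::args().next().unwrap_or_default();
    println!("{:?}", applet_for(&argv0));
}
```

With this pattern, a single installed binary plus symlinks is enough to expose each subcommand under its own name.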

sdutil database / sddb

Connect to database

Usage: sddb [OPTIONS] <DB> <COMMAND>

Commands:
  list    List session in the database
  verify  Verify hashes of the data files
  help    Print this message or the help of the given subcommand(s)

Arguments:
  <DB>
          Database connection string
          
          Should be one of the following
          - json://PATH for a JSON based database
              PATH denotes the path to the folder containing the JSON files.
          - neo4j://USER:PASS@URI for a Neo4j based database
              URI denotes the connection URI (e.g. localhost:7687).
              USER and PASS are user and password for the connection. If not given, neo4j is used for either of them.
              This can be achieved with neo4j://URI or neo4j://USER@URI.

Options:
  -v, --verbose...
          Output verbosity
          
          Specify multiple times to increase verbosity

  -q, --quiet
          Disable all logging output

  -h, --help
          Print help (see a summary with '-h')
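
The connection string rules from the help text above can be sketched as a small parser. This is an illustration under stated assumptions, not the parser used by sdutil: Connection and parse_connection are hypothetical names, and splitting at the last '@' (so that a password may itself contain '@') is a choice made for this sketch.

```rust
/// Hypothetical representation of a parsed connection string.
#[derive(Debug, PartialEq)]
enum Connection {
    Json { path: String },
    Neo4j { user: String, pass: String, uri: String },
}

/// Parse `json://PATH` or `neo4j://[USER[:PASS]@]URI`.
/// A missing USER or PASS defaults to "neo4j", as the help text describes.
fn parse_connection(s: &str) -> Result<Connection, String> {
    if let Some(path) = s.strip_prefix("json://") {
        return Ok(Connection::Json { path: path.to_string() });
    }

    if let Some(rest) = s.strip_prefix("neo4j://") {
        // Separate optional credentials from the URI at the last '@'
        let (creds, uri) = match rest.rsplit_once('@') {
            Some((creds, uri)) => (Some(creds), uri),
            None => (None, rest),
        };

        // Fill in the defaults for whatever was left out
        let (user, pass) = match creds {
            None => ("neo4j", "neo4j"),
            Some(c) => match c.split_once(':') {
                Some((user, pass)) => (user, pass),
                None => (c, "neo4j"),
            },
        };

        return Ok(Connection::Neo4j {
            user: user.to_string(),
            pass: pass.to_string(),
            uri: uri.to_string(),
        });
    }

    Err(format!("unsupported connection string: {s}"))
}

fn main() {
    for s in [
        "json://output/json",
        "neo4j://localhost:7687",
        "neo4j://alice@localhost:7687",
        "neo4j://alice:secret@localhost:7687",
    ] {
        println!("{s} -> {:?}", parse_connection(s));
    }
}
```

All three Neo4j forms resolve to the same structure, so the rest of a program does not need to know which credentials were given explicitly.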

sdutil transfer / sdtransfer

Transfer data from one backend to another

Usage: sdtransfer [OPTIONS] <FROM> <TO>

Arguments:
  <FROM>
          Database to copy from
          
          Should be one of the following
          - json://PATH for a JSON based database
              PATH denotes the path to the folder containing the JSON files.
          - neo4j://USER:PASS@URI for a Neo4j based database
              URI denotes the connection URI (e.g. localhost:7687).
              USER and PASS are user and password for the connection. If not given, neo4j is used for either of them.
              This can be achieved with neo4j://URI or neo4j://USER@URI.

  <TO>
          Database to copy to
          
          See <FROM> for more details

Options:
  -v, --verbose...
          Output verbosity
          
          Specify multiple times to increase verbosity

  -q, --quiet
          Disable all logging output

  -h, --help
          Print help (see a summary with '-h')

Dependencies

~4–5.5MB
~96K SLoC