
nightly pippin

A database for small objects sorted only via classification

1 unstable release

Uses old Rust 2015

0.1.0 May 26, 2016


MPL-2.0 license


Pippin

Pippin is a database inspired by distributed version control systems (notably git). Unlike git, it is designed to store thousands to millions (or more) of small objects in only a few dozen files. Unlike regular databases, it is designed with distributed synchronisation in mind and offers convenient access to objects of a single user-defined type. Pippin does not (currently) have a true index for searching its database, but it does have partitioning to reduce searches to a smaller subset.
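To illustrate how partitioning by classification can narrow a search without an index, here is a minimal sketch. It does not use Pippin's API: the `Note` type, the `classify` function and the `Partitions` struct are all hypothetical names invented for this example, and the in-memory maps merely stand in for partitions stored on disk.

```rust
use std::collections::HashMap;

/// Hypothetical element type (Pippin stores objects of a single user-defined type).
#[derive(Debug, Clone, PartialEq)]
struct Note {
    year: u16,
    text: String,
}

/// Hypothetical classifier: decides which partition an object belongs to.
fn classify(note: &Note) -> u16 {
    note.year
}

/// Toy stand-in for a set of partitions, keyed by classification.
#[derive(Default)]
struct Partitions {
    parts: HashMap<u16, Vec<Note>>,
}

impl Partitions {
    /// Store the object in the partition chosen by the classifier.
    fn insert(&mut self, note: Note) {
        self.parts.entry(classify(&note)).or_default().push(note);
    }

    /// Scan only the partition for `year` rather than every stored object.
    fn find_in_year(&self, year: u16, needle: &str) -> Vec<&Note> {
        self.parts
            .get(&year)
            .map(|notes| notes.iter().filter(|n| n.text.contains(needle)).collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut db = Partitions::default();
    db.insert(Note { year: 2015, text: "first draft".into() });
    db.insert(Note { year: 2016, text: "release draft".into() });
    db.insert(Note { year: 2016, text: "changelog".into() });

    // Only the 2016 partition is searched.
    let hits = db.find_in_year(2016, "draft");
    println!("{:?}", hits);
}
```

The search within a partition is still a linear scan; the classifier only limits which partition has to be scanned.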

For more, see the documentation in src/lib.rs or take a look at the examples.

Change-log

Pippin 0.1.0

Pippin is 'alpha' status.

Partition-oriented usage (i.e. a single 'partition') should have all the basic features and is ready for testing, but the API may change. Perhaps the biggest caveat is that every commit is written to a new file, because how to extend files safely has not yet been worked out.

Repository-oriented usage is still far from ready.

What should work:

  • persistence of data within a single 'partition' via snapshots
  • storing changes via commit logs
  • reconstruction of state from snapshot + logs
  • auto-detecting latest state(s)
  • merging of multiple latest states (may require user-interaction)
  • checksumming and detection of corrupt data
  • recovery of some data when files are missing (though this needs more work)
  • file formats are mostly final except that headers will get extra data and object diffs
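The snapshot, commit-log and checksum items above can be pictured with a short sketch. This is not Pippin's implementation or file format: `State`, `Change`, `Commit`, `make_commit` and `reconstruct` are hypothetical names, and the FNV-1a checksum is only a stand-in for whatever checksum Pippin actually uses.

```rust
use std::collections::BTreeMap;

type Id = u64;
/// Hypothetical partition state: object id -> object contents.
type State = BTreeMap<Id, String>;

#[derive(Debug, Clone)]
enum Change {
    Put(Id, String), // insert or replace an object
    Delete(Id),
}

#[derive(Debug, Clone)]
struct Commit {
    changes: Vec<Change>,
    /// Checksum of the state expected after applying `changes`.
    state_sum: u64,
}

/// FNV-1a over ids and contents (assumption: a stand-in checksum).
fn checksum(state: &State) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for (id, val) in state {
        let id_bytes = id.to_le_bytes();
        for b in id_bytes.iter().chain(val.as_bytes()) {
            h ^= u64::from(*b);
            h = h.wrapping_mul(0x100000001b3);
        }
    }
    h
}

fn apply(state: &mut State, change: &Change) {
    match change {
        Change::Put(id, v) => { state.insert(*id, v.clone()); }
        Change::Delete(id) => { state.remove(id); }
    }
}

/// Build a commit on top of `base`, returning it with the resulting state.
fn make_commit(base: &State, changes: Vec<Change>) -> (Commit, State) {
    let mut next = base.clone();
    for c in &changes {
        apply(&mut next, c);
    }
    let state_sum = checksum(&next);
    (Commit { changes, state_sum }, next)
}

/// Replay `log` on top of `snapshot`; Err(i) reports the first commit
/// whose recorded checksum does not match the rebuilt state.
fn reconstruct(snapshot: &State, log: &[Commit]) -> Result<State, usize> {
    let mut state = snapshot.clone();
    for (i, commit) in log.iter().enumerate() {
        for c in &commit.changes {
            apply(&mut state, c);
        }
        if checksum(&state) != commit.state_sum {
            return Err(i);
        }
    }
    Ok(state)
}

fn main() {
    let mut snapshot = State::new();
    snapshot.insert(1, "a".to_string());
    let (c1, s1) = make_commit(&snapshot, vec![Change::Put(2, "b".to_string())]);
    let (c2, s2) = make_commit(&s1, vec![Change::Delete(1)]);

    let rebuilt = reconstruct(&snapshot, &[c1, c2]).expect("log is intact");
    assert_eq!(rebuilt, s2);
    println!("{:?}", rebuilt);
}
```

Recording a checksum of the resulting state in each commit lets a reader detect a corrupt or missing log entry at the point where replay diverges, rather than silently producing a wrong state.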

What is planned:

  • tracking multiple partitions in a distributed manner via file headers
  • user-specified classifiers
  • (possibly) indexes of some kind
  • reclassification of objects as necessary
  • partially-automated division of "large" partitions via classifiers
  • object diffs (current commits include a full copy of all changed entries)
  • log file extension (currently a new file is used per commit to avoid data loss)
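As a rough idea of what the planned object diffs could save compared with storing a full copy of each changed entry, here is one very simple delta scheme. It is purely illustrative and is not Pippin's planned format, which is not described here: the `Delta` struct and the `diff` and `patch` functions are hypothetical, and the scheme only records the bytes between the common prefix and common suffix of the old and new values.

```rust
/// Hypothetical delta: reuse `prefix` bytes from the start of the old value
/// and `suffix` bytes from its end, with `middle` stored in between.
#[derive(Debug, PartialEq)]
struct Delta {
    prefix: usize,
    suffix: usize,
    middle: Vec<u8>,
}

/// Compute a delta that turns `old` into `new`.
fn diff(old: &[u8], new: &[u8]) -> Delta {
    let prefix = old.iter().zip(new).take_while(|(a, b)| a == b).count();
    // The suffix must not overlap the prefix in either value.
    let max_suffix = old.len().min(new.len()) - prefix;
    let suffix = old
        .iter()
        .rev()
        .zip(new.iter().rev())
        .take(max_suffix)
        .take_while(|(a, b)| a == b)
        .count();
    Delta {
        prefix,
        suffix,
        middle: new[prefix..new.len() - suffix].to_vec(),
    }
}

/// Rebuild the new value from the old value and a delta.
fn patch(old: &[u8], d: &Delta) -> Vec<u8> {
    let mut out = old[..d.prefix].to_vec();
    out.extend_from_slice(&d.middle);
    out.extend_from_slice(&old[old.len() - d.suffix..]);
    out
}

fn main() {
    let old = b"hello world";
    let new = b"hello brave world";
    let d = diff(old, new);
    // Only the inserted bytes are stored, not the whole new value.
    println!("stored {} of {} bytes", d.middle.len(), new.len());
    assert_eq!(patch(old, &d), new.to_vec());
}
```

A real format would also need to handle several separate edits within one object; this sketch handles only a single changed region.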

Doc

The doc directory contains some file-format documentation and various notes planning Pippin's development.

Tickets were originally stored in files. Several "tags" are still in use; where applicable these are mentioned in tickets and can be used to find relevant bits of code. All of these can be found with grep:

egrep -IR "#00[0-9]{2}" doc/ src/

Building, running, testing

Pippin uses Cargo. A few example commands:

cargo test
cargo build --release
cargo run --example pippincmd -- -h
cargo help run
cargo doc && open target/doc/pippin/index.html

Generated binaries can be found in the target directory.

Licence

Pippin is licensed under the Mozilla Public License, version 2.0. A copy of this licence can be found in the LICENSE-MPL2.txt file or obtained at http://mozilla.org/MPL/2.0/ .

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the MPL-2.0 license, shall be licensed as above, without any additional terms or conditions.
