acid-store

A library for secure, deduplicated, transactional, and verifiable data storage

10 releases (6 breaking)

0.7.1 Jun 13, 2021
0.7.0 Jun 13, 2021
0.6.0 Oct 26, 2020
0.5.0 Apr 30, 2020
0.1.1 Feb 9, 2020

#154 in Filesystem

54 downloads per month

Apache-2.0 and maybe LGPL-2.1

365KB
5.5K SLoC

acid-store is a library for secure, deduplicated, transactional, and verifiable data storage.

This library provides high-level abstractions for data storage over a number of storage backends. The goal is to decouple how you access your data from where you store it. You can access your data as an object store, a virtual file system, a persistent collection, or a content-addressable store, regardless of where the data is stored. Out of the box, this library supports the local file system, SQLite, Redis, Amazon S3, SFTP, and many cloud providers (via rclone) as storage backends. Storage backends are easy to implement, and this library builds on top of them to provide features like encryption, compression, deduplication, locking, and atomic transactions.

For details and examples, see the documentation.

⚠️ This project is still immature and needs more testing. Testers are always appreciated, but please remember to back up your data! Also keep in mind that this code has not been audited for security. All the usual disclaimers apply.

Features

  • Optional encryption of all data and metadata using XChaCha20-Poly1305 and Argon2, powered by libsodium
  • Optional compression using LZ4
  • Optional content-based deduplication using the ZPAQ chunking algorithm
  • Supports packing data into fixed-size blocks to avoid metadata leakage
  • Integrity checking of data and metadata using checksums and (if encryption is enabled) AEAD
  • Transactional operations providing atomicity, consistency, isolation, and durability (ACID)
  • Copy-on-write semantics
  • New storage backends are easy to implement
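As an illustration of the content-based deduplication listed above, here is a minimal sketch using only the standard library. It is not acid-store's implementation: a simple shift-and-add rolling hash stands in for the ZPAQ chunking algorithm, and the non-cryptographic DefaultHasher stands in for a real content hash. The names split_chunks and ChunkStore and the size parameters are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical chunk size limits and boundary mask (top 9 bits of the hash).
const MIN_LEN: usize = 64;
const MAX_LEN: usize = 4096;
const MASK: u32 = 0xFF80_0000;

/// Split `data` into variable-size chunks whose boundaries depend on content.
/// A simple rolling hash stands in for the ZPAQ chunker here.
fn split_chunks(data: &[u8], min_len: usize, max_len: usize, mask: u32) -> Vec<&[u8]> {
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut hash: u32 = 0;
    for (i, &byte) in data.iter().enumerate() {
        // Each byte's contribution is shifted out after 32 steps, so only the
        // most recent 32 bytes influence `hash`.
        hash = (hash << 1).wrapping_add((byte as u32).wrapping_mul(0x9E37_79B1));
        let len = i + 1 - start;
        if (len >= min_len && (hash & mask) == 0) || len >= max_len {
            chunks.push(&data[start..=i]);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]);
    }
    chunks
}

/// Stores each distinct chunk once, keyed by a hash of its contents.
#[derive(Default)]
struct ChunkStore {
    chunks: HashMap<u64, Vec<u8>>,
}

impl ChunkStore {
    /// Store `data` and return the chunk IDs needed to rebuild it.
    fn write(&mut self, data: &[u8]) -> Vec<u64> {
        let mut ids = Vec::new();
        for chunk in split_chunks(data, MIN_LEN, MAX_LEN, MASK) {
            let mut hasher = DefaultHasher::new();
            chunk.hash(&mut hasher);
            let id = hasher.finish();
            // Only chunks that have not been seen before take up new space.
            self.chunks.entry(id).or_insert_with(|| chunk.to_vec());
            ids.push(id);
        }
        ids
    }

    /// Rebuild data from a list of chunk IDs returned by `write`.
    fn read(&self, ids: &[u64]) -> Vec<u8> {
        let mut out = Vec::new();
        for id in ids {
            out.extend_from_slice(&self.chunks[id]);
        }
        out
    }

    /// Number of distinct chunks currently stored.
    fn chunk_count(&self) -> usize {
        self.chunks.len()
    }
}
```

Writing the same data twice returns the same IDs and stores no new chunks. Because boundaries are chosen from the bytes themselves, an edit near the start of a file generally disturbs only the chunks around the edit, whereas fixed-size chunking shifts every later boundary.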

Abstractions

This library provides the following abstractions for data storage.

  • An object store which maps keys to seekable binary blobs
  • A virtual file system which supports file metadata, special files, and importing and exporting files to the local OS file system
  • A persistent, heterogeneous, map-like collection
  • An object store with support for content versioning
  • A content-addressable store which allows accessing data by its cryptographic hash
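To make the first abstraction concrete, the following sketch shows what "keys mapped to seekable binary blobs" can look like using only the standard library. It is an assumption-laden illustration, not acid-store's API: ObjectStore, object, and overwrite_demo are hypothetical names, and the blobs live in memory rather than in a storage backend.

```rust
use std::collections::HashMap;
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Hypothetical in-memory object store: each key maps to a growable byte blob.
#[derive(Default)]
struct ObjectStore {
    objects: HashMap<String, Vec<u8>>,
}

impl ObjectStore {
    /// Open the object at `key`, creating it if needed, as a handle that
    /// supports reading, writing, and seeking.
    fn object(&mut self, key: &str) -> Cursor<&mut Vec<u8>> {
        Cursor::new(self.objects.entry(key.to_string()).or_default())
    }

    /// Return whether an object exists at `key`.
    fn contains(&self, key: &str) -> bool {
        self.objects.contains_key(key)
    }

    /// Remove the object at `key`, returning whether it existed.
    fn remove(&mut self, key: &str) -> bool {
        self.objects.remove(key).is_some()
    }
}

/// Write a blob, seek into the middle, overwrite part of it, and read it back.
fn overwrite_demo() -> io::Result<String> {
    let mut store = ObjectStore::default();
    let mut object = store.object("notes");
    object.write_all(b"hello world")?;
    object.seek(SeekFrom::Start(6))?;
    object.write_all(b"there")?;
    object.seek(SeekFrom::Start(0))?;
    let mut text = String::new();
    object.read_to_string(&mut text)?;
    Ok(text)
}
```

The point of the seekable handle is that callers can treat an object like a file: they can overwrite a range in place or read from an offset without loading or rewriting the whole blob.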

Backends

This library provides the following storage backends out of the box.

  • Local file system directory
  • SQLite
  • Redis
  • Amazon S3
  • SFTP
  • Cloud storage via rclone
  • In-memory
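The idea of decoupling data access from storage location can be sketched as a small trait that every backend implements, with higher-level code written only against that trait. The trait below is a hypothetical simplification for illustration and should not be read as acid-store's actual backend interface; BlockStore, MemoryStore, and copy_all are made-up names.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical backend interface: a flat namespace of opaque blocks.
trait BlockStore {
    fn write_block(&mut self, id: u64, data: &[u8]) -> io::Result<()>;
    fn read_block(&self, id: u64) -> io::Result<Option<Vec<u8>>>;
    fn remove_block(&mut self, id: u64) -> io::Result<()>;
    fn list_blocks(&self) -> io::Result<Vec<u64>>;
}

/// The simplest possible backend: blocks held in a hash map.
#[derive(Default)]
struct MemoryStore {
    blocks: HashMap<u64, Vec<u8>>,
}

impl BlockStore for MemoryStore {
    fn write_block(&mut self, id: u64, data: &[u8]) -> io::Result<()> {
        self.blocks.insert(id, data.to_vec());
        Ok(())
    }

    fn read_block(&self, id: u64) -> io::Result<Option<Vec<u8>>> {
        Ok(self.blocks.get(&id).cloned())
    }

    fn remove_block(&mut self, id: u64) -> io::Result<()> {
        self.blocks.remove(&id);
        Ok(())
    }

    fn list_blocks(&self) -> io::Result<Vec<u64>> {
        Ok(self.blocks.keys().copied().collect())
    }
}

/// Copy every block from one backend to another and return how many were
/// copied. This function knows nothing about where either backend keeps data.
fn copy_all(src: &impl BlockStore, dst: &mut impl BlockStore) -> io::Result<usize> {
    let mut copied = 0;
    for id in src.list_blocks()? {
        if let Some(data) = src.read_block(id)? {
            dst.write_block(id, &data)?;
            copied += 1;
        }
    }
    Ok(copied)
}
```

With an interface this narrow, a directory, a database table, or a remote bucket only has to provide four operations, while features such as encryption or deduplication can be layered on top once and reused across all backends.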

Benchmarks

The following results show read and write speeds for an in-memory repository with various configurations. An in-memory repository is used to make benchmark results more consistent between runs and between machines. To run the benchmarks yourself, use cargo bench --all-features.

Specs

Spec       Value
Processor  Ryzen 5 1600X
Memory     32 GB (3200 MHz)
OS         Linux 5.8

Results

Chunking  Packing  Encryption          Compression  Read        Write
Fixed     None     None                None         2410 MiB/s  1360 MiB/s
ZPAQ      None     None                None         2210 MiB/s  470 MiB/s
Fixed     Fixed    XChaCha20-Poly1305  None         805 MiB/s   565 MiB/s
ZPAQ      Fixed    XChaCha20-Poly1305  None         805 MiB/s   300 MiB/s

Dependencies

~5–14MB
~280K SLoC