#storage #filesystem #data #transactional #security

acid-store

A library for secure, deduplicated, transactional, and verifiable data storage

16 releases (10 breaking)

new 0.11.1 Nov 23, 2021
0.10.0 Aug 7, 2021
0.9.0 Jul 31, 2021
0.6.0 Oct 26, 2020
0.3.0 Feb 29, 2020

#132 in Filesystem


60 downloads per month
Used in aiml_ported

Apache-2.0

555KB
9K SLoC


acid-store is a library for secure, deduplicated, transactional, and verifiable data storage.

This library provides high-level abstractions for data storage over a number of storage backends. The goal is to decouple how you access your data from where you store it. You can access your data as an object store, a virtual file system, a persistent collection, or a content-addressable store, regardless of where the data is stored. Out of the box, this library supports the local file system, SQLite, Redis, Amazon S3, SFTP, and many cloud providers (via rclone) as storage backends. Storage backends are easy to implement, and this library builds on top of them to provide features like encryption, compression, deduplication, locking, and atomic transactions.

For details and examples, see the documentation.

⚠️ This project is still experimental; it experiences frequent breaking API changes and requires more testing. This project is not ready for use in production environments. Testers are always appreciated, but please remember to back up your data! Also keep in mind that this code has not been audited for security.

Features

  • Optional encryption of all data and metadata using XChaCha20-Poly1305 and Argon2, powered by libsodium
  • Optional compression using LZ4
  • Optional content-based deduplication using the ZPAQ chunking algorithm
  • Supports packing data into fixed-size blocks to avoid metadata leakage
  • Integrity checking of data and metadata using checksums and (if encryption is enabled) AEAD
  • Transactional operations providing atomicity, consistency, isolation, and durability (ACID)
  • Two-phase locking protects against concurrent access from multiple clients
  • Copy-on-write semantics
  • New storage backends are easy to implement
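
The transactional and copy-on-write behavior listed above can be illustrated with a small, standard-library-only sketch. This is not acid-store's API or implementation: `Repo`, `commit`, and `rollback` are hypothetical names chosen for illustration, and the sketch assumes a purely in-memory state, so it shows atomic rollback but not durability or locking.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical in-memory repository with commit/rollback semantics.
#[derive(Default)]
struct Repo {
    /// State as of the last commit.
    committed: HashMap<String, Rc<[u8]>>,
    /// Uncommitted working state that reads and writes operate on.
    working: HashMap<String, Rc<[u8]>>,
}

impl Repo {
    /// Copy-on-write: a write installs a new buffer instead of mutating
    /// the old one, so the committed state never changes underneath us.
    fn insert(&mut self, key: &str, value: &[u8]) {
        self.working.insert(key.to_string(), Rc::from(value));
    }

    fn remove(&mut self, key: &str) {
        self.working.remove(key);
    }

    fn get(&self, key: &str) -> Option<&[u8]> {
        self.working.get(key).map(|value| &value[..])
    }

    /// Makes the working state the new committed state.
    /// Cloning the map only bumps reference counts; no blob is copied.
    fn commit(&mut self) {
        self.committed = self.working.clone();
    }

    /// Discards every change made since the last commit.
    fn rollback(&mut self) {
        self.working = self.committed.clone();
    }
}
```

In this sketch, inserting a key, calling `commit`, overwriting the key, and then calling `rollback` returns the key to its committed value. A real repository would additionally have to make the commit itself atomic and durable on the storage backend.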

Abstractions

This library provides the following abstractions for data storage.

  • An object store which maps keys to seekable binary blobs
  • A virtual file system which supports file metadata, special files, sparse files, hard links, importing and exporting files to the local OS file system, and being mounted via FUSE
  • A persistent, heterogeneous, map-like collection
  • An object store with support for content versioning
  • A content-addressable store which lets you access data by its cryptographic hash
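
As an illustration of the content-addressable abstraction, here is a minimal sketch using only the standard library. It is not acid-store's API: `ContentStore`, `put`, `get`, and `verify` are hypothetical names, and `DefaultHasher` is only a stand-in, since it is not a cryptographic hash and a real store would use one.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in digest. NOT cryptographic; used here only to avoid
/// external dependencies in the sketch.
fn digest(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical store in which the key of each object is the hash of
/// its contents.
#[derive(Default)]
struct ContentStore {
    objects: HashMap<u64, Vec<u8>>,
}

impl ContentStore {
    /// Stores `data` and returns its content hash. Identical content
    /// maps to the same key, so it is stored only once.
    fn put(&mut self, data: &[u8]) -> u64 {
        let id = digest(data);
        self.objects.entry(id).or_insert_with(|| data.to_vec());
        id
    }

    /// Looks up an object by its content hash.
    fn get(&self, id: u64) -> Option<&[u8]> {
        self.objects.get(&id).map(Vec::as_slice)
    }

    /// Re-hashes the stored bytes and checks that they still match the key.
    fn verify(&self, id: u64) -> bool {
        self.get(id).map_or(false, |data| digest(data) == id)
    }
}
```

Because the key is derived from the content, storing the same bytes twice yields the same key and one stored copy, and `verify` can detect stored bytes that no longer match their key.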

Backends

This library provides the following storage backends out of the box.

  • Local file system directory
  • SQLite
  • Redis
  • Amazon S3
  • SFTP
  • Cloud storage via rclone
  • In-Memory
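
To make the separation between backends and abstractions concrete, the following sketch layers a key-to-blob store over a small backend trait. It assumes a hypothetical `BlockBackend` trait and `ObjectStore` type that are not taken from acid-store's API; the library's real backend interface may look different.

```rust
use std::collections::HashMap;

/// Hypothetical minimal backend interface: opaque blocks addressed by id.
trait BlockBackend {
    fn write_block(&mut self, id: u64, data: &[u8]);
    fn read_block(&self, id: u64) -> Option<Vec<u8>>;
    fn remove_block(&mut self, id: u64);
}

/// In-memory backend. A directory-based or database-based backend would
/// implement the same three methods.
#[derive(Default)]
struct MemoryBackend {
    blocks: HashMap<u64, Vec<u8>>,
}

impl BlockBackend for MemoryBackend {
    fn write_block(&mut self, id: u64, data: &[u8]) {
        self.blocks.insert(id, data.to_vec());
    }
    fn read_block(&self, id: u64) -> Option<Vec<u8>> {
        self.blocks.get(&id).cloned()
    }
    fn remove_block(&mut self, id: u64) {
        self.blocks.remove(&id);
    }
}

/// Key-to-blob store that works with any backend.
struct ObjectStore<B: BlockBackend> {
    backend: B,
    index: HashMap<String, Vec<u64>>, // key -> ordered block ids
    next_id: u64,
    block_size: usize,
}

impl<B: BlockBackend> ObjectStore<B> {
    fn new(backend: B, block_size: usize) -> Self {
        assert!(block_size > 0, "block size must be nonzero");
        ObjectStore { backend, index: HashMap::new(), next_id: 0, block_size }
    }

    /// Splits `data` into blocks of at most `block_size` bytes and
    /// writes them to the backend, replacing any previous object.
    fn insert(&mut self, key: &str, data: &[u8]) {
        self.remove(key);
        let mut ids = Vec::new();
        for block in data.chunks(self.block_size) {
            let id = self.next_id;
            self.next_id += 1;
            self.backend.write_block(id, block);
            ids.push(id);
        }
        self.index.insert(key.to_string(), ids);
    }

    /// Reassembles an object from its blocks, or returns `None` if the
    /// key is unknown or a block is missing.
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        let mut out = Vec::new();
        for id in self.index.get(key)? {
            out.extend(self.backend.read_block(*id)?);
        }
        Some(out)
    }

    /// Removes an object and all of its blocks.
    fn remove(&mut self, key: &str) {
        if let Some(ids) = self.index.remove(key) {
            for id in ids {
                self.backend.remove_block(id);
            }
        }
    }
}
```

In this design, `ObjectStore` never knows where blocks live, so swapping `MemoryBackend` for another implementation of the trait would not change the code that reads and writes objects.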

Benchmarks

The following results show read and write speeds for an in-memory repository with various configurations. An in-memory repository is used to make benchmark results more consistent between runs and between machines. You can run the benchmarks yourself with cargo bench --all-features.

Specs

| Spec      | Value            |
|-----------|------------------|
| Processor | Ryzen 5 1600X    |
| Memory    | 32 GB (3200 MHz) |
| OS        | Linux 5.11       |

Results

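| Chunking | Packing | Encryption         | Compression | Read       | Write      |
|----------|---------|--------------------|-------------|------------|------------|
| Fixed    | None    | None               | None        | 6090 MiB/s | 1920 MiB/s |
| ZPAQ     | None    | None               | None        | 2670 MiB/s | 520 MiB/s  |
| Fixed    | Fixed   | XChaCha20-Poly1305 | None        | 870 MiB/s  | 610 MiB/s  |
| ZPAQ     | Fixed   | XChaCha20-Poly1305 | None        | 840 MiB/s  | 300 MiB/s  |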

Dependencies

~3–14MB
~287K SLoC
