25 releases
Version | Released |
---|---|
0.3.14 | Jun 20, 2023 |
0.3.8 | May 29, 2023 |
0.3.5 | Mar 30, 2023 |
0.2.7 | Dec 30, 2022 |
0.1.0 | Jul 26, 2022 |
#6 in #sha256
145 downloads per month
125KB
2.5K SLoC
blobnet
A configurable, low-latency blob storage server for content-addressed data.
See the API documentation for more information.
Installation
cargo install blobnet # install server CLI / binary
cargo add blobnet # add to your project
Authors
This library is created by the team behind Modal.
- Eric Zhang (@ekzhang1) – Modal
- Jonathon Belotti (@jonobelotti_IO) – Modal
lib.rs:
Blobnet
A configurable, low-latency blob storage server for content-addressed data.
This acts as a non-volatile, over-the-network content cache. Clients can add binary blobs (fixed-size byte vectors) to the cache, and the data is indexed by its SHA-256 hash. Any blob can be retrieved given its hash and the range of bytes to read.
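To make the content-addressing model concrete, here is a minimal in-memory sketch. It is not blobnet's implementation: `ToyStore` and `content_key` are hypothetical names, and the standard library's `DefaultHasher` stands in for SHA-256 only so that the example runs without external crates. The byte range is assumed to be a half-open `(start, end)` pair.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical toy store: blobs are keyed by a hash of their own content.
/// Assumption: `DefaultHasher` is only a stand-in for SHA-256 here.
#[derive(Default)]
struct ToyStore {
    blobs: HashMap<u64, Vec<u8>>,
}

/// Derive the key from the bytes alone, so equal content gets an equal key.
fn content_key(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

impl ToyStore {
    /// Store a blob and return its content-derived key.
    fn put(&mut self, data: &[u8]) -> u64 {
        let key = content_key(data);
        self.blobs.entry(key).or_insert_with(|| data.to_vec());
        key
    }

    /// Read a blob, optionally restricted to a half-open byte range.
    /// Returns `None` if the key is unknown or the range is out of bounds.
    fn get(&self, key: u64, range: Option<(usize, usize)>) -> Option<&[u8]> {
        let blob = self.blobs.get(&key)?;
        match range {
            None => Some(blob.as_slice()),
            Some((start, end)) => blob.get(start..end),
        }
    }
}

fn main() {
    let mut store = ToyStore::default();
    let key = store.put(b"hello blobnet world!");
    // Identical content maps to the same key, so it is stored only once.
    assert_eq!(store.put(b"hello blobnet world!"), key);
    assert_eq!(store.blobs.len(), 1);
    // Read the first 10 bytes.
    assert_eq!(store.get(key, Some((0, 10))), Some(&b"hello blob"[..]));
}
```

Because the key depends only on the bytes, writing the same blob twice is idempotent, which is what makes a content-addressed store safe to use as a shared cache.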
Data stored in blobnet is locally cached and durable.
Providers
The core of blobnet is the Provider trait. This trait defines the interface shared by all blobnet instances. It is used like so:
use std::io::Cursor;
use blobnet::ReadStream;
use blobnet::provider::{self, Provider};
// Create a new provider.
let provider = provider::Memory::new();
// Insert data, returning its hash.
let data: ReadStream = Box::pin(b"hello blobnet world!" as &[u8]);
let hash = provider.put(data).await?;
// Check if a blob exists and return its size.
let size = provider.head(&hash).await?;
assert_eq!(size, 20);
// Read the content as a binary stream.
provider.get(&hash, None).await?;
provider.get(&hash, Some((0, 10))).await?; // Requests the first 10 bytes.
You can combine these operations in any order, and they can run in parallel, since they take shared &self receivers. Each operation has the same semantics regardless of provider.
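Why `&self` receivers permit parallel use can be illustrated with a small synchronous sketch. This is not blobnet's `Provider` trait, which is async. `ToyProvider`, its `u64` keys, and `fill_concurrently` are hypothetical, and a `Mutex` supplies the interior mutability that lets `put` take `&self`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical in-memory provider. Every method takes `&self`, so one
/// instance can be shared across threads without exclusive borrows.
#[derive(Default)]
struct ToyProvider {
    blobs: Mutex<HashMap<u64, Vec<u8>>>,
}

impl ToyProvider {
    /// Insert data under a caller-chosen key (a real provider would hash it).
    fn put(&self, key: u64, data: Vec<u8>) {
        self.blobs.lock().unwrap().insert(key, data);
    }

    /// Return the blob's size in bytes if it exists.
    fn head(&self, key: u64) -> Option<usize> {
        self.blobs.lock().unwrap().get(&key).map(|b| b.len())
    }
}

/// Run `n` concurrent writers against one shared provider.
fn fill_concurrently(provider: &ToyProvider, n: u64) {
    thread::scope(|s| {
        for key in 0..n {
            // Each thread only needs a shared reference to the provider.
            s.spawn(move || provider.put(key, vec![0u8; key as usize]));
        }
    });
}

fn main() {
    let provider = ToyProvider::default();
    fill_concurrently(&provider, 8);
    assert_eq!(provider.head(5), Some(5));
    assert_eq!(provider.head(99), None);
}
```

If `put` took `&mut self` instead, the scoped threads could not all borrow the provider at once, and callers would have to serialize access themselves.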
The Provider trait is public, and several providers are offered, supporting storage in a local directory, network file system, or in AWS S3.
Network Server
Blobnet can also run as a server that sends data over the network, serving responses to blob operations over the HTTP/2 protocol. For example, you can run a blobnet server on a local machine with:
export BLOBNET_SECRET=my-secret
blobnet --source localdir:/tmp/blobnet --port 7609
This specifies the provider using a string syntax for the --source flag.
You can connect to the server as a provider in another process:
use blobnet::{client::FileClient, provider};
let client = FileClient::new_http("http://localhost:7609", "my-secret");
let provider = provider::Remote::new(client);
Why would you want to share a blobnet server over the network? One use case is shared caches.
Caching
Blobnet supports two-tiered caching of data with the Cached provider. This breaks up files into chunks with a configurable page size, storing them in a local cache directory and an in-memory page cache. By adding a cache in non-volatile storage, we can speed up file operations by multiple orders of magnitude compared to a network file system. For example:
use blobnet::provider;
// Create a new provider targeting a local NFS mount.
let provider = provider::LocalDir::new("/mnt/nfs");
// Add a caching layer on top of the provider, with 2 MiB page size.
let provider = provider::Cached::new(provider, "/tmp/blobnet-cache", 1 << 21);
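To see what a page size means for a ranged read, the sketch below computes which fixed-size pages a byte range touches. It illustrates the chunking arithmetic only, assuming half-open ranges and pages aligned at multiples of the page size. `pages_for_range` is a hypothetical helper, not part of blobnet's API.

```rust
/// Page size used in the example above: 2 MiB.
const PAGE_SIZE: u64 = 1 << 21;

/// Return the indices of the pages overlapping the half-open byte range
/// `[start, end)`. An empty range touches no pages.
fn pages_for_range(start: u64, end: u64, page_size: u64) -> std::ops::Range<u64> {
    if start >= end {
        return 0..0;
    }
    let first = start / page_size;
    // `end` is exclusive, so the last byte read is `end - 1`.
    let last = (end - 1) / page_size;
    first..last + 1
}

fn main() {
    // A 10-byte read at the start of a blob needs only page 0.
    assert_eq!(pages_for_range(0, 10, PAGE_SIZE), 0..1);
    // A read straddling the 2 MiB boundary needs pages 0 and 1.
    assert_eq!(pages_for_range(PAGE_SIZE - 1, PAGE_SIZE + 1, PAGE_SIZE), 0..2);
}
```

A larger page size means fewer, bigger fetches from the underlying provider, while a smaller one wastes less bandwidth on small ranged reads.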
Caching is also useful for accessing remote blobnet servers. Caches compose well: you can add more tiers to the dataflow, improving system efficiency and reducing network load.
use blobnet::{client::FileClient, provider};
// Create a new provider fetching content over the network.
let client = FileClient::new_http("http://localhost:7609", "my-secret");
let provider = provider::Remote::new(client);
// Add a caching layer on top of the provider, with 2 MiB page size.
let provider = provider::Cached::new(provider, "/tmp/blobnet-cache", 1 << 21);
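The general idea of a cache tier in front of a slower source can be sketched as a read-through lookup. This is a simplified, single-tier illustration and not how blobnet's `Cached` provider is implemented: `ReadThrough` is a hypothetical type, keys are plain `u64` values, and there is no paging or eviction.

```rust
use std::collections::HashMap;

/// Hypothetical read-through cache: check the fast tier first, fall back to
/// the slow source on a miss, and remember the result for next time.
struct ReadThrough<F> {
    cache: HashMap<u64, Vec<u8>>,
    source: F,
    misses: usize,
}

impl<F: FnMut(u64) -> Option<Vec<u8>>> ReadThrough<F> {
    fn new(source: F) -> Self {
        Self { cache: HashMap::new(), source, misses: 0 }
    }

    /// Return the blob for `key`, consulting the source only on a miss.
    fn get(&mut self, key: u64) -> Option<Vec<u8>> {
        if let Some(hit) = self.cache.get(&key) {
            return Some(hit.clone());
        }
        self.misses += 1;
        let data = (self.source)(key)?;
        self.cache.insert(key, data.clone());
        Some(data)
    }
}

fn main() {
    // Stand-in for a slow remote source that can answer any key.
    let mut cache = ReadThrough::new(|key: u64| Some(vec![key as u8; 4]));
    assert_eq!(cache.get(7), Some(vec![7u8; 4])); // miss: fetched from source
    assert_eq!(cache.get(7), Some(vec![7u8; 4])); // hit: served from cache
    assert_eq!(cache.misses, 1);
}
```

Since the source is just a closure, it could itself be backed by another cache, which is one way to picture stacking additional tiers.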
Together these abstractions allow you to create a configurable, very low-latency content-addressed storage system.
Dependencies
~70MB
~1.5M SLoC