1 unstable release
0.1.0 | Aug 18, 2021 |
---|
#2305 in Database interfaces
Used in assemblage_view
220KB
3.5K
SLoC
Distributed Document/Graph DB for Connected Pages
AssemblageDB is a transactional high-level database for connected webs of pages, notes, texts and other media. Think of it like a personal web, but easily editable, with more connections and better navigation than the web. It is high-level in the sense that it defines a document model similar to HTML but vastly simpler and with graph-like 2-way links instead of tree-like 1-way jump links. The data model is both:
- document-oriented: supports nested documents without a fixed schema
- graph-based: documents can have multiple parents and form a directed, possibly cyclic graph
Features
- versioned: old values remain accessible until merged
- transactional: snapshots are isolated through MVCC
- storage-agnostic: supports native (files) and wasm (IndexedDB) targets
- indexed: maintains an automatic index for similarity/overlap search
- distributed: nodes can be published/subscribed as remote broadcasts
Data Model
Nodes in an AssemblageDB can be either atomic (a line of text for example) or
nested, either in a list containing multiple children or a styled node
containing just a single child. List
nodes have a layout, which controls how
children are laid out in relation to each other, while Styled
nodes have zero
or more block styles or span styles, that control how their (possibly
nested) child is displayed. Examples for layouts and styles are:
Layout::Chain
: lays out children as a consecutive chain of inline spans. With 2 text children "foo" and "bar", the chain would be displayed as "foobar".Layout::Page
: lays out children as blocks on a page, separated vertically by a new line. With 2 text children "foo" and "bar", the page would be displayed as 2 lines, the first line containing "foo", the second line containing "bar".SpanStyle::Italic
: A span (inline) style that would display the child "foo" as "foo"BlockStyle::Heading
: A block style that would display the child "foo" in its own block with a larger font size.
A node is always either a span or a block. Text nodes are considered to be
spans by default and remain spans if styled using span styles such as
SpanStyle::Italic
or SpanStyle::Bold
. However, a single block style (such as
BlockStyle::Heading
) in a set of styles is always "contagious" and turns a
text node "foo" styled with both SpanStyle::Italic
and BlockStyle::Heading
into a block. Similarly, layouts control whether a list is displayed as a span
or a block: Layout::Chain
turns a list into a span, while Layout::Page
turns
a list into a sequence of blocks.
Obligatory Warning
AssemblageDB is still in an experimental stage. It should go without saying that it is not a battle-tested production-ready database and could at any time eat all of your data. If you need to persist production data, use a real database instead.
Example
let storage = FileStorage::open("db1").await?;
let db = Db::open(storage).await?;
// Nodes support layouts and styles, for example as a page of blocks...
let page1_id = tx!(|db| {
db.add(Node::list(
Layout::Page,
vec![
Node::styled(BlockStyle::Heading, Node::text("A Heading!")),
Node::text("This is the first paragraph."),
Node::text("Unsurprisingly this is the second one..."),
],
))
.await?
});
// ...or as inline spans that are chained together:
let page2_id = tx!(|db| {
db.add(Node::list(
Layout::Page,
vec![Node::list(
Layout::Chain,
vec![
Node::text("And this is the "),
Node::styled(SpanStyle::Italic, Node::text("last")),
Node::text(" paragraph..."),
],
)],
))
.await?
});
// Documents can form a graph, with nodes keeping track of all parents:
tx!(|db| {
db.add(Node::list(Layout::Page, vec![page1_id, page1_id])).await?;
assert_eq!(db.parents(page1_id).await?.len(), 2);
assert_eq!(db.parents(page2_id).await?.len(), 0);
});
// All text is indexed, the DB supports "overlap" similarity search:
tx!(|db| {
let paragraph1_id = db.get(page1_id).await?.unwrap().children()[1].id()?;
let paragraph3_id = db.get(page2_id).await?.unwrap().children()[0].id()?;
let overlaps_of_p1 = db.overlaps(paragraph1_id).await?;
assert_eq!(overlaps_of_p1.len(), 1);
assert_eq!(overlaps_of_p1[0].id, paragraph3_id);
assert!(overlaps_of_p1[0].score() > 0.5);
});
// Nodes (with all their descendants) can be published globally...
let broadcast = tx!(|db| { db.publish_broadcast(page1_id).await? });
// ...and the broadcast can then be fetched in another remote DB:
let other_storage = FileStorage::open("db2").await?;
let db2 = Db::open(other_storage).await?;
tx!(|db2| {
let paragraph_id = db2
.add(Node::text("This is the first paragraph, right?"))
.await?;
db2.add(Node::list(Layout::Page, vec![paragraph_id]))
.await?;
// The DB is empty except for this paragraph, so no overlaps:
assert_eq!(db2.overlaps(paragraph_id).await?.len(), 0);
// But when the broadcast paragraph is fetched, there is an overlap:
db2.fetch_broadcast(&broadcast.broadcast_id).await?;
assert_eq!(db2.overlaps(paragraph_id).await?.len(), 1);
});
Dependencies
~3–17MB
~218K SLoC