14 releases (breaking)
new 0.34.0 | Jan 10, 2025 |
---|---|
0.33.0 | Dec 24, 2024 |
0.32.0 | Dec 11, 2024 |
0.31.0 | Nov 29, 2024 |
0.2.0 | Sep 6, 2023 |
#190 in Algorithms
513 downloads per month
Used in 2 crates
4.5MB
72K
SLoC
DBSP
Database Stream Processor (DBSP) is a computational engine for continuous analysis of changing data. With DBSP, a programmer writes code in terms of computations on a complete data set, but DBSP implements it incrementally, meaning that changes to the data set run in time proportional to the size of the change rather than the size of the data set. This is a major advantage for applications that work with large data sets that change frequently in small ways.
The DBSP computational engine is a component of the Feldera Continuous Analytics Platform.
Resources
Learning Materials
-
The tutorial walks through a series of simple examples.
-
The
circuit
module documentation provides a catalog of computational elements. -
More sophisticated uses can be found as examples and benchmarks
Documentation
Feedback
Some ways that you can give us your feedback:
-
Join our community Slack.
-
Submit a pull request. DBSP uses the Developer Certificate of Origin (DCO) (see the contribution guidelines).
Example
use dbsp::{operator::FilterMap, IndexedZSet, OrdIndexedZSet, OutputHandle, Runtime};
type Node = usize;
type Weight = isize;
// Creates a series of graph edges, then counts the number of times each source
// node appears, then counts the number of times each count appears.
fn main() {
let edges = 100; // Number of initial edges in the graph.
let sources = 13; // Number of source nodes in the graph.
let extra = 5; // Number of extra edges added later to the graph.
let threads = 2; // Number of threads.
let (mut dbsp, (hedges, degrees, distribution)) = Runtime::init_circuit(threads, |circuit| {
let (edges, hedges) = circuit.add_input_zset::<(Node, Node), Weight>();
// Count the number of edges with each node as its source (each node's
// out-degree).
let degrees = edges.map(|(src, _dst)| *src).weighted_count();
// Count the number of nodes with each out-degree.
let distribution = degrees.map(|(_src, count)| *count).weighted_count();
Ok((hedges, degrees.output(), distribution.output()))
})
.unwrap();
// Add some initial edges and print the results.
for i in 0..edges {
hedges.push((i % sources, i % 7), 1);
}
dbsp.step().unwrap();
println!("Initialization:");
print_changes(°rees, &distribution);
// Add a few more nodes and print the changes.
for i in 0..extra {
hedges.push((i % sources, i % 9), 1);
}
dbsp.step().unwrap();
println!("Changes:");
print_changes(°rees, &distribution);
dbsp.kill().unwrap();
}
fn print_changes(
degrees: &OutputHandle<OrdIndexedZSet<Node, isize, Weight>>,
distribution: &OutputHandle<OrdIndexedZSet<isize, isize, Weight>>,
) {
for (src, outdegree, weight) in degrees.consolidate().iter() {
println!(" {weight:+}: Node {src} has out-degree {outdegree}");
}
for (outdegree, count, weight) in distribution.consolidate().iter() {
println!(" {weight:+}: {count} nodes have out-degree {outdegree}");
}
}
Dependencies
~39–54MB
~1M SLoC