14 releases (breaking)

new 0.34.0 Jan 10, 2025
0.33.0 Dec 24, 2024
0.32.0 Dec 11, 2024
0.31.0 Nov 29, 2024
0.2.0 Sep 6, 2023

#190 in Algorithms

Download history 170/week @ 2024-09-18 135/week @ 2024-09-25 62/week @ 2024-10-02 15/week @ 2024-10-09 206/week @ 2024-10-16 5/week @ 2024-10-23 146/week @ 2024-10-30 25/week @ 2024-11-06 127/week @ 2024-11-13 15/week @ 2024-11-20 156/week @ 2024-11-27 187/week @ 2024-12-04 194/week @ 2024-12-11 162/week @ 2024-12-18 100/week @ 2024-12-25 40/week @ 2025-01-01

513 downloads per month
Used in 2 crates

MIT/Apache

4.5MB
72K SLoC

DBSP

Database Stream Processor (DBSP) is a computational engine for continuous analysis of changing data. With DBSP, a programmer writes code in terms of computations on a complete data set, but DBSP implements it incrementally, meaning that changes to the data set run in time proportional to the size of the change rather than the size of the data set. This is a major advantage for applications that work with large data sets that change frequently in small ways.

The DBSP computational engine is a component of the Feldera Continuous Analytics Platform.

Resources

Learning Materials

Documentation

Feedback

Some ways that you can give us your feedback:

Example

use dbsp::{operator::FilterMap, IndexedZSet, OrdIndexedZSet, OutputHandle, Runtime};

type Node = usize;
type Weight = isize;

// Creates a series of graph edges, then counts the number of times each source
// node appears, then counts the number of times each count appears.
fn main() {
    let edges = 100; // Number of initial edges in the graph.
    let sources = 13; // Number of source nodes in the graph.
    let extra = 5; // Number of extra edges added later to the graph.
    let threads = 2; // Number of threads.

    let (mut dbsp, (hedges, degrees, distribution)) = Runtime::init_circuit(threads, |circuit| {
        let (edges, hedges) = circuit.add_input_zset::<(Node, Node), Weight>();

        // Count the number of edges with each node as its source (each node's
        // out-degree).
        let degrees = edges.map(|(src, _dst)| *src).weighted_count();

        // Count the number of nodes with each out-degree.
        let distribution = degrees.map(|(_src, count)| *count).weighted_count();

        Ok((hedges, degrees.output(), distribution.output()))
    })
        .unwrap();

    // Add some initial edges and print the results.
    for i in 0..edges {
        hedges.push((i % sources, i % 7), 1);
    }
    dbsp.step().unwrap();
    println!("Initialization:");
    print_changes(&degrees, &distribution);

    // Add a few more nodes and print the changes.
    for i in 0..extra {
        hedges.push((i % sources, i % 9), 1);
    }
    dbsp.step().unwrap();
    println!("Changes:");
    print_changes(&degrees, &distribution);

    dbsp.kill().unwrap();
}

fn print_changes(
    degrees: &OutputHandle<OrdIndexedZSet<Node, isize, Weight>>,
    distribution: &OutputHandle<OrdIndexedZSet<isize, isize, Weight>>,
) {
    for (src, outdegree, weight) in degrees.consolidate().iter() {
        println!("    {weight:+}: Node {src} has out-degree {outdegree}");
    }
    for (outdegree, count, weight) in distribution.consolidate().iter() {
        println!("    {weight:+}: {count} nodes have out-degree {outdegree}");
    }
}

Dependencies

~39–54MB
~1M SLoC