12 releases

new 0.5.1 Apr 24, 2024
0.5.0 Apr 24, 2024
0.3.9 Dec 4, 2023
0.3.8 Nov 17, 2023
0.3.0 Jul 11, 2023

#1028 in Concurrency

Download history 669/week @ 2024-01-06 901/week @ 2024-01-13 738/week @ 2024-01-20 787/week @ 2024-01-27 809/week @ 2024-02-03 563/week @ 2024-02-10 927/week @ 2024-02-17 1067/week @ 2024-02-24 943/week @ 2024-03-02 871/week @ 2024-03-09 865/week @ 2024-03-16 624/week @ 2024-03-23 730/week @ 2024-03-30 589/week @ 2024-04-06 887/week @ 2024-04-13 1056/week @ 2024-04-20

3,381 downloads per month
Used in 8 crates (4 directly)

MIT/Apache

295KB
7K SLoC

Rust 5.5K SLoC // 0.0% comments TypeScript 1.5K SLoC // 0.1% comments Scheme 26 SLoC JavaScript 4 SLoC

sea-streamer-file: File Backend

This is very similar to sea-streamer-stdio, but the difference is SeaStreamerStdio works in real-time, while sea-streamer-file works in real-time and replay. That means, SeaStreamerFile has the ability to traverse a .ss (sea-stream) file and seek/rewind to a particular timestamp/offset.

In addition, Stdio can only work with UTF-8 text data, while File is able to work with binary data. In Stdio, there is only one Streamer per process. In File, there can be multiple independent Streamers in the same process. Afterall, a Streamer is just a file.

The basic idea of SeaStreamerFile is like a tail -f with one message per line, with a custom message frame carrying binary payloads. The interesting part is, in SeaStreamer, we do not use delimiters to separate messages. This removes the overhead of encoding/decoding message payloads. But it added some complexity to the file format.

The SeaStreamerFile format is designed for efficient fast-forward and seeking. This is enabled by placing an array of Beacons at fixed interval in the file. A Beacon contains a summary of the streams, so it acts like an inplace index. It also allows readers to align with the message boundaries. To learn more about the file format, read src/format.rs.

On top of that, are the high-level SeaStreamer multi-producer, multi-consumer stream semantics, resembling the behaviour of other SeaStreamer backends. In particular, the load-balancing behaviour is same as Stdio, i.e. round-robin.

Decoder

We provide a small utility to decode .ss files:

cargo install sea-streamer-file --features=executables --bin ss-decode
 # local version
alias ss-decode='cargo run --package sea-streamer-file --features=executables --bin ss-decode'
ss-decode -- --file <file> --format <format>

Pro tip: pipe it to less for pagination

ss-decode --file mystream.ss | less

Example log format:

 # header
[2023-06-05T13:55:53.001 | hello | 1 | 0] message-1
 # beacon

Example ndjson format:

/* header */
{"header":{"stream_key":"hello","shard_id":0,"sequence":1,"timestamp":"2023-06-05T13:55:53.001"},"payload":"message-1"}
/* beacon */

There is also a Typescript implementation under sea-streamer-file-reader.

TODO

  1. Resumable: currently unimplemented. A potential implementation might be to commit into a local SQLite database.
  2. Sharding: currently it only streams to Shard ZERO.
  3. Verify: a utility program to verify and repair SeaStreamer binary file.

Dependencies

~6–22MB
~297K SLoC