29 releases (breaking)

Uses new Rust 2021

0.90.0 Nov 25, 2022
0.70.0 Sep 30, 2022
0.60.0 May 24, 2022
0.44.0 Feb 21, 2022
0.1.0 Jan 29, 2018

#5 in #data-flow


130 downloads per month
Used in flowide

MIT license

435KB
9K SLoC


Welcome!

Welcome to flow, a project for defining, compiling and running parallel dataflow programs like this one: a visual representation (generated by the compiler from the flow definition and rendered with graphviz) of a flow program that generates a sequence of fibonacci numbers.

If you are a programmer, your intuition will probably already tell you a lot about how flow works without any explanation.

(Figure: First flow)

This is one of many samples that can be found in the flow samples crate of the flow project, and the first thing I got working (much to my own delight!).

What is a dataflow program?

A dataflow program consists of a graph of processes that act on data flowing between them over defined connections. In this case the graph is hierarchical: a process within it can itself be another graph of processes, and so on down.

  • It is declarative: it defines which processes are used and how they are connected.
  • Processes are small, single-purpose and "pure". They take a series of inputs, execute an algorithm (probably written in some procedural language) and produce an output.
  • The application used to run one (a "flow runner" in this case) provides ways for it to interact with the execution environment via "impure" functions, for things like STDIO, writing to a file system, etc.
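The ideas above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Process` enum, `count_functions`, `outline`, `add` and `stdout` are hypothetical names invented for this sketch, not flow's real data structures or API.

```rust
// Hypothetical types for illustration only; not flow's real data structures.

/// A process is either a leaf function or a nested flow (a graph of processes).
enum Process {
    Function { name: &'static str },
    Flow { name: &'static str, processes: Vec<Process> },
}

/// Count the leaf functions in a (possibly nested) process.
fn count_functions(process: &Process) -> usize {
    match process {
        Process::Function { .. } => 1,
        Process::Flow { processes, .. } => processes.iter().map(count_functions).sum(),
    }
}

/// Write an indented outline of the hierarchy into `out`.
fn outline(process: &Process, depth: usize, out: &mut String) {
    let indent = "  ".repeat(depth);
    match process {
        Process::Function { name } => out.push_str(&format!("{indent}{name}\n")),
        Process::Flow { name, processes } => {
            out.push_str(&format!("{indent}{name}/\n"));
            for p in processes {
                outline(p, depth + 1, out);
            }
        }
    }
}

/// A "pure" process: the output depends only on the inputs, with no side effects.
fn add(a: u64, b: u64) -> u64 {
    a + b
}

/// An "impure" function of the kind a runner would provide: it writes to STDOUT.
fn stdout(value: u64) {
    println!("{value}");
}

fn main() {
    let flow = Process::Flow {
        name: "fibonacci",
        processes: vec![
            Process::Function { name: "add" },
            Process::Flow {
                name: "output",
                processes: vec![Process::Function { name: "stdout" }],
            },
        ],
    };
    let mut text = String::new();
    outline(&flow, 0, &mut text);
    print!("{text}");
    println!("leaf functions: {}", count_functions(&flow));
    stdout(add(1, 2));
}
```

Keeping `add` free of side effects is what lets a runner execute it anywhere and at any time its inputs are present; only `stdout` needs access to the execution environment.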

What characteristics do they have?

Why is writing a dataflow program something interesting to explore in the first place?

Well, dataflow programs define the program in terms of the processing steps that need to be done on data and the dependencies between the data, making them inherently parallelizable and distributable (and in my mind, kind of the minimal essence or expression of the algorithm).
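As a rough illustration of why data dependencies alone are enough to expose parallelism, here is a minimal sketch in plain Rust using `std::thread::scope`. It is not how flow's runner is implemented; `square`, `double`, `add` and `run` are hypothetical names chosen for this example.

```rust
use std::thread;

fn square(x: u64) -> u64 {
    x * x
}

fn double(x: u64) -> u64 {
    x * 2
}

fn add(a: u64, b: u64) -> u64 {
    a + b
}

/// `square` and `double` share no data, so they may run at the same time.
/// `add` depends on both of their outputs, so it runs after them.
fn run(x: u64, y: u64) -> u64 {
    let (a, b) = thread::scope(|s| {
        let left = s.spawn(|| square(x));
        let right = s.spawn(|| double(y));
        (left.join().unwrap(), right.join().unwrap())
    });
    add(a, b)
}

fn main() {
    // square(3) = 9, double(4) = 8, add(9, 8) = 17
    println!("{}", run(3, 4));
}
```

Nothing in the code says "run these two in parallel" as a control-flow decision; the absence of a data dependency between them is what makes it safe.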

Processes only run on data when it is available, making them "event driven" (where the "event" is the availability of data...or alternatively, the data expresses an event that needs processing done on it and some output created). They are not focussed so much on the procedural steps that need to be done and the control flow of the same, but on the required transformations to the data and on data flow through the program.
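The "runs only when data is available" idea can be sketched as follows. This is a simplified model under assumptions of my own: the `Add` struct, its input queues and the feedback wiring in `fibonacci` are hypothetical and do not reflect flow's actual runtime or the real fibonacci sample definition.

```rust
use std::collections::VecDeque;

/// Hypothetical two-input process: it is ready only when both inputs hold a value.
struct Add {
    inputs: [VecDeque<u64>; 2],
}

impl Add {
    fn new() -> Self {
        Add { inputs: [VecDeque::new(), VecDeque::new()] }
    }

    /// Deliver a value to one of the inputs (the "event").
    fn send(&mut self, input: usize, value: u64) {
        self.inputs[input].push_back(value);
    }

    fn ready(&self) -> bool {
        self.inputs.iter().all(|queue| !queue.is_empty())
    }

    /// Run once if ready, consuming one value per input.
    /// Returns the second input and the sum, or None if data is missing.
    fn run(&mut self) -> Option<(u64, u64)> {
        if !self.ready() {
            return None;
        }
        let a = self.inputs[0].pop_front()?;
        let b = self.inputs[1].pop_front()?;
        Some((b, a + b))
    }
}

/// Drive the process with feedback connections to produce `count` values.
fn fibonacci(count: usize) -> Vec<u64> {
    let mut add = Add::new();
    let mut series = Vec::new();
    // Initial values on the two inputs.
    add.send(0, 0);
    add.send(1, 1);
    while series.len() < count {
        match add.run() {
            Some((previous, sum)) => {
                series.push(sum);
                // Outputs are fed back to the inputs, making the process ready again.
                add.send(0, previous);
                add.send(1, sum);
            }
            None => break, // no data available, so nothing runs
        }
    }
    series
}

fn main() {
    let mut idle = Add::new();
    idle.send(0, 1);
    // Only one input has data, so the process does not run.
    println!("{:?}", idle.run()); // prints "None"
    println!("{:?}", fibonacci(5)); // prints "[1, 2, 3, 5, 8]"
}
```

There is no loop counter or explicit control flow inside `Add`; the loop in `fibonacci` only plays the role of a scheduler that runs whatever is ready.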

What does flow include?

Currently, flows are defined declaratively in a text file (toml, json or yaml supported) that is then compiled to a flow manifest, which is executed. See Are we GUI yet? below for more on that.

The flow project includes:

  • Compiler: a library and a binary (flowclib and flowc) for compiling flows
  • Runner: a library and two binaries (flowrlib, flowr and flowrex) for running flows, including a command line debugger for debugging flows.
  • Standard Library: flowstdlib library of pre-defined flows and functions that can be re-used in flows
  • Samples: A set of sample flows to illustrate flow programming (more to come!)
  • Docs: Extensive documentation in the book on defining flows, the runtime semantics, a programmer's guide, the tools' command line options and how to use them, the flowstdlib library functions and flows, flowr's context functions and more. The guide, the linked code docs and the Rust "doc tests" are all published together online here.
  • How to build flow locally and contribute to it
  • Internal design and how some things are implemented

What made me want to do it?

You can read more about what made me want to do this project, based on ideas gathered over a few decades on and off (combined with looking for a "real" project to use to learn Rust!), in the book's Inspirations for flow section. The core reason is that, having stopped being a Software Engineer many years ago, I wanted to know if I could build it based on the rough ideas and intuition I had in my head (with no real formal knowledge in this area or reading of books and papers; that came later, after I did it).

I worked on the programming "semantics" as I implemented the samples. It's been a journey of discovery: of writing something like this (for me), learning rust in the process and learning how such a programming paradigm could work. I learned it could work, but requires a change in how you think about programming (with procedural programming so ingrained in us). Sometimes I struggled to think about relatively simple algorithms in a completely new way (reminds me of when I got stuck trying to write a loop, in Prolog, in University...you're thinking about it wrong...).

Building flow

For more details on how to build flow locally and contribute to it, please see building flow.

Running your first 'flow'

Run the 'fibonacci' sample flow. If you have already run make (which installs the flowc and flowr binaries for you), use:

flowc flowsamples/fibonacci

If not, use:

cargo run -p flowc -- flowsamples/fibonacci

You should get a fibonacci series of numbers output to the terminal.

The first flow section of the guide walks you through it.

Are we GUI yet?

Dataflow programming, declaratively defining a graph of processes (nodes) and connections (edges), fits naturally with a visualization of the graph (unlike the current text format). The ability to define flows, view their execution and debug them with a visual tool would be great! Such a tool could avoid the "hard work" of writing flow definition text files by producing files in the flow definition formats supported by the flowc compiler. I have ideas for an IDE and have experimented a little, but it remains one big chunk of work I'd like to take on at some point.

What's next?

I generate ideas for ways to improve the project faster than I can implement them in my spare time, so over time I have accumulated many, many issues in GitHub, and had to organize them into a GitHub project with columns and attack them kanban-style to stop me going mad. I still have plenty left and continue to generate new ones all the time.

Probably the most important ones for external observers will be those related to producing a GUI to make it more approachable, adding new context functions to allow integrations with the web and other systems being used, and providing more compelling samples closer to "real world problems".

Feedback and/or Encouragement

You can open an issue or email me to let me know what you think.

If you want to encourage me, even with a "token gesture", you can ["patreonize me"](https://www.patreon.com/andrewmackenzie).

Thanks for reading this far!

Andrew

Dependencies

~23–30MB
~635K SLoC