5 releases

0.2.1 Feb 26, 2020
0.2.0 Feb 26, 2020
0.1.2 Oct 21, 2019
0.1.1 Jan 1, 2018
0.1.0 Jan 1, 2018

#2065 in Rust patterns

MIT license

24KB
312 lines

Lazy Transducer Build Status

Lazy transducers are generic, lazy, parallel, indexable iterators transforming one data source into n output data types.

See the online documentation for more information.

Using

Add this to your Cargo.toml

[dependencies]
lazy_transducer = "0.1"

Example

extern crate lazy_transducer;
extern crate rayon;

use rayon::prelude::*;
use lazy_transducer::LazyTransducer;

fn main() {
  let data = [0xdeadbeefu32, 0xcafed00d];
  let lt: LazyTransducer<&[u32], u64> = LazyTransducer::new(&data, 2, |input, idx| input[idx] as u64);

  let cafedood = lt.get(1).expect("has 2 elements");
  assert_eq!(cafedood, 0xcafed00d);

  lt.into_par_iter().for_each(|elem| {
    println!("{}", elem);
  });
}

lib.rs:

Introduction

A lazy transducer is a declarative specification for the transformation of one type of data to another.

The transformer is called the transducer, which receives the original source input, and an index corresponding to the ith element, and returns the corresponding ith element out of that source.

Importantly, lazy transducers are:

  1. Lazy - it never parses any elements unless you request it to do so
  2. Iterable - one can iterate over every element
  3. Indexable - accessing an element is O(1)
  4. Parallel - one can iterate in parallel over every element

When constructing a lazy transducer, it's important to remember that the transducer must be a fn with signature (Input, usize -> Output); a closure which does not capture its environment also can be used.

The input source needs to be copy; i.e., it should typically be a reference to Vec<_>, or a slice of bytes, etc.

Another detail that you should keep in mind is that the index is the ith element; therefore if your backing data source is bytes, you need to know the fixed length of your data structure you are parsing out of the byte slice. It is up to the transducer implementation to determine this.

In some cases, the size of the data structure/output element isn't known statically via sizeof, etc. In these cases it is suggested to pass this "context" in the input source during construction as a tuple, e.g.: (datasource, context), and have your transducer use the datasource + context + the index to correctly marshall your data structure out of the bytes.

This is the approach that the scroll-based transducer takes. See also the bincode example for a similar approach.

The parallel implementation uses rayon.

Example

extern crate lazy_transducer;
extern crate rayon;

use rayon::prelude::*;
use lazy_transducer::LazyTransducer;

use std::mem::size_of;
use std::mem::transmute;

let bytes: Vec<u8> = vec![1u8, 0, 2, 0, 3, 0, 4, 0];
let lt: LazyTransducer<_, &u16> = LazyTransducer::new(&bytes, 4, |input, index| {
    let start = size_of::<u16>() * index;
    unsafe { transmute::<_, &u16>(&input[start]) }
});

// now iterate in parallel
lt.into_par_iter().for_each(|n| {
  // do stuff
  println!("{}", n);
});

Dependencies

~1.5–2.1MB
~44K SLoC