1 unstable release

0.1.0 Mar 14, 2024

#1045 in Rust patterns

MIT/Apache

26KB
496 lines

Frayed

Unfused and unashamed iterators under the hood, conventional iterators out the door

Introduction

Rust iterators[^0] come in a few varieties: fused, unfused, and now frayed. The variety is determined by how it behaves after .next() returns None.

pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

Fused

A fused iterator once .next() returns None will only ever return None. It is said to be exhausted at the first None. This behavior can be guaranteed with the .fuse() method that returns a std::iter::Fuse iterator.

Unfused

The majority of iterators are unfused. They have no programmatic guarantees like Fuse but the producers and consumers have tacitly agreed that it's impolite to ask .next() of an iterator who has already said None. It is expected to be exhausted at the first None.

Frayed

Frayed iterators delight in furnishing further elements after returning None. They can economically represent multiple sequences. They are not indefatigably barbaric, however; a frayed iterator is expected to be exhausted when it returns two Nones consecutively.

Usage

Suppose we have an iterator that represents multiple sequences. For instance SevenIteris a unfused iterator that represents these sequences: [1, 2], [4, 5], and [7].

use frayed::{Frayed, FrayedTools, FraughtTools};

/// This iterator skips every number divisible by 3 up to seven.
struct SevenIter(u8);

/// SevenIter's .next() returns Some(1), Some(2), None, Some(4), Some(5), None, 
/// Some(7), None, None, and so on.
impl Iterator for SevenIter {
    type Item = u8;
    fn next(&mut self) -> Option<u8> {
        self.0 += 1;
        (self.0 % 3 != 0 && self.0 <= 7).then_some(self.0)
    }
}

/// Mark iterator with `.frayed() or impl the `Frayed` marker trait.
let frayed_iter = SevenIter(0).frayed();
/// Defray the frayed iterator into an iterator of iterators.
for subiter in &frayed_iter.defray() {
    for i in subiter {
        print!("{i} ");
    }
    println!()
}

The above will produce this output

1 2 
4 5
7 

Extensions

FrayedTools

FrayedTools extends iterators marked Frayed.

FraughtTools

FraughtTools extends regular iterators but often either accepts frayed iterators as arguments or returns frayed iterators.

Q&A

Why restrict FrayedTools to only Frayed iterators?

Because dealing with an iterator that conceptually represents many iterators is confusing: Sometimes you want to map over all the elements. Sometimes you want to map over the subsequences. This way the compiler can help us.

For instance if we forget to mark our iterator as "frayed", we can't "defray" it.

let frayed_iter = SevenIter(0); // .frayed();
let _ = &frayed_iter.defray(); // Not marked frayed. No `defray()` method.

Defray isn't an iterator. How can I "map" over it?

Defray implements IntoIterator for &Defray. The problems appear when one wants to return a Defray but say .map() its output. If one can do what they need with the underlying frayed iterator, use defrayed.into_inner() to retrieve it, process it, and consider .defray()-ing it again.

If one wants to process the iterators and not the underlying elements, that remains an open question as to how best to do that. Perhaps Defray itself could provide a map<F,Y>(self, f: F) -> Map<Defray<Iter<Item=X>>, F, Y> where F: FnMut(Iter<Item=X>) -> Y) function.

Motivation

When writing an iterator implementation by hand that represents multiple sequences, it is usually easy to write a Iterator<Item = Vec<T>> even though allocating and collecting a Vec is not essential or desired. However, writing a Iterator<Item = SubIter<T>> requires tricky machinery[^1] if one hopes to respect the API that entails, e.g., the SubIters can be consumed out of order or dropped.

If one writes instead a "frayed" iterator Iterator<Item = T>---where the None represents the end of a subsequence not the end of the iterator---that is often much easier. One can consume these iterators with a some care but they remain unconventional and surprising.

fn raw_consume_unfused<T: std::fmt::Display>(frayed: impl Iterator<Item = T>) {
    let mut frayed = frayed.peekable();
    loop {
        // Consume the subsequence.
        for i in frayed.by_ref() {
            print!("{} ", i);
        }
        println!();
        // Check for second None that means frayed iterator is exhausted.
        if frayed.peek().is_none() {
            break;
        }
    }
}

The initial motivation of this crate is to make it easy for "frayed" iterators to be consumed by the uninitiated. Consider instead this code:

use frayed::*;
fn raw_consume_frayed<T: std::fmt::Display>(frayed: impl Iterator<Item = T> + Frayed) {
    for subiter in &frayed.defray() {
        for i in subiter {
            print!("{} ", i);
        }
    }
}

But it would be even better if producers kept their frayed iterators under the covers and then exposed the abstractions that we're all used to.

fn chunks(&self) -> frayed::Defray<Iter> {
    // ...
}

fn main() {
    let obj = ...;
    for subiter in &obj.chunks() {
        for i in subiter {
            print!("{} ", i);
        }
    }
}

[^0]: Rust iterators have a wonderfully succinct trait.

[^1]: See GroupBy implementation in itertools for an example.

No runtime deps