#serde #serialization #no-std #iter #streaming

no-std serde_deser_iter

Iterate through serialized sequences allowing to aggregate them without deserializing to an allocated collection

1 unstable release

0.1.0 Oct 16, 2023

#713 in Encoding


325 lines

This crate offers two different ways to deserialize sequences without allocating.


Given the following JSON:

    {"id": 0, "name": "bob", "subscribed_to": ["rust", "knitting", "cooking"]},
    {"id": 1, "name": "toby 🐶", "subscribed_to": ["sticks", "tennis-balls"]},
    {"id": 2, "name": "alice", "subscribed_to": ["rust", "hiking", "paris"]},
    {"id": 3, "name": "mark", "subscribed_to": ["rust", "rugby", "doctor-who"]},
    {"id": 4, "name": "vera", "subscribed_to": ["rust", "mma", "philosophy"]}

we can process it without allocating a 5-sized vector of items as follow:

use serde_deser_iter::top_level::DeserializerExt;
# use std::{fs::File, io::BufReader, path::PathBuf, collections::HashSet};
/// The type each item in the sequence will be deserialized to.
struct DataEntry {
    // Not all fields are needed, but we could add "name"
    // and "id".
    subscribed_to: Vec<String>,

fn main() -> anyhow::Result<()> {
    # let example_json_path: PathBuf = [env!("CARGO_MANIFEST_DIR"), "examples", "data.json"]
    #   .iter()
    #   .collect();
    let buffered_file: BufReader<File> = BufReader::new(File::open(example_json_path)?);
    let mut json_deserializer = serde_json::Deserializer::from_reader(buffered_file);
    let mut all_channels = HashSet::new();

    json_deserializer.for_each(|entry: DataEntry| all_channels.extend(entry.subscribed_to))?;
    println!("All existing channels:");
    for channel in all_channels {
        println!("  - {channel}")

Top-level vs deep


The top_level module offers the most user friendly and powerful way to deserialize sequences. However, it is restricted to sequences defined at the top-level. For example it can work on each {"name": ...} from the following JSON

    {"name": "object1"},
    {"name": "object2"},
    {"name": "object3"}

but not if they are deeper in the structure:

    "result": [
        {"name": "object1"},
        {"name": "object2"},
        {"name": "object3"}


The deep module allows working on sequences located at any depth (and even nested one, though cumbersomely). However it does not allow to run closures on the iterated items, only functions, and its interface is less intuitive than top_level.