5 releases

0.2.3 May 20, 2021
0.2.2 May 2, 2021
0.2.1 Apr 29, 2021
0.2.0 Apr 28, 2021
0.1.0 Feb 3, 2021

#2215 in Parser implementations

45 downloads per month
Used in 3 crates

MIT/Apache

120KB
3.5K SLoC

Messy Json

Rust JSON Parser for dynamically structured documents

Introduction

The Rust ecosystem offers very good compile-time generation of JSON deserializers for Rust structures. However, options become sparse when it comes to run-time deserialization of dynamically structured objects. This crate approaches this problem in a simple manner, resembling serde_json's Value.

Example

    use messy_json::*;
    use serde::de::DeserializeSeed;

    let nested_string = MessyJson::from(MessyJsonInner::String(MessyJsonScalar::new(false)));
    let schema: MessyJson = MessyJson::from(MessyJsonInner::Obj(MessyJsonObject::from(
        MessyJsonObjectInner::new(
            vec![(arcstr::literal!("hello"), nested_string)]
                .into_iter()
                .collect(),
            false,
        ),
    )));
    let value = r#"
    {
        "hello": "world"
    }
    "#;

    let mut deserializer = serde_json::Deserializer::from_str(value);
    let parsed: MessyJsonValueContainer = schema
        .builder(MessyJsonSettings::default())
        .deserialize(&mut deserializer)
        .unwrap();

    println!("{:#?}", parsed);
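To make the idea of a schema that only exists at run time more concrete, here is a minimal, dependency-free sketch. The types Schema, Field and Value and the function conforms are hypothetical names invented for this illustration; they are not messy_json's API and do not reflect how the crate is implemented.

```rust
use std::collections::BTreeMap;

/// Hypothetical run-time schema description (not messy_json's types).
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    String,
    Object(BTreeMap<String, Field>),
}

/// Hypothetical object field: its expected shape and whether it may be absent.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    schema: Schema,
    optional: bool,
}

/// Hypothetical parsed value, in the spirit of serde_json's Value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Object(BTreeMap<String, Value>),
}

/// Returns true when `value` has the shape described by `schema`.
fn conforms(schema: &Schema, value: &Value) -> bool {
    match (schema, value) {
        (Schema::String, Value::String(_)) => true,
        (Schema::Object(fields), Value::Object(entries)) => {
            // Every key that is present must be declared and well-shaped.
            let known = entries
                .iter()
                .all(|(key, v)| fields.get(key).map_or(false, |f| conforms(&f.schema, v)));
            // Every non-optional field must be present.
            let complete = fields
                .iter()
                .all(|(key, f)| f.optional || entries.contains_key(key));
            known && complete
        }
        _ => false,
    }
}

/// Builds the schema `{ "hello": <string> }` at run time.
fn hello_schema() -> Schema {
    let mut fields = BTreeMap::new();
    fields.insert(
        "hello".to_string(),
        Field { schema: Schema::String, optional: false },
    );
    Schema::Object(fields)
}

fn main() {
    let schema = hello_schema();
    let mut entries = BTreeMap::new();
    entries.insert("hello".to_string(), Value::String("world".to_string()));
    println!("conforms: {}", conforms(&schema, &Value::Object(entries)));
}
```

Unlike a derived struct, the schema here is ordinary data, so it could be loaded from a configuration file or built from user input before any document is parsed.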

Performance

This crate is more efficient than serde_json's Value when all the fields are required. Its performance is on par with serde_json's Value when some fields are optional.

However, this crate is far slower than deserializing with the proc-macro from serde (which is not dynamically structured at all).

This gap could be narrowed by using a custom arena-based allocator, such as Bumpalo, once the Allocator trait is stabilized.
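As a rough illustration of the arena idea, the sketch below stores every node of a parsed document in one growable buffer and links nodes by index instead of allocating each node separately. It is neither Bumpalo nor the Allocator trait: Arena, Node, NodeId and build_dummy are hypothetical names, and the strings and entry vectors still allocate on their own. A real bump allocator would place those in the arena as well.

```rust
/// Index of a node inside the arena (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct NodeId(usize);

/// A parsed JSON node; children are referenced by index, not by pointer.
#[derive(Debug)]
enum Node {
    Str(String),
    /// Object entries: (key, id of the child node).
    Obj(Vec<(String, NodeId)>),
}

/// All nodes of one document live in a single growable buffer.
#[derive(Debug)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Reserves room for `capacity` nodes up front.
    fn with_capacity(capacity: usize) -> Self {
        Arena { nodes: Vec::with_capacity(capacity) }
    }

    /// Stores a node and returns its index.
    fn alloc(&mut self, node: Node) -> NodeId {
        self.nodes.push(node);
        NodeId(self.nodes.len() - 1)
    }

    fn get(&self, id: NodeId) -> &Node {
        &self.nodes[id.0]
    }

    /// Looks up `key` in the node `id`; returns None for non-objects.
    fn field(&self, id: NodeId, key: &str) -> Option<NodeId> {
        match self.get(id) {
            Node::Obj(entries) => entries
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, child)| *child),
            Node::Str(_) => None,
        }
    }
}

/// Builds `{ "hello": { "hola": "world" } }` bottom-up inside the arena.
fn build_dummy(arena: &mut Arena) -> NodeId {
    let world = arena.alloc(Node::Str("world".to_string()));
    let inner = arena.alloc(Node::Obj(vec![("hola".to_string(), world)]));
    arena.alloc(Node::Obj(vec![("hello".to_string(), inner)]))
}

fn main() {
    let mut arena = Arena::with_capacity(8);
    let root = build_dummy(&mut arena);
    let hola = arena
        .field(root, "hello")
        .and_then(|id| arena.field(id, "hola"));
    println!("{:?}", hola.map(|id| arena.get(id)));
}
```

Because the buffer is reserved once and dropped as a whole, the per-node allocation and deallocation cost disappears, which is the effect an arena allocator aims for.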

This crate ships with benchmarks. The results below were obtained on a machine with the following specs:

  • CPU : Intel i9-9900K @ 4.7 GHz
  • RAM : 32 GB @ 2133 MHz
  • Kernel : 5.11.16-arch1-1
  • Rust : rustc 1.51.0 (2fd73fabe 2021-03-23)

In the following benchmarks, the messy_json crate is compared with deserialization into serde_json's Value and with the macro-generated deserializer produced by serde's derive.

Dummy object

This benchmark deserializes the following JSON document:

{
	"hello":
	{
		"hola": "world"
	}
}

The accepted schema should look like the following:

use std::borrow::Cow;

struct DummyObjNested<'a> {
    hola: Cow<'a, str>,
}

struct DummyObj<'a> {
    hello: DummyObjNested<'a>,
}

The results show that messy_json is slower than the macro-generated deserializer but faster than serde_json's Value.

Partial object

This benchmark deserializes the following JSON document:

{
	"hello":
	{
		"hola": "world"
	}
}

The accepted schema should look like the following:

use serde::{Serialize, Deserialize};
use std::borrow::Cow;

#[derive(Serialize, Deserialize)]
struct PartialObjNested<'a> {
    hola: Cow<'a, str>,
}

#[derive(Serialize, Deserialize)]
struct PartialObj<'a> {
    hello: PartialObjNested<'a>,
    coucou: Option<Cow<'a, str>>,
    coucou1: Option<Cow<'a, str>>,
    coucou2: Option<Cow<'a, str>>,
}

The results show that messy_json is slower than the macro-generated deserializer and on par with serde_json's Value. When a schema contains optional values, this crate has to check that every mandatory value has been met for each object, hence the performance regression. Once the Rust allocator API (allocator_api) is stabilized, optimizations could reduce the time needed to check for missing fields.
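The extra work caused by optional fields can be pictured with a small, dependency-free sketch: while the keys of an object are read, a bitmask records which declared fields were seen, and at the end every mandatory field must have its bit set. FieldSpec, missing_mandatory and partial_obj_spec are hypothetical names for this illustration only; this is one possible approach, not necessarily the one messy_json uses.

```rust
/// Hypothetical field description: a name and whether it may be absent.
struct FieldSpec {
    name: &'static str,
    optional: bool,
}

/// Returns the names of mandatory fields that do not appear in `seen_keys`.
fn missing_mandatory(spec: &[FieldSpec], seen_keys: &[&str]) -> Vec<&'static str> {
    // One bit per declared field; this sketch assumes at most 64 fields.
    assert!(spec.len() <= 64);
    let mut seen: u64 = 0;
    for key in seen_keys {
        // Keys that are not declared in the spec are simply ignored here.
        if let Some(pos) = spec.iter().position(|f| f.name == *key) {
            seen |= 1u64 << pos;
        }
    }
    spec.iter()
        .enumerate()
        .filter(|(pos, f)| !f.optional && (seen & (1u64 << *pos)) == 0)
        .map(|(_, f)| f.name)
        .collect()
}

/// Field list mirroring the PartialObj struct shown above.
fn partial_obj_spec() -> Vec<FieldSpec> {
    vec![
        FieldSpec { name: "hello", optional: false },
        FieldSpec { name: "coucou", optional: true },
        FieldSpec { name: "coucou1", optional: true },
        FieldSpec { name: "coucou2", optional: true },
    ]
}

fn main() {
    let spec = partial_obj_spec();
    // Only the mandatory key is present: nothing is missing.
    println!("{:?}", missing_mandatory(&spec, &["hello"]));
    // Only an optional key is present: "hello" is reported as missing.
    println!("{:?}", missing_mandatory(&spec, &["coucou"]));
}
```

When every field is mandatory, a simple count of the keys seen is enough, which is one plausible reason the all-required case can be cheaper.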

Simple object

This benchmark deserializes the following JSON document:

{
	"hello": "world"
}

The accepted schema should look like the following:

use std::borrow::Cow;
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct SimpleObj<'a> {
    hello: Cow<'a, str>,
}

The results show that messy_json is slower than the macro-generated deserializer but still faster than serde_json's Value.

Dependencies

~1–1.7MB
~33K SLoC