#byte #serialization #quick #concise #serde-json #language #dbor

serde_dbor

A quick and concise serialization language designed for Rust

3 stable releases

Uses old Rust 2015

1.0.2 Jun 8, 2018
1.0.1 Jun 7, 2018
1.0.0 Mar 6, 2018

MIT license

90KB
2K SLoC

DBOR - Dq's Binary Object Representation

DBOR is a serialization format based on CBOR, designed for Rust, and optimized for speed and file size. It uses buffered reading and writing systems when interacting with io streams for maximum efficiency.

Example Usage

(derived from serde_json's tutorial)

Cargo.toml

[dependencies]
serde = "*"
serde_derive = "*"
serde_dbor = "*"

main.rs

extern crate serde;
extern crate serde_dbor;

#[macro_use]
extern crate serde_derive;

use serde_dbor::Error;

#[derive(Serialize, Deserialize)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>
}

fn example<'a>(data: &'a [u8]) -> Result<(), Error> {
    // Parse the data into a Person object.
    let p: Person = serde_dbor::from_slice(data)?;

    // Do things just like with any other Rust data structure.
    println!("Please call {} at the number {}", p.name, p.phones[0]);

    Ok(())
}

Spec

DBOR, just like CBOR, is composed of instruction bytes and additional content bytes. However, in DBOR, every item needs to be described before its content, meaning that indefinite-length arrays, strings, or maps are not allowed because they would require a termination byte at the end of the item.

An instruction byte is split up into two sections of 3 bits and 5 bits, respectively. The first 3 bits define the type of the item, and the last 5 are a parameter for that item, which in some cases can be the value of the item itself. For example, an unsigned integer with a value of 21 would be stored as 0x15, or 0b000 10101, because type 0 (0b000) is a uint and the byte has enough space left over to encode the number 21 (0b10101).
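
As a minimal sketch of the 3-bit/5-bit split described above, the two helpers below unpack and pack an instruction byte with plain bit operations. The function names are hypothetical and are not part of the serde_dbor API.

```rust
/// Splits a DBOR instruction byte into (type id, parameter).
/// The top 3 bits are the type id, the low 5 bits are the parameter.
fn split_instruction(byte: u8) -> (u8, u8) {
    let type_id = byte >> 5;
    let param = byte & 0b0001_1111;
    (type_id, param)
}

/// Packs a type id (0-7) and a parameter (0-31) into one instruction byte.
fn make_instruction(type_id: u8, param: u8) -> u8 {
    debug_assert!(type_id < 8 && param < 32);
    (type_id << 5) | (param & 0b0001_1111)
}
```

For the byte 0x15 from the text, split_instruction returns type 0 (uint) with parameter 21.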

When an instruction byte indicates that the parameter is of a certain size n, the next n bytes hold that parameter, and the content of the item described by the instruction byte follows them. For example, a u16 parameter takes up the two bytes immediately after the instruction byte. However, when serializing a u16, the value may be shortened into a u8 or into the instruction byte itself if it is small enough. DBOR stores multi-byte integers and floats in little-endian order, which makes serialization and deserialization faster on most machines (x86 is little-endian).
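
The shortening rule can be sketched as follows, using the uint parameter codes from the Instruction Bytes table (24 = u8, 25 = u16, 26 = u32, 27 = u64). This is an illustration written from the spec text, not the crate's actual serializer, and encode_uint is a hypothetical name.

```rust
// Type id for unsigned integers, taken from the Instruction Bytes table.
const TYPE_UINT: u8 = 0b000;

/// Encodes an unsigned integer in the shortest form the spec allows,
/// writing any multi-byte parameter in little-endian order.
fn encode_uint(value: u64) -> Vec<u8> {
    let tag = TYPE_UINT << 5;
    let mut out = Vec::new();
    if value <= 23 {
        // Small values fit directly in the 5-bit parameter.
        out.push(tag | value as u8);
    } else if value <= u8::MAX as u64 {
        out.push(tag | 24);
        out.push(value as u8);
    } else if value <= u16::MAX as u64 {
        out.push(tag | 25);
        out.extend_from_slice(&(value as u16).to_le_bytes());
    } else if value <= u32::MAX as u64 {
        out.push(tag | 26);
        out.extend_from_slice(&(value as u32).to_le_bytes());
    } else {
        out.push(tag | 27);
        out.extend_from_slice(&value.to_le_bytes());
    }
    out
}
```

With this sketch, a u16 field holding 21 takes one byte, one holding 0x27 takes two, and one holding 0x1234 takes three, with the low byte first.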

Instruction Bytes

Type ID Encoded Type Parameter Descriptions
0b000 (0) uint
  • 0-23 - values 0-23
  • 24 - u8
  • 25 - u16
  • 26 - u32
  • 27 - u64
  • 28-31 - reserved
0b001 (1) int
  • 0-15 - values 0-15
  • 16-23 - values -8 to -1
  • 24 - i8
  • 25 - i16
  • 26 - i32
  • 27 - i64
  • 28-31 - reserved
0b010 (2) misc
  • 0 - false
  • 1 - true
  • 2 - ()
  • 3 - None
  • 4 - f32
  • 5 - f64
  • 6-31 - reserved
0b011 (3) variant (enum)
  • 0-23 - variant ids 0-23
  • 24 - variant id as u8
  • 25 - variant id as u16
  • 26 - variant id as u32
  • 27 - named variant (see below)
  • 28-31 - reserved
0b100 (4) seq (array/tuple/struct)
  • 0-23 - length of 0-23
  • 24 - length as u8
  • 25 - length as u16
  • 26 - length as u32
  • 27 - length as u64 (only on 64-bit machines)
  • 28-31 - reserved
0b101 (5) bytes (string/byte array)
  • 0-23 - length of 0-23
  • 24 - length as u8
  • 25 - length as u16
  • 26 - length as u32
  • 27 - length as u64 (only on 64-bit machines)
  • 28-31 - reserved
0b110 (6) map
  • 0-23 - length of 0-23
  • 24 - length as u8
  • 25 - length as u16
  • 26 - length as u32
  • 27 - length as u64 (only on 64-bit machines)
  • 28-31 - reserved
0b111 (7) reserved
  • 0-31 - reserved
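
To illustrate how the table might be read on the decoding side, here is a small sketch that parses the length header shared by seq (4), bytes (5), and map (6) items. The function name and return shape are assumptions made for this example, and it does not reflect how serde_dbor itself is implemented.

```rust
/// Reads the length header of a seq (4), bytes (5), or map (6) item.
/// Returns (type id, length, header size in bytes), or None if the input
/// is another type, uses a reserved parameter, or is truncated.
fn read_length_header(input: &[u8]) -> Option<(u8, u64, usize)> {
    let first = *input.first()?;
    let type_id = first >> 5;
    let param = first & 0b0001_1111;
    if !(4..=6).contains(&type_id) {
        return None;
    }
    // Parameters 24-27 select a little-endian length of 1, 2, 4, or 8 bytes.
    let width = match param {
        0..=23 => return Some((type_id, param as u64, 1)),
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return None, // 28-31 are reserved
    };
    let raw = input.get(1..1 + width)?;
    // Zero-pad the high bytes so every width can be read as a u64.
    let mut buf = [0u8; 8];
    buf[..width].copy_from_slice(raw);
    Some((type_id, u64::from_le_bytes(buf), 1 + width))
}
```

The content of the item starts at the returned header size, so a caller can slice the input there and continue decoding.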

Named Variant Byte

  • 0-247 - name length of 0-247
  • 248 - name length as u8
  • 249 - name length as u16
  • 250 - name length as u32
  • 251 - name length as u64 (only on 64-bit machines)
  • 252-255 - reserved

Note: serialization using named variants isn't currently implemented, but deserialization is.

Example Data

Rust Code

struct Data {
    some_text: String,
    a_small_number: u64,
    a_byte: u8,
    some_important_numbers: Vec<u16>,
}

let data = Data {
    // A String field needs an owned value, not a &str literal.
    some_text: "Hello world!".to_string(),
    a_small_number: 0x04,
    a_byte: 0x27,
    some_important_numbers: vec![
        0x1234,
        0x6789,
        0xabcd,
    ],
};

Annotated Hex Dump of DBOR

84                    # Seq(4)
  ac                    # Bytes(12)
    48 65 6c 6c 6f 20...
    77 6f 72 6c 64 21     # "Hello world!"
  04                    # uint(4)
  18                    # u8
    27                    # 0x27
  83                    # Seq(3)
    19                    # u16
      34 12                 # 0x1234
    19                    # u16
      89 67                 # 0x6789
    19                    # u16
      cd ab                 # 0xabcd
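
The dump can be reproduced by hand from the rules in the Spec section. The sketch below assumes, as the dump suggests, that a struct is written as a seq of its field values and a String as a bytes item. The helpers push_header and encode_example are hypothetical and do not call into serde_dbor.

```rust
/// Writes an instruction byte for `type_id`, carrying `n` in the shortest
/// parameter form (immediate, u8, u16, u32, or u64, little-endian).
fn push_header(out: &mut Vec<u8>, type_id: u8, n: u64) {
    let tag = type_id << 5;
    if n <= 23 {
        out.push(tag | n as u8);
    } else if n <= 0xff {
        out.push(tag | 24);
        out.push(n as u8);
    } else if n <= 0xffff {
        out.push(tag | 25);
        out.extend_from_slice(&(n as u16).to_le_bytes());
    } else if n <= 0xffff_ffff {
        out.push(tag | 26);
        out.extend_from_slice(&(n as u32).to_le_bytes());
    } else {
        out.push(tag | 27);
        out.extend_from_slice(&n.to_le_bytes());
    }
}

/// Hand-encodes the `Data` value from the Rust Code section.
fn encode_example() -> Vec<u8> {
    let mut out = Vec::new();
    push_header(&mut out, 4, 4); // Seq(4): the struct's four fields

    let text = "Hello world!";
    push_header(&mut out, 5, text.len() as u64); // Bytes(12)
    out.extend_from_slice(text.as_bytes());

    push_header(&mut out, 0, 0x04); // fits in the instruction byte
    push_header(&mut out, 0, 0x27); // needs a u8 parameter

    let numbers = [0x1234u16, 0x6789, 0xabcd];
    push_header(&mut out, 4, numbers.len() as u64); // Seq(3)
    for &n in numbers.iter() {
        push_header(&mut out, 0, n as u64); // each needs a u16 parameter
    }
    out
}
```

The result is 27 bytes long and matches the annotated dump byte for byte.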

Dependencies

~115–360KB