#binary-encoding #binary-format #encode-decode #pickle #world #vodozemac #libolm

matrix-pickle

A simple binary encoding format used in the Matrix world

4 releases

0.2.1 Sep 10, 2024
0.2.0 Mar 25, 2024
0.1.1 Oct 6, 2023
0.1.0 Nov 16, 2022

#620 in Encoding

6,300 downloads per month
Used in 15 crates (via vodozemac)

MIT license

20KB
282 lines

A simple binary encoding format used in the Matrix world.

The matrix-pickle binary encoding format is used in the libolm and vodozemac cryptographic libraries.

How to use

The simplest way to use matrix-pickle is through the derive macros:

use anyhow::Result;
use matrix_pickle::{Encode, Decode};

fn main() -> Result<()> {
    #[derive(Clone, Debug, Decode, Encode, PartialEq, Eq)]
    struct MyStruct {
        public_key: [u8; 32],
        data: Vec<u8>,
    }
    
    let data = MyStruct {
        public_key: [5u8; 32],
        data: vec![1, 2, 3],
    };
    
    let encoded = data.encode_to_vec()?;
    let decoded = MyStruct::decode_from_slice(&encoded)?;
    
    assert_eq!(data, decoded);

    Ok(())
}

Format definition

matrix-pickle encodes most values without any metadata. In most cases, the bytes that make up a struct are encoded verbatim.

The table below defines how common types are encoded.

| Type | Example value | Encoded value | Comment |
|---|---|---|---|
| u8 | 255 | [FF] | Encoded verbatim |
| bool | true | [01] | Converted to a u8 before encoding |
| [u8; N] | [1u8, 2u8] | [01, 02] | Encoded verbatim |
| u32 | 16 | [00, 00, 00, 10] | Encoded as a byte array in big-endian order |
| usize | 32 | [00, 00, 00, 20] | Converted to a u32 before encoding |
| &[T] | &[3u8, 4u8] | [00, 00, 00, 02, 03, 04] | The length is encoded first, then each element |
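
To make the table concrete, the following sketch reproduces the listed encodings using only the standard library. The function names (encode_bool, encode_u32, encode_usize, encode_slice) are hypothetical and are not part of the matrix-pickle API. Returning None for a usize that does not fit into a u32 is an assumption of this sketch, not a statement about how the crate reports that case.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: a bool is converted to a u8 and written as one byte.
fn encode_bool(value: bool) -> Vec<u8> {
    vec![u8::from(value)]
}

/// Hypothetical helper: a u32 is written as four big-endian bytes.
fn encode_u32(value: u32) -> Vec<u8> {
    value.to_be_bytes().to_vec()
}

/// Hypothetical helper: a usize is converted to a u32 before encoding.
/// This sketch returns None if the value does not fit into 32 bits.
fn encode_usize(value: usize) -> Option<Vec<u8>> {
    u32::try_from(value).ok().map(encode_u32)
}

/// Hypothetical helper: a byte slice is written as its length (as a u32),
/// followed by each element verbatim.
fn encode_slice(values: &[u8]) -> Option<Vec<u8>> {
    let mut out = encode_usize(values.len())?;
    out.extend_from_slice(values);
    Some(out)
}

fn main() {
    // These mirror the rows of the table above.
    assert_eq!(encode_bool(true), vec![0x01u8]);
    assert_eq!(encode_u32(16), vec![0x00u8, 0x00, 0x00, 0x10]);
    assert_eq!(encode_usize(32), Some(vec![0x00u8, 0x00, 0x00, 0x20]));
    assert_eq!(
        encode_slice(&[3, 4]),
        Some(vec![0x00u8, 0x00, 0x00, 0x02, 0x03, 0x04])
    );
}
```

Note that a fixed-size array such as [u8; N] needs no helper here: its bytes are written as they are, with no length prefix.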

Derive support

The crate supports deriving Encode and Decode implementations for structs and enums as long as the types inside them implement Encode and Decode as well.

Structs

The derive support for structs encodes each field in the order in which it is defined. For example, a derived implementation for the struct below behaves like this hand-written one:

use std::io::Write;
use matrix_pickle::{Encode, EncodeError};

struct Foo {
    first: [u8; 32],
    second: Vec<u8>,
}

impl Encode for Foo {
    fn encode(&self, writer: &mut impl Write) -> Result<usize, EncodeError> {
        let mut ret = 0;

        // Encode the first struct field.
        ret += self.first.encode(writer)?;
        // Now encode the second struct field.
        ret += self.second.encode(writer)?;

        Ok(ret)
    }
}
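
Reading the value back follows the same field order. The sketch below decodes the byte layout of Foo by hand with std::io::Read, to show where the length prefix sits. The decode_foo function is hypothetical and only illustrates the layout; it is not how the crate's Decode trait is implemented.

```rust
use std::io::{self, Read};

#[derive(Debug, PartialEq)]
struct Foo {
    first: [u8; 32],
    second: Vec<u8>,
}

/// Hypothetical helper: reads the fields back in the order they were written.
fn decode_foo(reader: &mut impl Read) -> io::Result<Foo> {
    // The fixed-size array has no length prefix, so read exactly 32 bytes.
    let mut first = [0u8; 32];
    reader.read_exact(&mut first)?;

    // The Vec is prefixed with its length as a big-endian u32.
    let mut length = [0u8; 4];
    reader.read_exact(&mut length)?;
    let length = u32::from_be_bytes(length) as usize;

    // A production decoder would want to cap this length before allocating.
    let mut second = vec![0u8; length];
    reader.read_exact(&mut second)?;

    Ok(Foo { first, second })
}

fn main() -> io::Result<()> {
    // 32 bytes for `first`, a 4-byte length, then 3 bytes for `second`.
    let mut bytes = vec![7u8; 32];
    bytes.extend_from_slice(&[0, 0, 0, 3, 1, 2, 3]);

    let foo = decode_foo(&mut bytes.as_slice())?;
    assert_eq!(foo, Foo { first: [7u8; 32], second: vec![1, 2, 3] });

    Ok(())
}
```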

Enums

Enums, on the other hand, first encode the index of the variant as a u8, then the value the variant carries.

Only enums with variants that contain a single associated data value are supported.

use std::io::Write;
use matrix_pickle::{Encode, EncodeError};

enum Bar {
    First(u32),
    Second(u32),
}

impl Encode for Bar {
    fn encode(&self, writer: &mut impl Write) -> Result<usize, EncodeError> {
        let mut ret = 0;

        match self {
            Bar::First(value) => {
                // This is our first variant, encode a 0u8 first.
                ret += 0u8.encode(writer)?;
                // Now encode the associated value.
                ret += value.encode(writer)?;
            },
            Bar::Second(value) => {
                // This is our second variant, encode a 1u8 first.
                ret += 1u8.encode(writer)?;
                // Now encode the associated value.
                ret += value.encode(writer)?;
            },
        }

        Ok(ret)
    }
}
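
As a standard-library-only illustration of the resulting bytes, the sketch below writes and reads Bar by hand. The encode_bar and decode_bar functions are hypothetical names chosen for this example, and rejecting unknown variant indices with None is an assumption of the sketch.

```rust
#[derive(Debug, PartialEq)]
enum Bar {
    First(u32),
    Second(u32),
}

/// Hypothetical helper: one byte for the variant index, then the u32 value
/// as four big-endian bytes.
fn encode_bar(bar: &Bar) -> Vec<u8> {
    let (index, value) = match bar {
        Bar::First(value) => (0u8, value),
        Bar::Second(value) => (1u8, value),
    };

    let mut out = vec![index];
    out.extend_from_slice(&value.to_be_bytes());
    out
}

/// Hypothetical helper: the inverse of `encode_bar`. Returns None if the
/// input is too short or the variant index is unknown.
fn decode_bar(bytes: &[u8]) -> Option<Bar> {
    let (index, rest) = bytes.split_first()?;
    let value = match rest {
        [a, b, c, d, ..] => u32::from_be_bytes([*a, *b, *c, *d]),
        _ => return None,
    };

    match *index {
        0 => Some(Bar::First(value)),
        1 => Some(Bar::Second(value)),
        _ => None,
    }
}

fn main() {
    let encoded = encode_bar(&Bar::Second(16));
    // Variant index 1, followed by 16 as a big-endian u32.
    assert_eq!(encoded, vec![0x01u8, 0x00, 0x00, 0x00, 0x10]);
    assert_eq!(decode_bar(&encoded), Some(Bar::Second(16)));
}
```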

Encoding and decoding secrets

When decoding values that are meant to stay secret, make sure to box the array. A helper attribute, #[secret], serves as a reminder that such values should be boxed.

Simply annotate any struct field with the #[secret] attribute.

If a field marked as secret is not boxed, the compiler reports an error. For example, this snippet won't compile:

use matrix_pickle::{Encode, Decode};

#[derive(Encode, Decode)]
struct Key {
    #[secret]
    private: [u8; 32],
    public: [u8; 32],
}

This example, on the other hand, compiles:

use matrix_pickle::{Encode, Decode};

#[derive(Encode, Decode)]
struct Key {
    #[secret]
    private: Box<[u8; 32]>,
    public: [u8; 32],
}
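
The text above does not say why boxing matters. One common rationale, stated here as an assumption rather than as the crate's documented reasoning, is that a boxed array can be filled in place on the heap, so later moves of the owning value copy only a pointer instead of the key bytes. The sketch below shows that pattern with the standard library; decode_secret is a hypothetical helper, not matrix-pickle API.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads 32 secret bytes directly into a heap
/// allocation, so the secret is never written to a stack buffer here.
fn decode_secret(reader: &mut impl Read) -> io::Result<Box<[u8; 32]>> {
    // Allocate a zeroed buffer on the heap first; only zeros exist so far.
    let mut secret = Box::new([0u8; 32]);

    // Fill the heap buffer in place from the reader.
    reader.read_exact(secret.as_mut_slice())?;

    // Returning the Box moves a pointer, not the 32 bytes themselves.
    Ok(secret)
}

fn main() -> io::Result<()> {
    let input = [0x42u8; 32];
    let secret = decode_secret(&mut &input[..])?;
    assert_eq!(*secret, [0x42u8; 32]);

    Ok(())
}
```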

Comparison to bincode

The binary format is similar to what the bincode crate provides with the following config:

let config = bincode::config::standard()
    .with_big_endian()
    .with_fixed_int_encoding()
    .skip_fixed_array_length();

The major difference between the two formats is how slice lengths are encoded:

  • bincode uses a u64 to encode slice lengths
  • matrix-pickle uses a u32 to encode slice lengths
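
The effect on the encoded bytes can be sketched without either crate. The two functions below are hypothetical helpers that compute only the length prefix each format would place in front of a slice, assuming big-endian, fixed-width integers as in the bincode config shown above.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: a matrix-pickle style length prefix, a big-endian
/// u32. This sketch returns None if the length does not fit into 32 bits.
fn pickle_length_prefix(len: usize) -> Option<[u8; 4]> {
    u32::try_from(len).ok().map(|len| len.to_be_bytes())
}

/// Hypothetical helper: a bincode style length prefix under the config
/// above, a big-endian u64.
fn bincode_length_prefix(len: usize) -> [u8; 8] {
    (len as u64).to_be_bytes()
}

fn main() {
    let data = [3u8, 4u8];

    // Four bytes of length in one format, eight in the other.
    assert_eq!(pickle_length_prefix(data.len()), Some([0u8, 0, 0, 2]));
    assert_eq!(bincode_length_prefix(data.len()), [0u8, 0, 0, 0, 0, 0, 0, 2]);
}
```

For the same two-element slice, the element bytes are identical in both formats; only the width of the prefix differs.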

Other differences are:

  • No support for configuring the encoding format. If you need to tweak the format, use bincode.
  • No unsafe code. Optimized for simplicity, not for pure performance.

Dependencies

~0.3–1MB
~20K SLoC