59 releases
Version | Released |
---|---|
0.6.1 | Nov 14, 2024 |
0.5.0 | Nov 12, 2024 |
0.4.19 | May 4, 2024 |
0.4.12 | Mar 12, 2024 |
0.3.26 | Nov 30, 2023 |
Native model
Adds interoperability on top of serialization formats such as bincode and postcard.
See concepts for more details.
Goals
- Interoperability: Allows different applications to work together, even if they are using different versions of the data model.
- Data Consistency: Ensures that the data being processed matches the expected model.
- Flexibility: You can use any serialization format you want. More details here.
- Performance: A minimal overhead (encode: ~20 ns, decode: ~40 ps). More details here.
Usage
Application 1 (DotV1)                  Application 2 (DotV1 and DotV2)
             |                                  |
Encode DotV1 |--------------------------------> | Decode DotV1 to DotV2
             |                                  | Modify DotV2
Decode DotV1 | <--------------------------------| Encode DotV2 back to DotV1
             |                                  |
use native_model::native_model;
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 1)]
struct DotV1(u32, u32);

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 2, from = DotV1)]
struct DotV2 {
    name: String,
    x: u64,
    y: u64,
}

// Upgrade: DotV1 -> DotV2.
impl From<DotV1> for DotV2 {
    fn from(dot: DotV1) -> Self {
        DotV2 {
            name: "".to_string(),
            x: dot.0 as u64,
            y: dot.1 as u64,
        }
    }
}

// Downgrade: DotV2 -> DotV1. Note that `as u32` truncates values above u32::MAX.
impl From<DotV2> for DotV1 {
    fn from(dot: DotV2) -> Self {
        DotV1(dot.x as u32, dot.y as u32)
    }
}

fn main() {
    // Application 1
    let dot = DotV1(1, 2);
    let bytes = native_model::encode(&dot).unwrap();

    // Application 1 sends bytes to Application 2.

    // Application 2
    // We are able to decode the bytes directly into a new type DotV2 (upgrade).
    let (mut dot, source_version) = native_model::decode::<DotV2>(bytes).unwrap();
    assert_eq!(dot, DotV2 {
        name: "".to_string(),
        x: 1,
        y: 2
    });
    dot.name = "Dot".to_string();
    dot.x = 5;

    // For interoperability, we encode the data with the version compatible with Application 1 (downgrade).
    let bytes = native_model::encode_downgrade(dot, source_version).unwrap();

    // Application 2 sends bytes to Application 1.

    // Application 1
    let (dot, _) = native_model::decode::<DotV1>(bytes).unwrap();
    assert_eq!(dot, DotV1(5, 2));
}
- Full example here.
Serialization format
You can use default serialization formats via the feature flags, like:
[dependencies]
native_model = { version = "0.1", features = ["bincode_2_rc"] }
Each feature flag corresponds to a specific minor version of the serialization format. To avoid breaking changes, the default serialization format is the oldest one.
- bincode_1_3: bincode v1.3 (default)
- bincode_2_rc: bincode v2.0.0-rc3
- postcard_1_0: postcard v1.0
- rmp_serde_1_3: rmp-serde v1.3
Custom serialization format
Define a struct with the name you want. This struct must implement the native_model::Encode and native_model::Decode traits.
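As a rough illustration of what such a codec struct can look like, the sketch below uses only the standard library. The Encode and Decode traits here are local stand-ins written for this example; they are an assumption about the general shape of the crate's traits, so check the crate documentation for the real signatures before copying anything. Point, FixedLe and WrongLength are hypothetical names.

```rust
// Local stand-in traits for this sketch only (assumed shape, not the crate's definitions).
pub trait Encode<T> {
    type Error;
    fn encode(obj: &T) -> Result<Vec<u8>, Self::Error>;
}

pub trait Decode<T> {
    type Error;
    fn decode(data: Vec<u8>) -> Result<T, Self::Error>;
}

#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: u32,
    pub y: u32,
}

/// Hypothetical codec: two little-endian u32 values, 8 bytes in total.
pub struct FixedLe;

/// Hypothetical decode error carrying the length that was received.
#[derive(Debug, PartialEq)]
pub struct WrongLength(pub usize);

impl Encode<Point> for FixedLe {
    type Error = std::convert::Infallible;

    fn encode(obj: &Point) -> Result<Vec<u8>, Self::Error> {
        let mut out = Vec::with_capacity(8);
        out.extend_from_slice(&obj.x.to_le_bytes());
        out.extend_from_slice(&obj.y.to_le_bytes());
        Ok(out)
    }
}

impl Decode<Point> for FixedLe {
    type Error = WrongLength;

    fn decode(data: Vec<u8>) -> Result<Point, Self::Error> {
        // Reject anything that is not exactly the 8 bytes this format produces.
        if data.len() != 8 {
            return Err(WrongLength(data.len()));
        }
        let x = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
        let y = u32::from_le_bytes([data[4], data[5], data[6], data[7]]);
        Ok(Point { x, y })
    }
}
```

With the real crate, a struct like FixedLe would be the type passed to the with attribute of the native_model macro.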
Full examples:
For other examples, see the default implementations:
Notice
native_model provides implementations that rely on metadata-less formats and serde.
There are known issues with some advanced serde features such as:
- #[serde(flatten)]
- #[serde(skip)]
- #[serde(skip_deserializing)]
- #[serde(skip_serializing)]
- #[serde(skip_serializing_if = "path")]
- #[serde(tag = "...")]
- #[serde(untagged)]
Or types implementing similar strategies such as serde_json::Value.
The rmp-serde serialization format can optionally support them by serializing structs as maps; the RmpSerdeNamed struct is provided to support this use case.
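To see why metadata-less formats and field-skipping attributes do not mix, here is a toy positional format written with the standard library only. It is not bincode or postcard, and Record, encode and decode are hypothetical names; the skip_none flag merely mimics the effect of an attribute like skip_serializing_if. Because fields are identified purely by position, dropping one from the output leaves the decoder reading the wrong bytes.

```rust
#[derive(Debug, PartialEq)]
pub struct Record {
    pub a: u8,
    pub note: Option<u8>,
    pub b: u8,
}

/// Positional encoding: fields are written in declaration order, with a
/// one-byte presence marker in front of the optional field.
pub fn encode(r: &Record, skip_none: bool) -> Vec<u8> {
    let mut out = vec![r.a];
    match r.note {
        Some(n) => {
            out.push(1);
            out.push(n);
        }
        // Mimics a skipped field: nothing at all is written for it.
        None if skip_none => {}
        None => out.push(0),
    }
    out.push(r.b);
    out
}

/// The decoder always expects the presence marker, since nothing in the
/// bytes tells it that a field was left out.
pub fn decode(bytes: &[u8]) -> Option<Record> {
    let mut it = bytes.iter().copied();
    let a = it.next()?;
    let note = match it.next()? {
        0 => None,
        1 => Some(it.next()?),
        _ => return None,
    };
    let b = it.next()?;
    if it.next().is_some() {
        return None; // trailing bytes: the layout did not match
    }
    Some(Record { a, note, b })
}
```

A self-describing layout, such as structs serialized as maps, avoids this because each value travels with its field name.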
Data model
Define your model using the native_model macro.
Attributes:
- id = u32: The unique identifier of the model.
- version = u32: The version of the model.
- with = type: The serialization format that you use for the Encode/Decode implementation. Setup here.
- from = type: Optional, the previous version of the model.
  - type: The previous version of the model that you use for the From implementation.
- try_from = (type, error): Optional, the previous version of the model with error handling.
  - type: The previous version of the model that you use for the TryFrom implementation.
  - error: The error type that you use for the TryFrom implementation.
use native_model::native_model;
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 1)]
struct DotV1(u32, u32);

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 2, from = DotV1)]
struct DotV2 {
    name: String,
    x: u64,
    y: u64,
}

// Implement the conversion between versions: From<DotV1> for DotV2 and From<DotV2> for DotV1.
impl From<DotV1> for DotV2 {
    fn from(dot: DotV1) -> Self {
        DotV2 {
            name: "".to_string(),
            x: dot.0 as u64,
            y: dot.1 as u64,
        }
    }
}

impl From<DotV2> for DotV1 {
    fn from(dot: DotV2) -> Self {
        DotV1(dot.x as u32, dot.y as u32)
    }
}

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 3, try_from = (DotV2, anyhow::Error))]
struct DotV3 {
    name: String,
    cord: Cord,
}

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct Cord {
    x: u64,
    y: u64,
}

// Implement the conversion between versions: TryFrom<DotV2> for DotV3 and TryFrom<DotV3> for DotV2.
impl TryFrom<DotV2> for DotV3 {
    type Error = anyhow::Error;

    fn try_from(dot: DotV2) -> Result<Self, Self::Error> {
        Ok(DotV3 {
            name: dot.name,
            cord: Cord { x: dot.x, y: dot.y },
        })
    }
}

impl TryFrom<DotV3> for DotV2 {
    type Error = anyhow::Error;

    fn try_from(dot: DotV3) -> Result<Self, Self::Error> {
        Ok(DotV2 {
            name: dot.name,
            x: dot.cord.x,
            y: dot.cord.y,
        })
    }
}
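One case where the fallible route matters is the DotV2 to DotV1 downgrade: a cast with as u32 silently truncates any coordinate above u32::MAX. The sketch below shows a checked conversion using only the standard library. It leaves out the native_model macro and serde derives so that it compiles on its own, shows only the downgrade direction, and uses a hypothetical OutOfRange error type; in a real model the error type would be whatever is named in the try_from attribute.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub struct DotV1(pub u32, pub u32);

#[derive(Debug, PartialEq)]
pub struct DotV2 {
    pub name: String,
    pub x: u64,
    pub y: u64,
}

/// Hypothetical error type for this sketch: which field overflowed, and its value.
#[derive(Debug, PartialEq)]
pub struct OutOfRange {
    pub field: &'static str,
    pub value: u64,
}

// Checked downgrade: fails instead of truncating when a coordinate
// does not fit into the older model's u32 fields.
impl TryFrom<DotV2> for DotV1 {
    type Error = OutOfRange;

    fn try_from(dot: DotV2) -> Result<Self, Self::Error> {
        let x = u32::try_from(dot.x).map_err(|_| OutOfRange { field: "x", value: dot.x })?;
        let y = u32::try_from(dot.y).map_err(|_| OutOfRange { field: "y", value: dot.y })?;
        Ok(DotV1(x, y))
    }
}
```

The name field is dropped in either approach; whether that loss is acceptable is a decision for the application.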
Codecs
native_model comes with several optional built-in serializer features:
- bincode 1.3
  - This is the default codec.
  - Warning: This codec may not work with all serde-derived types.
- bincode 2.0.0-rc3
  - Enable the bincode_2_rc feature and use the native_model::bincode_2_rc::Bincode attribute to have native_db use this crate for serializing & deserializing.
  - Warning: This codec may not work with all serde-derived types.
- postcard 1.0
  - Enable the postcard_1_0 feature and use the native_model::postcard_1_0::PostCard attribute.
  - Warning: This codec may not work with all serde-derived types.
- rmp-serde 1.3
  - Enable the rmp_serde_1_3 feature and use the native_model::rmp_serde_1_3::RmpSerde attribute.
Codec example:
As an example, to use rmp-serde:
- In your project's Cargo.toml file, enable the rmp_serde_1_3 feature for the native_model dependency.
  - Be sure to check crates.io for the most recent native_model version number.
[dependencies]
serde = { version = "1.0", features = [ "derive" ] }
native_model = { version = "0.4", features = [ "rmp_serde_1_3" ] }
- Assign the rmp_serde_1_3 codec to your struct using the with attribute:
use native_model::native_model;

#[derive(Clone, Default, serde::Deserialize, serde::Serialize)]
#[native_model(id = 1, version = 1, with = native_model::rmp_serde_1_3::RmpSerde)]
struct MyStruct {
    my_string: String,
    // etc.
}
Additional reading
You may also want to check out David Koloski's Rust serialization benchmarks for help selecting the codec (i.e. bincode_1_3, rmp_serde_1_3, etc.) that's best for your project.
Status
Early development. Not ready for production.
Concepts
To understand how the native model works, you need to understand the following concepts.
- Identity (id): The unique identifier of the model. It is used to identify the model and to prevent decoding a model into the wrong Rust type.
- Version (version): The version of the model. It is used to check the compatibility between two models.
- Encode: The process of converting a model into a byte array.
- Decode: The process of converting a byte array into a model.
- Downgrade: The process of converting a model into a previous version of the model.
- Upgrade: The process of converting a model into a newer version of the model.
Under the hood, the native model is a thin wrapper around serialized data. The id and the version are each encoded as a little_endian::U32. Together they take 8 bytes, which are added at the beginning of the data.
+------------------+------------------+------------------------------------+
| ID (4 bytes) | Version (4 bytes)| Data (indeterminate-length bytes) |
+------------------+------------------+------------------------------------+
Full example here.
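The layout above can be reproduced by hand with the standard library, which may help when inspecting encoded bytes. This is a minimal sketch of the 8-byte header as described in this section, not the crate's implementation (which relies on zerocopy); wrap and unwrap_header are hypothetical helper names.

```rust
/// Prepends the id and the version, each as 4 little-endian bytes, to the payload.
pub fn wrap(id: u32, version: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&id.to_le_bytes());
    out.extend_from_slice(&version.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Splits a buffer into (id, version, payload).
/// Returns None if the buffer is shorter than the 8-byte header.
pub fn unwrap_header(bytes: &[u8]) -> Option<(u32, u32, &[u8])> {
    if bytes.len() < 8 {
        return None;
    }
    let id = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    // The payload is borrowed from the input rather than copied.
    Some((id, version, &bytes[8..]))
}
```

Reading the header touches only the first 8 bytes regardless of payload size, which is consistent with the constant overhead described in the Performance section.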
Performance
Native model has been designed to have a minimal and constant overhead. That means that the overhead is the same whatever the size of the data. Under the hood we use the zerocopy crate to avoid unnecessary copies.
👉 To know the total time of the encode/decode, you need to add the time of your serialization format.
Summary:
- Encode: ~20 ns
- Decode: ~40 ps
data size | encode time (ns) | decode time (ps) |
---|---|---|
1 B | 19.769 ns - 20.154 ns | 40.526 ps - 40.617 ps |
1 KiB | 19.597 ns - 19.971 ns | 40.534 ps - 40.633 ps |
1 MiB | 19.662 ns - 19.910 ns | 40.508 ps - 40.632 ps |
10 MiB | 19.591 ns - 19.980 ns | 40.504 ps - 40.605 ps |
100 MiB | 19.669 ns - 19.867 ns | 40.520 ps - 40.644 ps |
Benchmark of the native model overhead here.