0.1.0 | Sep 25, 2017
zlo
An encoder/decoder pair that uses a bit-compact binary encoding scheme. The size of the encoded object will be about the same as or smaller than the size the object takes up in memory in a running Rust program. It is a fork of bincode, so it resembles bincode's API and also includes the familiar SizeLimit objects.
It was made for networking in a multiplayer game and is suited to encoding diffs of data that are sent very frequently but in small pieces over the network. In that setting bincode produces comparatively large chunks of data, and compressing them with common algorithms such as LZO tends to yield only tiny improvements over zlo-encoded data, at the expense of long added compression times.
Stability of the binary format is NOT guaranteed across major versions.
Example
#[macro_use]
extern crate serde_derive;
extern crate zlo;

use zlo::{serialize, deserialize, Infinite};

#[derive(Serialize, Deserialize, PartialEq, Debug)]
struct Entity {
    x: f32,
    y: f32,
}

#[derive(Serialize, Deserialize, PartialEq, Debug)]
struct World(Vec<Entity>);

fn main() {
    let world = World(vec![Entity { x: 0.0, y: 4.0 }, Entity { x: 10.0, y: 20.5 }]);

    // Serialize without a size limit.
    let encoded: Vec<u8> = serialize(&world, Infinite).unwrap();

    // The output is smaller than 8 + 4 * 4 = 24 bytes
    // (an 8-byte length plus four 4-byte floats).
    assert!(encoded.len() < 8 + 4 * 4);

    // Decoding yields the original value.
    let decoded: World = deserialize(&encoded[..]).unwrap();
    assert_eq!(world, decoded);
}
Performance, output size
The zlo deserializer is NOT zero-copy and, in the majority of cases, cannot be. This is because data rarely appears in this encoding without being bit-mangled.
The fastest primitives to process are unsigned ints and bools. Performance also depends on the amount of data written. For example, encoding a small value held in a 64-bit int is only barely slower than encoding the same value in the smallest integer type it fits.
zlo is primarily oriented toward serializing diffs of data, that is, structs with lots of Options, small ints and diffed-zigzagged floats in them. Under these conditions zlo can yield very compact data.
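To make "diffs of data" concrete, here is a minimal sketch of a diff struct built from Options. The Entity and EntityDiff types and the diff/apply helpers are hypothetical names invented for illustration; they are not part of zlo. In real use the structs would additionally derive Serialize and Deserialize, as in the example above.

```rust
/// Full state of a hypothetical game entity (illustrative only, not part of zlo).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Entity {
    pub x: i32,
    pub y: i32,
    pub health: u8,
}

/// Difference between two `Entity` states. `None` means "unchanged",
/// so a mostly idle entity produces a diff made mostly of `None`s.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct EntityDiff {
    pub dx: Option<i32>,
    pub dy: Option<i32>,
    pub health: Option<u8>,
}

/// Build a diff that stores small deltas for positions and the new
/// value for health, leaving unchanged fields as `None`.
pub fn diff(old: &Entity, new: &Entity) -> EntityDiff {
    EntityDiff {
        dx: if new.x != old.x { Some(new.x.wrapping_sub(old.x)) } else { None },
        dy: if new.y != old.y { Some(new.y.wrapping_sub(old.y)) } else { None },
        health: if new.health != old.health { Some(new.health) } else { None },
    }
}

/// Reconstruct the new state from the old state and a diff.
pub fn apply(old: &Entity, d: &EntityDiff) -> Entity {
    Entity {
        x: old.x.wrapping_add(d.dx.unwrap_or(0)),
        y: old.y.wrapping_add(d.dy.unwrap_or(0)),
        health: d.health.unwrap_or(old.health),
    }
}
```

For an entity that only moved three units along x, diff returns a value with dx set to Some(3) and both other fields None, which is the kind of payload described above.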
Vs Bincode
When encoding large 64-bit ints, zlo can be up to 7 times slower than bincode. On average, zlo can be expected to be 3 to 5 times slower than bincode when encoding lots of integers or floats. When encoding plain bytes, zlo has comparable performance; however, because zlo is not zero-copy, it may incur extra allocation overhead compared to bincode during deserialization.
On average, the output can be expected to be up to 1.5 times smaller than bincode's when encoding numbers, and up to 8 times smaller when encoding lots of bools or Option::None values.
Need to squeeze data into even fewer bits?
- Consider diffing floats, but by their parts (exponent and fraction) rather than as whole values.
- Or perhaps your value stays within a specific range known ahead of time? Then consider serializing it as an integer, signed or unsigned.
- Is it multidimensional? Maybe you could infer dependent elements from just a single record.
- Serializing a multidimensional unit value such as a unit vector or quaternion? Serialize it in a more compact form; for example, a 2D unit vector can be reduced to just an angle.
See examples.
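As a rough illustration of the suggestions above (separate from the crate's own examples), the following sketch shows three pre-processing helpers. All function names, the 16-bit angle resolution and the 8-bit range resolution are assumptions chosen for this sketch; none of them come from zlo.

```rust
use std::f32::consts::PI;

/// Reduce a 2D unit vector to a 16-bit angle (65536 steps per full turn).
pub fn unit_vec_to_angle(x: f32, y: f32) -> u16 {
    // atan2 yields an angle in [-PI, PI]; convert it to turns in [0, 1).
    let turns = y.atan2(x) / (2.0 * PI);
    let turns = turns - turns.floor();
    // Round to the nearest step; a result of 65536 wraps back to 0.
    ((turns * 65536.0).round() as u32 % 65536) as u16
}

/// Rebuild an approximate unit vector from the 16-bit angle.
pub fn angle_to_unit_vec(angle: u16) -> (f32, f32) {
    let rad = angle as f32 / 65536.0 * 2.0 * PI;
    (rad.cos(), rad.sin())
}

/// Map a value known to stay within [min, max] (with max > min) onto a u8.
pub fn quantize_range(v: f32, min: f32, max: f32) -> u8 {
    let t = ((v - min) / (max - min)).clamp(0.0, 1.0);
    (t * 255.0).round() as u8
}

/// Split an f32 into (sign, biased exponent, fraction) so that each
/// part can be diffed separately against the previous value's parts.
pub fn f32_parts(v: f32) -> (u32, u32, u32) {
    let bits = v.to_bits();
    (bits >> 31, (bits >> 23) & 0xFF, bits & 0x7F_FFFF)
}
```

Values that change slowly often keep the same exponent between updates (10.0 and 10.5 both have biased exponent 130), so an exponent diff is frequently zero. The quantizing helpers are lossy, so they only fit data where the chosen resolution is acceptable.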
Lossy fraction coding
There are no plans for it. Even if lossy fractions are considered in the future, they can only be expected after the feature tracked in Rust issue #44580 is stabilized.
Details
Booleans are encoded as single bits; integers are encoded in a form similar but not identical to LEB128; floats are encoded in a somewhat complicated way described below; tuples and structs are encoded by encoding their fields one by one; and enums are encoded by first writing out the tag representing the variant and then the contents.
Unlike bincode, zlo has no configurable byte order, because the number of bytes written is variable.
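To show why one bit per boolean matters for size, here is a small standalone sketch of bit packing. It is not zlo's implementation: pack_bools and unpack_bools are hypothetical helpers, and filling each byte from its least significant bit upward is an assumption made for this sketch.

```rust
/// Pack booleans one bit each, filling every byte from its least
/// significant bit upward (an assumed order, for illustration only).
pub fn pack_bools(flags: &[bool]) -> Vec<u8> {
    let mut out = vec![0u8; (flags.len() + 7) / 8];
    for (i, &flag) in flags.iter().enumerate() {
        if flag {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

/// Read `count` booleans back from packed bytes.
/// Panics if `bytes` holds fewer than `count` bits.
pub fn unpack_bools(bytes: &[u8], count: usize) -> Vec<bool> {
    (0..count)
        .map(|i| ((bytes[i / 8] >> (i % 8)) & 1) == 1)
        .collect()
}
```

Sixteen flags fit into two bytes this way, whereas an encoding that spends one byte per bool needs sixteen.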
Implementation details to be aware of:
- Individual bits are written from LSB to MSB.
- Unsigned integers are encoded using the following principle:
PC | Action |
---|---|
0 | if integer > 0, then write bit 1, else write bit 0 and return |
1 | write next 8 bits of the integer |
2 | if this is the most significant byte of the whole type, then return |
3 | if there are no more bits to be written, then write bit 0 and return |
4 | write bit 1 |
5 | goto 1 |
- Signed integers are zigzag-encoded (same as protobuf) and then encoded the same way as unsigned integers.
- Floats are encoded the following way: the sign bit is always written, and the exponent is written either as 2 bits or in full. The fraction is written either as the higher 16 bits only in the case of f32, the higher 32 bits in the case of f64, or in full if it doesn't fit.
- isize/usize are encoded as variable i64/u64, for portability.
- Enum variants are encoded as a variable u32 instead of a usize. u32 is enough for all practical uses.
- str is encoded as (variable u64, &[u8]), where the u64 is the number of bytes contained in the encoded string.
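The unsigned-integer steps and the zigzag mapping listed above can be sketched as follows. This is a reading of the step table, not zlo's actual code. It assumes that "next 8 bits" proceeds from the least significant byte upward and that bits within a byte go out LSB first, and it stores one bool per bit for clarity. The names Bits, write_unsigned, read_unsigned, zigzag and unzigzag are hypothetical.

```rust
/// One bool per bit keeps the sketch simple (a real encoder packs bits).
pub type Bits = Vec<bool>;

/// Write `value` following the PC 0..5 steps; `type_bytes` is the width
/// of the integer type in bytes (1 for u8, 8 for u64).
pub fn write_unsigned(out: &mut Bits, mut value: u64, type_bytes: u32) {
    // PC 0: a single 0 bit encodes zero.
    if value == 0 {
        out.push(false);
        return;
    }
    out.push(true);
    for byte_index in 0..type_bytes {
        // PC 1: write the next 8 bits (assumed order: low byte first, LSB first).
        for bit in 0..8 {
            out.push((value >> bit) & 1 == 1);
        }
        value >>= 8;
        // PC 2: the most significant byte of the type needs no terminator.
        if byte_index + 1 == type_bytes {
            return;
        }
        // PC 3: nothing left to write, so terminate with a 0 bit.
        if value == 0 {
            out.push(false);
            return;
        }
        // PC 4 and 5: a 1 bit announces another byte, then loop.
        out.push(true);
    }
}

/// Take one bit from `bits` at `*pos`, or `None` at end of input.
fn take(bits: &[bool], pos: &mut usize) -> Option<bool> {
    let b = *bits.get(*pos)?;
    *pos += 1;
    Some(b)
}

/// Inverse of `write_unsigned`; returns `None` if the input ends early.
pub fn read_unsigned(bits: &[bool], pos: &mut usize, type_bytes: u32) -> Option<u64> {
    if !take(bits, pos)? {
        return Some(0);
    }
    let mut value = 0u64;
    for byte_index in 0..type_bytes {
        for bit in 0..8 {
            if take(bits, pos)? {
                value |= 1u64 << (byte_index * 8 + bit);
            }
        }
        if byte_index + 1 == type_bytes || !take(bits, pos)? {
            break;
        }
    }
    Some(value)
}

/// Zigzag mapping as used by protobuf: 0, -1, 1, -2 become 0, 1, 2, 3.
pub fn zigzag(v: i64) -> u64 {
    ((v << 1) ^ (v >> 63)) as u64
}

/// Inverse of `zigzag`.
pub fn unzigzag(u: u64) -> i64 {
    ((u >> 1) as i64) ^ -((u & 1) as i64)
}
```

Under these assumptions, zero costs 1 bit, a value below 256 held in a u64 costs 10 bits, and u64::MAX costs 72 bits (64 data bits plus 8 flag bits), which is consistent with the claim that output is about the same size as or smaller than the in-memory representation.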
License
zlo is licensed under the MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)