9 releases
Version | Released |
---|---|
0.3.2 (new) | Oct 26, 2024 |
0.3.1 | Oct 22, 2024 |
0.3.0 | Sep 29, 2024 |
0.2.4 | Aug 23, 2024 |
0.1.0 | Aug 19, 2024 |
#380 in Encoding
411 downloads per month
69KB
1K SLoC
marshal-rs
marshal-rs is a Rust implementation of Ruby-lang's Marshal.
This project is essentially just @savannstm/marshal, rewritten in Rust. It is capable of 🔥 BLAZINGLY FAST loading of data from dumped Ruby Marshal files, as well as 🔥 BLAZINGLY FAST dumping of that data back to Marshal format.
Installation
cargo add marshal-rs
Quick overview
This crate has two main functions: load() and dump().
load() takes a &[u8] of Marshal data bytes (which can be read using std::fs::read()) as its first argument and outputs a serde_json::Value (or a sonic_rs::Value, if the sonic feature is enabled). Its remaining arguments, string_mode and instance_var_prefix, are optional and are described in the sections below.
dump(), in turn, takes a Value and an optional instance_var_prefix, and serializes the value back to a Vec<u8> Marshal byte stream. It does not preserve the strings' original encoding and writes all strings as UTF-8.
Note
marshal-rs does NOT write object links. This means that the output file may be larger than the original; otherwise, it has no effect on the output file. I really do need help with writing object links. If you're a Ruby/Rust sénior and a megamind in terms of the Marshal format, consider submitting a pull request to this repository or whatever.
It serializes Ruby data to JSON according to the following table:
Ruby object | Serialized to JSON |
---|---|
nil | null |
1337 (Integer) | 1337 |
36893488147419103232 (Big Integer) | { "__type": "bigint", "value": "36893488147419103232" } (Plain object) |
13.37 (Float) | 13.37 |
"ligma" (String) | "ligma" |
:ligma (Symbol) | "__symbol__ligma" |
/lgma/i (Regex) | { "__type": "regexp", "expression": "lgma", "flags": "i" } (Plain object) |
[] (Array) | [] |
{} (Hash) | {} (Plain object) |
Object.new (Including structs, modules etc.) | { "__class": "__symbol__Object", "__type": "object" } (Plain object) |
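The big integer row carries its value as a string, presumably because the number does not fit in a 64-bit integer. A minimal std-only sketch of that constraint follows; both helper names are hypothetical and are not part of marshal-rs, and u128 is used only for illustration, since real Ruby big integers can be larger still.

```rust
/// Hypothetical helper (not part of marshal-rs): reports whether a decimal
/// integer literal fits in a signed 64-bit integer.
fn fits_in_i64(decimal: &str) -> bool {
    decimal.parse::<i64>().is_ok()
}

/// Hypothetical helper: parses the "value" field of a bigint object.
/// u128 is an illustrative choice; arbitrary-precision values need more.
fn parse_bigint_value(decimal: &str) -> Option<u128> {
    decimal.parse::<u128>().ok()
}
```

With these helpers, fits_in_i64("1337") is true, while fits_in_i64("36893488147419103232") is false, because the table's example value equals 2^65.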
Strings
By default, Ruby strings that carry an encoding instance variable are serialized to JSON strings, and those that don't are serialized to { "__type": "bytes", "data": [...] } objects.
This behavior can be controlled with the string_mode argument of the load() function.
StringMode::UTF8 tries to convert byte arrays without an encoding instance variable to strings: it produces a string if the bytes are valid UTF-8, and an object otherwise.
StringMode::Binary converts all strings to objects.
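The UTF-8 mode decision described above can be sketched with the standard library alone. This is only an illustration of the rule, not the crate's code: the Decoded enum and decode_utf8_mode function are hypothetical names introduced here.

```rust
/// Illustrative stand-in for the decoded result; not a marshal-rs type.
#[derive(Debug, PartialEq)]
enum Decoded {
    /// Would become a plain JSON string.
    Text(String),
    /// Would become a { "__type": "bytes", "data": [...] } object.
    Bytes(Vec<u8>),
}

/// Sketch of the UTF-8 mode rule for a string without an encoding
/// instance variable: valid UTF-8 becomes text, anything else stays bytes.
fn decode_utf8_mode(raw: &[u8]) -> Decoded {
    match std::str::from_utf8(raw) {
        Ok(text) => Decoded::Text(text.to_owned()),
        Err(_) => Decoded::Bytes(raw.to_vec()),
    }
}
```

For instance, decode_utf8_mode(b"ligma") yields the Text variant, while the byte pair 0xff 0xfe is not valid UTF-8 and yields the Bytes variant.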
Objects and Symbols
For values that cannot be represented directly in JSON (such as Objects and Symbols), marshal-rs stringifies them and adds prefixes and properties. It stringifies symbols and prefixes them with __symbol__, and serializes objects' classes and types under the __class and __type keys respectively.
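The symbol convention is a plain string prefix, so it can be reproduced on the consumer side without the crate. A minimal sketch, assuming only the __symbol__ prefix documented above; the function names are hypothetical:

```rust
/// Prefix documented in this README for stringified symbols.
const SYMBOL_PREFIX: &str = "__symbol__";

/// Hypothetical helper: turns a Ruby symbol name into its JSON string form.
fn encode_symbol(name: &str) -> String {
    format!("{}{}", SYMBOL_PREFIX, name)
}

/// Hypothetical helper: returns the symbol name if the JSON string is a
/// stringified symbol, or None for an ordinary string.
fn decode_symbol(json_string: &str) -> Option<&str> {
    json_string.strip_prefix(SYMBOL_PREFIX)
}
```

Here encode_symbol("ligma") gives "__symbol__ligma", and decode_symbol reverses it, returning None for strings that lack the prefix.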
Hash keys
Hash keys in Ruby may be an Integer, Float, Object, etc. For such keys, marshal-rs tries to preserve the key type by prefixing the stringified key with its type. For example, the Ruby Hash {1 => nil} will be converted to the {"__integer__1": null} object.
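A rough sketch of how such typed keys could be round-tripped is shown here. Only the __integer__ prefix comes from this README; the HashKey enum, the function names, and the choice to leave string keys unprefixed are assumptions made for illustration.

```rust
/// Illustrative key type; marshal-rs supports more kinds than these two.
#[derive(Debug, PartialEq)]
enum HashKey {
    Integer(i64),
    Text(String),
}

/// Prefix shown in this README for Integer keys.
const INTEGER_PREFIX: &str = "__integer__";

/// Hypothetical helper: builds the JSON object key for a Ruby Hash key.
/// Leaving string keys unprefixed is an assumption of this sketch.
fn key_to_json(key: &HashKey) -> String {
    match key {
        HashKey::Integer(n) => format!("{}{}", INTEGER_PREFIX, n),
        HashKey::Text(s) => s.clone(),
    }
}

/// Hypothetical helper: recovers the typed key from a JSON object key.
fn key_from_json(key: &str) -> HashKey {
    let parsed = key
        .strip_prefix(INTEGER_PREFIX)
        .and_then(|rest| rest.parse::<i64>().ok());
    match parsed {
        Some(n) => HashKey::Integer(n),
        None => HashKey::Text(key.to_owned()),
    }
}
```

With this sketch, the Integer key 1 maps to "__integer__1" and back, while a key such as "name" passes through unchanged.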
Instance variables
Instance variables are always decoded as strings with the __symbol__ prefix.
You can manage the prefix of instance variables using the instance_var_prefix argument of load() and dump(). The passed string replaces the "@" prefix of instance variables.
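One possible reading of this rule is sketched below. It is an interpretation under stated assumptions, not the crate's implementation: decode_ivar_name is a hypothetical function, and the exact output format of marshal-rs may differ.

```rust
/// Hypothetical helper: renders a Ruby instance variable name the way this
/// README describes, i.e. as a "__symbol__"-prefixed string, with the
/// leading "@" swapped for a custom prefix when one is supplied.
fn decode_ivar_name(ruby_name: &str, instance_var_prefix: Option<&str>) -> String {
    let renamed = match (ruby_name.strip_prefix('@'), instance_var_prefix) {
        // A custom prefix was given and the name starts with "@": replace it.
        (Some(rest), Some(prefix)) => format!("{}{}", prefix, rest),
        // Otherwise keep the name exactly as Ruby wrote it.
        _ => ruby_name.to_owned(),
    };
    format!("__symbol__{}", renamed)
}
```

Under these assumptions, "@name" with no prefix stays "__symbol__@name", and with the prefix "ivar_" it becomes "__symbol__ivar_name".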
Unsafe code
This crate uses UnsafeCell along with unsafe blocks multiple times in the load() function. However, in the current implementation, this unsafe code will NOT ever cause any data races or instabilities.
Quick example
use std::fs::read;
use marshal_rs::{load, dump};

fn main() {
    // Read marshal data from file
    // let marshal_data: Vec<u8> = read("./Map001.rvdata2").unwrap();

    // For this example, we'll just take pre-defined marshal data
    let marshal_data: Vec<u8> = [0x04, 0x08, 0x30].to_vec();

    // Serializing to json
    // load() takes a &[u8] as argument, so bytes Vec must be borrowed
    let serialized_to_json: serde_json::Value = load(&marshal_data, None, None).unwrap();
    // Here you may std::fs::write() serialized JSON to file

    // Serializing back to marshal
    // dump() requires owned Value as argument
    let serialized_to_marshal: Vec<u8> = dump(serialized_to_json, None);
    // Here you may std::fs::write() serialized Marshal data to file
}
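The three pre-defined bytes in the example are a complete Marshal stream: 0x04 and 0x08 are the format's major and minor version bytes, and 0x30 is the ASCII character '0', the type tag Ruby uses for nil. A small std-only sketch of checking that header follows; split_header is a hypothetical helper, not a marshal-rs function, and it is stricter than Ruby in that it accepts only version 4.8.

```rust
/// Version bytes that open a Ruby Marshal stream (format version 4.8).
const MARSHAL_MAJOR: u8 = 4;
const MARSHAL_MINOR: u8 = 8;

/// Hypothetical helper: returns the payload that follows the version
/// header, or None if the data does not start with the 4.8 header.
fn split_header(data: &[u8]) -> Option<&[u8]> {
    match data {
        [MARSHAL_MAJOR, MARSHAL_MINOR, payload @ ..] => Some(payload),
        _ => None,
    }
}
```

Applied to the example bytes, split_header returns the single-byte payload 0x30, which is the nil tag.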
MSRV
Minimum supported Rust version is 1.63.0.
License
Project is licensed under WTFPL.
Dependencies
~5–11MB
~172K SLoC