1 unstable release

0.6.0 Aug 20, 2021

#1578 in Parser implementations


23,285 downloads per month

MIT/Apache

63KB
1.5K SLoC

PHP serialization support for serde

Allows serialization and deserialization to/from PHP's serialize()/unserialize() format. See the documentation for details: docs.rs/php_serde.


lib.rs:

PHP serialization format support for serde

PHP uses a custom serialization format through its serialize() and unserialize() functions. This crate adds partial support for this format using serde.

An overview of the format can be seen at https://stackoverflow.com/questions/14297926/structure-of-a-serialized-php-string, details are available at http://www.phpinternalsbook.com/php5/classes_objects/serialization.html.

What is supported?

  • Basic and compound types:

    PHP type                 Rust type
    boolean                  bool
    integer                  i64 (automatic conversion to other types supported)
    float                    f64 (automatic conversion to f32 supported)
    string                   Vec<u8> (PHP strings are not UTF8)
    null                     decoded as None
    array (non-associative)  tuple structs or Vec<_>
    array (associative)      regular structs or HashMap<_, _>
  • Rust Strings are transparently UTF8-converted to PHP bytestrings.
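To make the mapping above more concrete, here is a minimal sketch of how these values look on the wire. It is a hand-written illustration using only the standard library, not the crate's implementation: the names PhpValue and encode are hypothetical, and floats and objects are left out for brevity.

```rust
/// Hypothetical model of the PHP values listed above (floats and objects omitted).
#[derive(Debug, Clone, PartialEq)]
enum PhpValue {
    Null,
    Bool(bool),
    Int(i64),
    /// PHP strings are byte strings, so the payload is raw bytes.
    Str(Vec<u8>),
    /// Arrays are ordered lists of explicit key/value pairs.
    Array(Vec<(PhpValue, PhpValue)>),
}

/// Append the serialized form of `value` to `out`.
fn encode(value: &PhpValue, out: &mut Vec<u8>) {
    match value {
        PhpValue::Null => out.extend_from_slice(b"N;"),
        PhpValue::Bool(b) => out.extend_from_slice(if *b { b"b:1;" } else { b"b:0;" }),
        PhpValue::Int(i) => out.extend_from_slice(format!("i:{};", i).as_bytes()),
        PhpValue::Str(bytes) => {
            // The length prefix counts bytes, not characters.
            out.extend_from_slice(format!("s:{}:\"", bytes.len()).as_bytes());
            out.extend_from_slice(bytes);
            out.extend_from_slice(b"\";");
        }
        PhpValue::Array(entries) => {
            // The count prefix is the number of key/value pairs.
            out.extend_from_slice(format!("a:{}:{{", entries.len()).as_bytes());
            for (key, val) in entries {
                encode(key, out);
                encode(val, out);
            }
            out.push(b'}');
        }
    }
}
```

Encoding an array whose integer keys 0, 1 and 2 map to the string "user", an empty string and an empty array yields the same bytes as the first PHP example in the "Example use" section.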

Out-of-order arrays

PHP arrays can be created "out of order", as they store every array index as an explicit integer in the array. Thus the following code

$arr = array();
$arr[0] = "zero";
$arr[3] = "three";
$arr[2] = "two";
$arr[1] = "one";

results in an array that would be equivalent to ["zero", "one", "two", "three"], at least when iterated over.

Because deserialization does not buffer values, these arrays cannot be directly deserialized into a Vec. Instead they should be deserialized into a map, which can then be turned into a Vec if desired.

A second concern is "holes" in the array, e.g. if the entry with key 1 is missing. How to fill these is typically up to the user.

The helper function deserialize_unordered_array can be used with serde's deserialize_with attribute to automatically buffer and order the entries, and to plug holes by closing any gaps.
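The buffering and ordering described above can be sketched with the standard library alone. This is an illustration of the idea, not the crate's helper: the function name into_ordered_vec is hypothetical, and it assumes the entries have already been collected as (index, value) pairs, for example from a deserialized map.

```rust
use std::collections::BTreeMap;

/// Order entries by their integer key, then drop the keys.
/// Dropping the keys closes any gaps: the values simply become contiguous.
fn into_ordered_vec<T>(entries: Vec<(i64, T)>) -> Vec<T> {
    // A BTreeMap iterates in ascending key order; if a key occurs
    // more than once, the last value inserted for it wins.
    let by_index: BTreeMap<i64, T> = entries.into_iter().collect();
    by_index.into_values().collect()
}
```

Closing gaps is only one possible policy. If the original positions matter, keeping the map or filling missing indices with a default value may be the better choice.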

What is missing?

  • PHP objects
  • Non-string/numeric array keys, except when deserializing into a HashMap
  • Mixed arrays. Array keys are assumed to always have the same key type (Note: If this is required, consider extending this library with a variant type).
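As a rough idea of what such a variant type could look like, the sketch below defines a key enum and a small parser for a single serialized key. Both ArrayKey and parse_key are hypothetical names that are not part of this crate; the parser relies only on the key encodings i:<n>; and s:<len>:"<bytes>"; of the PHP format.

```rust
/// Hypothetical variant type covering both kinds of PHP array key.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum ArrayKey {
    Int(i64),
    Str(Vec<u8>),
}

/// Parse one serialized key from the front of `input`.
/// Returns the key and the remaining input, or `None` if the input is malformed.
fn parse_key(input: &[u8]) -> Option<(ArrayKey, &[u8])> {
    match input {
        // Integer key: i:<n>;
        [b'i', b':', rest @ ..] => {
            let end = rest.iter().position(|&b| b == b';')?;
            let n: i64 = std::str::from_utf8(&rest[..end]).ok()?.parse().ok()?;
            Some((ArrayKey::Int(n), &rest[end + 1..]))
        }
        // String key: s:<len>:"<bytes>";
        [b's', b':', rest @ ..] => {
            let colon = rest.iter().position(|&b| b == b':')?;
            let len: usize = std::str::from_utf8(&rest[..colon]).ok()?.parse().ok()?;
            let body = rest.get(colon + 1..)?;
            // The body must hold: opening quote, `len` bytes, closing quote, semicolon.
            let total = len.checked_add(3)?;
            if body.len() < total
                || body[0] != b'"'
                || body[len + 1] != b'"'
                || body[len + 2] != b';'
            {
                return None;
            }
            Some((ArrayKey::Str(body[1..1 + len].to_vec()), &body[total..]))
        }
        _ => None,
    }
}
```

Because ArrayKey derives Eq, Hash and Ord, it could serve as the key type of a HashMap or BTreeMap holding a mixed array.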

Example use

Given an example data structure storing a session token using the following PHP code

<?php
$serialized = serialize(array("user", "", array()));
echo($serialized);

and thus the following output

a:3:{i:0;s:4:"user";i:1;s:0:"";i:2;a:0:{}}

the data can be reconstructed using the following Rust code:

use serde::Deserialize;
use php_serde::from_bytes;

#[derive(Debug, Deserialize, Eq, PartialEq)]
struct Data(Vec<u8>, Vec<u8>, SubData);

#[derive(Debug, Deserialize, Eq, PartialEq)]
struct SubData();

let input = br#"a:3:{i:0;s:4:"user";i:1;s:0:"";i:2;a:0:{}}"#;
assert_eq!(
    from_bytes::<Data>(input).unwrap(),
    Data(b"user".to_vec(), b"".to_vec(), SubData())
);

Structs are supported as well, provided the PHP array uses string keys:

<?php
$serialized = serialize(
    array("foo" => true,
          "bar" => "xyz",
          "sub" => array("x" => 42))
);
echo($serialized);

In Rust:

use serde::Deserialize;
use php_serde::from_bytes;

#[derive(Debug, Deserialize, Eq, PartialEq)]
struct Outer {
    foo: bool,
    bar: String,
    sub: Inner,
}

#[derive(Debug, Deserialize, Eq, PartialEq)]
struct Inner {
    x: i64,
}

let input = br#"a:3:{s:3:"foo";b:1;s:3:"bar";s:3:"xyz";s:3:"sub";a:1:{s:1:"x";i:42;}}"#;
let expected = Outer {
    foo: true,
    bar: "xyz".to_owned(),
    sub: Inner { x: 42 },
};

let deserialized: Outer = from_bytes(input).expect("deserialization failed");

assert_eq!(deserialized, expected);

Optional values

Fields that may be missing from the input can be declared as Option, as in this example:

<?php
$location_a = array();
$location_b = array("province" => "Newfoundland and Labrador, CA");
$location_c = array("postalcode" => "90002",
                    "country" => "United States of America");
echo(serialize($location_a) . "\n");
# -> a:0:{}
echo(serialize($location_b) . "\n");
# -> a:1:{s:8:"province";s:29:"Newfoundland and Labrador, CA";}
echo(serialize($location_c) . "\n");
# -> a:2:{s:10:"postalcode";s:5:"90002";s:7:"country";
#         s:24:"United States of America";}

The following declaration of Location will be able to parse all three example inputs.

#[derive(Debug, Deserialize, Eq, PartialEq)]
struct Location {
    province: Option<String>,
    postalcode: Option<String>,
    country: Option<String>,
}

Full roundtrip example

use serde::{Deserialize, Serialize};
use php_serde::{to_vec, from_bytes};

#[derive(Debug, Deserialize, Eq, PartialEq, Serialize)]
struct UserProfile {
    id: u32,
    name: String,
    tags: Vec<String>,
}

let orig = UserProfile {
    id: 42,
    name: "Bob".to_owned(),
    tags: vec!["foo".to_owned(), "bar".to_owned()],
};

let serialized = to_vec(&orig).expect("serialization failed");
let expected = br#"a:3:{s:2:"id";i:42;s:4:"name";s:3:"Bob";s:4:"tags";a:2:{i:0;s:3:"foo";i:1;s:3:"bar";}}"#;
assert_eq!(serialized, &expected[..]);

let profile: UserProfile = from_bytes(&serialized).expect("deserialization failed");
assert_eq!(profile, orig);

Dependencies

~205–490KB
~11K SLoC