3 unstable releases
new 0.2.0 | Dec 10, 2024 |
---|---|
0.1.1 | Dec 5, 2024 |
0.1.0 | Dec 4, 2024 |
#367 in Rust patterns
258 downloads per month
120KB
2K
SLoC
shape
: a decidable static shape system for JSON
This library implements a Rust-based type system powerful enough to represent any kind of JSON data, offering type-theoretic operations like simplification, satisfaction testing, child shape selection, union and intersection shapes, delayed shape binding, flexible error handling, and more.
[!CAUTION] This library is still in early-stage development, so you should not expect its API to be fully stable until the 1.0.0 release.
Installation
This crate provides a library, so installation means adding it as a dependency
to your Cargo.toml
file:
cargo add shape
Documentation
See the cargo doc
-generated documentation
for detailed information about the Shape
struct and ShapeCase
enum.
The Shape
struct
pub(crate) type Ref<T> = std::sync::Arc<T>;
#[derive(Clone, PartialEq, Eq)]
pub struct Shape {
case: Ref<ShapeCase>,
case_hash: u64,
}
impl Shape {
pub(crate) fn new_from_simplified(case: ShapeCase) -> Shape {
let case = Ref::new(case);
let case_hash = case.compute_hash();
Shape { case, case_hash }
}
}
To support flexible recombinations of shapes and their subshapes, the top-level
Shape
struct wraps a reference counted ShapeCase
enum variant. Reference
counting not only simplifies sharing subtrees among different Shape
structures, but also prevents rustc
from complaining about the Shape
struct
referring to itself without indirection.
Since this ShapeCase
value is immutable, we can precompute and cache its hash
in the case_hash
field, then manually implement the std::hash::Hash
trait to
ignore the case
field in favor of this cached case_hash
field:
impl std::hash::Hash for Shape {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.case_hash.hash(state);
}
}
This allows Shape
hashing to reuse previously computed subtree hashes, rather
than rehashing the entire tree each time a hash is requested.
Obtaining a Shape
Instead of tracking whether a given ShapeCase
has been simplified or not, we
can simply mandate that Shape
always wraps simplified shapes.
This invariant is enforced by restricting how Shape
instances can be
(publicly) created: all Shape
instances must come either from calling
ShapeCase::simplify
, or from calling the crate-internal static method
Shape::new_from_simplified
.
impl ShapeCase {
pub fn simplify(self) -> Shape {
// Since this static method is crate-private, the ShapeCase::simplify
// is the only public way to create a Shape.
Shape::new_from_simplified(self.simplify_internal())
}
}
lazy_static! {
static ref BOOL_SHAPE: Shape = ShapeCase::Bool(None).simplify();
// ... additional static Shape instances ...
}
impl Shape {
// For convenience, Shape provides a number of static helper methods like
// Shape::bool() and Shape::bool_value(true).
pub fn bool() -> Shape {
// A common BOOL_SHAPE can be returned whenever Shape::bool() is called.
// This does not guarantee every Shape representing the Bool type will
// share the memory of this BOOL_SHAPE, but it helps reduce unnecessary
// memory allocations.
BOOL_SHAPE.clone()
}
pub fn bool_value(value: bool) -> Shape {
ShapeCase::Bool(Some(value)).simplify()
}
// Additional helpers...
pub fn int() -> Shape;
pub fn int_value(value: i64) -> Shape;
pub fn float() -> Shape;
pub fn string() -> Shape;
pub fn string_value(value: &str) -> Shape;
pub fn null() -> Shape;
pub fn none() -> Shape;
pub fn empty_object() -> Shape;
pub fn empty_map() -> IndexMap<String, Shape>;
pub fn object(fields: IndexMap<String, Shape>, rest: Option<Shape>) -> Shape;
pub fn array(prefix: &[Shape], tail: Option<Shape>) -> Shape;
pub fn tuple(shapes: &[Shape]) -> Shape;
pub fn list(of: Shape) -> Shape;
pub fn empty_array() -> Shape;
pub fn one(shapes: &[Shape]) -> Shape;
pub fn all(shapes: &[Shape]) -> Shape;
pub fn error(message: &str) -> Shape;
pub fn error_with_range(message: &str, range: OffsetRange) -> Shape;
pub fn error_with_partial(message: &str, partial: Shape) -> Shape;
pub fn error_with_partial_and_range(message: &str, partial: Shape, range: OffsetRange) -> Shape;
}
Notice that ShapeCase::simplify
takes ownership of its input self
value. If
this is not the behavior you want, you can always clone your ShapeCase
value
before simplifying it, thereby simplifying only the clone. However, you cannot
use an unsimplified ShapeCase
as a child of another ShapeCase
value, since
child shapes are always wrapped with Shape
.
Testing supershape-subshape acceptance
To test whether the set of all values accepted by one Shape
is a subset of the
set of all values accepted by another Shape
, use the
supershape.accepts(&subshape) -> bool
method, or its inverse
subshape.satisfies(&supershape) -> bool
.
For example, a Shape::one
union shape accepts any member shape of the union,
let int_string_union = Shape::one(&[Shape::int(), Shape::string()]);
assert!(int_string_union.accepts(&Shape::int()));
assert!(int_string_union.accepts(&Shape::string()));
assert!(Shape::int().satisfies(&int_string_union));
assert!(Shape::string().satisfies(&int_string_union));
assert_eq!(Shape::int().satisfies(&int_string_union), false);
assert_eq!(Shape::string().satisfies(&int_string_union), false);
assert_eq!(int_string_union.accepts(&Shape::float()), false);
Error satisfaction
A ShapeCase::Error
variant generally represents a failure of shape processing,
but it can also optionally report Some(partial)
shape information in cases
when there is a likely best guess at what the shape should be.
For this reason, a ShapeCase::Error
shape either satisfies/accepts itself
trivially (according to ==
equality), or it can define a partial
shape to
satisfy shapes that accept that partial
shape.
This partial: Option<Shape>
field allows errors to provide guidance
(potentially with chains of multiple errors) without interfering with the
accepts/satisfies logic.
let error = Shape::error_with_partial("Expected an integer", Shape::int());
assert!(error.accepts(&Shape::int()));
assert_eq!(error.accepts(&Shape::float()), false);
assert!(Shape::int_value(42).satisfies(&error));
assert_eq!(Shape::float().satisfies(&error), false);
The null
singleton and the None
shape
ShapeCase::Null
represents the singleton null
value found in JSON. It
satisfies and accepts only itself and no other shapes, except unions that allow
null
as a member, or errors that wrap null
as a partial shape.
ShapeCase::None
represents the absence of a value, and is often used to
represent optional values. Like null
, None
is satisfied by (accepts) only
itself and no other shapes (except unions that include None
as a member, or
errors that wrap None
as a partial shape for some reason).
When either null
or None
participate in a Shape::one
union shape, they are
usually preserved (other than being deduplicated) because they represent
distinct possibilities. However, ::Null
and ::None
do have a noteworthy
difference of behavior when simplifying ::All
intersection shapes.
When null
participates in a ShapeCase::All
intersection shape, it "poisons"
the intersection and causes the whole thing to simplify to null
. This allows a
single intersection member shape to override the whole intersection, which is
useful for reporting certain kinds of error conditions (especially in GraphQL).
By contrast, None
does not poison intersections, but is simply ignored. This
makes sense if you think of Shape::all
intersections as merging their member
shapes: when you merge None
with another shape, you get the other shape back,
because None
imposes no additional expectations.
Dependencies
~6.5–9MB
~160K SLoC