1 unstable release: 0.1.0 (Jul 6, 2022)
Sidex
Sidex is a format- and language-agnostic data structure and API definition language with a focus on simplicity, extensibility, and developer ergonomics. Sidex aims to simplify data exchange between different programming languages and platforms via potentially multiple serialization formats.
🚧 Status: Although we already use Sidex in production, it is still experimental. Use at your own risk!
✨ Features
- Schema-first definition of data structures and RPC-like APIs.
- Designed for format- and language-agnostic definitions.
- Modern algebraic data types and non-null by default.
- Extensible with user-defined opaque types.
- Designed for interoperability, e.g., with JSON Schema.
- VS Code extension for increased productivity.
- Out-of-the-box support for Rust, TypeScript, and JSON.
🚀 Getting Started
Sidex is currently distributed via Cargo and crates.io. To install Sidex, run:
cargo install sidex-cli
Then, to create a new Sidex definition named my_def, run:
sidex new my_def
Every Sidex definition consists of a flat collection of modules located in the modules directory. Here is a simple example of a module you could place in the file person.sidex:
opaque Uuid  // This is an opaque user-defined type.

alias PersonId: Uuid  // This is a type alias.

enum Role {
    Admin,
    User,
}

struct Person {
    id: PersonId,
    name: string,
    email?: string,  // This field is optional.
    role: Role,
    children: [PersonId],  // A sequence of person ids.
}

enum GetPersonResult {
    NotFound,
    Found: Person,
}

fun get_person_by_id(id: PersonId) -> GetPersonResult
To check a definition for validity, run:
sidex check
Please have a look at the recipes for further examples on how to use Sidex.
⚙️ The Sidex Language
At the core of Sidex is the Sidex language for defining data types and function types.
The core of Sidex is only concerned with such types and nothing else.
📦 Data Types
Sidex is based on five kinds of data types:
- Opaque types are opaque to Sidex, i.e., their internal structure is a black box. Opaque types are defined with the opaque keyword. Opaque types are nominal, i.e., opaque types defined separately are always distinct even if they have the same name.
- Enumeration types define unions with tagged variants of different types. Enumeration types are defined with the enum keyword. Enumeration types are nominal, i.e., enumeration types defined separately are always distinct even if they agree on all their variants.
- Struct types define structures with labeled fields of different types. Struct types are defined with the struct keyword. Struct types are nominal, i.e., struct types defined separately are always distinct even if they agree on all their fields.
- Sequence types define sequences of elements of the same type. Sequence types are created with [T] where T denotes the element type. Sequence types are structural, i.e., two sequence types with the same element type are identical.
- Map types define mappings from keys of some type to values of some type. Map types are created with [K: V] where K denotes the key type and V denotes the value type. Map types are structural, i.e., two map types with the same key and value type are identical.
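The distinction between nominal and structural types has a close analogue in Rust, which may help as a rough illustration. The following is a minimal sketch and not the output of any Sidex tool; the names Point, Size, Xs, Ys, and same_type are hypothetical and chosen only for this example.

```rust
use std::any::TypeId;

// Two struct types that agree on all their fields. Like Sidex structs,
// Rust structs are nominal, so these remain distinct types.
pub struct Point {
    pub x: i32,
    pub y: i32,
}

pub struct Size {
    pub x: i32,
    pub y: i32,
}

// Two aliases for a sequence with the same element type. Like Sidex
// sequence types, `Vec<T>` is determined by its element type alone,
// so both aliases denote one and the same type.
pub type Xs = Vec<i32>;
pub type Ys = Vec<i32>;

/// Returns `true` if `A` and `B` are the same Rust type.
pub fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}
```

Under these assumptions, same_type::<Point, Size>() is false although both structs have identical fields, whereas same_type::<Xs, Ys>() is true.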
Sidex comes with built-in primitive types for strings, integers, and booleans. Technically, these primitive types are not any different from user-defined opaque types. These primitive types are:
- string: For sequences of Unicode code points.
- i8, i16, i32, i64: For signed integers of different bit width.
- u8, u16, u32, u64: For unsigned integers of different bit width.
- bool: For booleans.
In addition, there is the void type for indicating the absence of any data.
Using opaque types, you can define your own primitives, e.g., for UUIDs:
opaque Uuid
The structure of opaque types can be specified externally, e.g., using JSON Schema.
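On the language side, an opaque type has to be backed by some concrete type that the user supplies. As a minimal sketch, assuming the opaque Uuid is represented in Rust by a newtype over u128 and written as 32 hexadecimal digits without hyphens (both are assumptions made for this example, not something Sidex prescribes), such a type could look like this:

```rust
use std::fmt;

/// Hypothetical Rust type backing the Sidex declaration `opaque Uuid`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Uuid(u128);

impl Uuid {
    /// Parses exactly 32 hexadecimal digits (no hyphens).
    /// Returns `None` for any other input.
    pub fn parse(s: &str) -> Option<Uuid> {
        if s.len() != 32 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        u128::from_str_radix(s, 16).ok().map(Uuid)
    }
}

impl fmt::Display for Uuid {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Always print 32 lowercase hex digits, zero-padded on the left.
        write!(f, "{:032x}", self.0)
    }
}
```

Sidex itself never looks inside such a type; only the code that maps it to a language and to a serialization format needs to know its structure.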
📡 Function Types
Taking inspiration from RPC and FFI, Sidex allows defining function types with the fun keyword. Every function type consists of a sequence of named arguments with their own respective type and a return type. At its core, Sidex does not presuppose any protocol or other mechanism for invoking such functions.
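Because no invocation mechanism is presupposed, a function type can be rendered in a target language as a plain interface. The following hand-written Rust sketch shows one possible rendering of get_person_by_id as a trait method. It is not the code Sidex generates; the trait PersonApi, the struct InMemoryPeople, and the simplification of PersonId to u64 are assumptions made for this example.

```rust
use std::collections::HashMap;

// Simplified, hand-written stand-ins for the types from `person.sidex`.
// Assumption: the opaque `Uuid` is replaced by `u64` to keep the sketch short.
pub type PersonId = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct Person {
    pub id: PersonId,
    pub name: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum GetPersonResult {
    NotFound,
    Found(Person),
}

/// One possible rendering of
/// `fun get_person_by_id(id: PersonId) -> GetPersonResult`.
pub trait PersonApi {
    fn get_person_by_id(&self, id: PersonId) -> GetPersonResult;
}

/// An in-process implementation backed by a hash map. A client that
/// forwards calls over some transport could implement the same trait.
pub struct InMemoryPeople {
    people: HashMap<PersonId, Person>,
}

impl InMemoryPeople {
    pub fn new(people: Vec<Person>) -> Self {
        InMemoryPeople {
            people: people.into_iter().map(|p| (p.id, p)).collect(),
        }
    }
}

impl PersonApi for InMemoryPeople {
    fn get_person_by_id(&self, id: PersonId) -> GetPersonResult {
        match self.people.get(&id) {
            Some(person) => GetPersonResult::Found(person.clone()),
            None => GetPersonResult::NotFound,
        }
    }
}
```

Callers depend only on the trait, so the choice of protocol stays outside the definition, in line with the separation described above.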
🤝 Exchanging Data
Data exchange can be quite complex and involves multiple concerns which Sidex aims to separate.
📜 Language Mapping
To be useful, Sidex definitions need to be mapped to type or class definitions of some programming language, e.g., Rust or TypeScript. We refer to such a mapping as a language mapping:
┌──────────────────┐    Language Mapping   ┌─────────────────┐
│ Sidex Definition │ ────────────────────► │ Target Language │
└──────────────────┘                       └─────────────────┘
Note that a language mapping might involve certain tradeoffs. For instance, in the case of TypeScript, a map type can be mapped either to Object or to Map, and, in the case of Rust, there are also multiple different types of maps available, e.g., HashMap or BTreeMap. Furthermore, depending on the language, certain data types may not be mappable at all due to language-specific constraints.
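To illustrate the kind of tradeoff involved, the sketch below maps a Sidex map type such as [string: u32] to both Rust candidates. The aliases ScoresHashed and ScoresOrdered and the two helper functions are hypothetical names introduced for this example only.

```rust
use std::collections::{BTreeMap, HashMap};

/// Candidate A: hash-based lookup, iteration order is unspecified.
pub type ScoresHashed = HashMap<String, u32>;

/// Candidate B: keys are kept sorted, so iteration order is deterministic.
pub type ScoresOrdered = BTreeMap<String, u32>;

/// Builds the hash-based variant from key/value pairs.
pub fn hashed_scores(pairs: &[(&str, u32)]) -> ScoresHashed {
    pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}

/// Builds the ordered variant from the same key/value pairs.
pub fn ordered_scores(pairs: &[(&str, u32)]) -> ScoresOrdered {
    pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}
```

Both variants hold the same entries, but only the BTreeMap guarantees that iterating yields the keys in sorted order, which can matter, for example, when serialized output should be reproducible.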
Hence, the goal of the Sidex project is to provide tools and infrastructure for mapping Sidex definitions to different programming languages without imposing any particular mapping. Using the sidex crate as a basis, you can define your own mappings and even generate additional boilerplate such as constructors and getters. If something cannot be sensibly mapped, a tool is free to generate an error as a last resort.
Sidex aims to provide mappings for some languages out-of-the-box with sane defaults.
Note that a language mapping is itself completely independent from how data may be serialized and how functions may be invoked. It can also be useful without ever exchanging any data.
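To make the idea of a language mapping more concrete, here is a hand-written sketch of how the types from person.sidex might look in Rust. This is one plausible mapping under stated assumptions, not the output of the Sidex tooling: the opaque Uuid is represented by a u128 newtype, the optional field becomes an Option, and the sequence becomes a Vec.

```rust
/// Stand-in for `opaque Uuid`. Assumption: a real mapping would let the
/// user pick the backing type; `u128` is used here for brevity.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Uuid(pub u128);

/// `alias PersonId: Uuid`
pub type PersonId = Uuid;

/// `enum Role { Admin, User }`
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Admin,
    User,
}

/// `struct Person { ... }`
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Person {
    pub id: PersonId,
    pub name: String,
    /// `email?: string` is optional, so it becomes an `Option`.
    pub email: Option<String>,
    pub role: Role,
    /// `[PersonId]` is a sequence, so it becomes a `Vec`.
    pub children: Vec<PersonId>,
}

/// `enum GetPersonResult { NotFound, Found: Person }`
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GetPersonResult {
    NotFound,
    Found(Person),
}
```

Nothing in these types refers to a serialization format, which reflects the point that a language mapping is useful on its own.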
📩 Serialization Formats
To exchange data between different languages, it needs to be serialized into some common format. To this end, a format mapping from a Sidex definition to the serialization format is necessary:
┌──────────────────┐    Format Mapping   ┌──────────────────────┐
│ Sidex Definition │ ──────────────────► │ Serialization Format │
└──────────────────┘                     └──────────────────────┘
Note that the format mapping is supposed to be language-independent. It merely describes how certain Sidex types are mapped to the serialization format and its types.
Again, Sidex does not impose any restrictions on the serialization format; however, it aims to provide some out-of-the-box mappings to common formats with sane defaults.
For user-defined opaque types, specific format mappings have to be provided.
🔗 Serialization Binding
Once we have fixed a language mapping and a format mapping, we need to bind both together using a serialization binding. A serialization binding is language-specific and format-specific. It takes serialized data as per the format mapping and transforms it into data structures as per the language mapping (known as deserialization) and vice versa (known as serialization).
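As a minimal sketch of such a binding, consider the Role enumeration with Rust as the language and JSON as the format. The format mapping assumed here, namely that a variant without data is written as a JSON string containing the variant name, is an assumption made for this example and not necessarily the default Sidex uses. The function names role_to_json and role_from_json are hypothetical as well.

```rust
/// Language mapping: the Sidex enum `Role` as a Rust enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Admin,
    User,
}

/// Serialization: turns a Rust value into JSON text as per the assumed
/// format mapping (variant name as a JSON string).
pub fn role_to_json(role: Role) -> String {
    let name = match role {
        Role::Admin => "Admin",
        Role::User => "User",
    };
    format!("\"{}\"", name)
}

/// Deserialization: turns JSON text back into a Rust value, rejecting
/// anything that is not one of the known variant names.
pub fn role_from_json(json: &str) -> Result<Role, String> {
    match json.trim() {
        "\"Admin\"" => Ok(Role::Admin),
        "\"User\"" => Ok(Role::User),
        other => Err(format!("unexpected JSON for Role: {}", other)),
    }
}
```

The pair of functions is specific to both Rust and JSON, whereas the Role enum alone belongs to the language mapping and the choice of a JSON string alone belongs to the format mapping.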
🤔 Rationale
Why schema-first?
A schema-first approach has multiple advantages over definitions in a programming language: (1) It allows focusing on the important aspects of the data being exchanged. (2) It allows developing tooling independent of any programming language. (3) It enables the independent evolution and adaptation of the definition language. (4) It can be used independently of a particular programming language.
Why yet another language?
Existing approaches are often specific to certain serialization formats, do not explicitly support algebraic data types, do not support arbitrary user-defined opaque types, have nullable fields by default, and/or are overly complex because they support many more structures and types than Sidex.
⚖️ Licensing
Sidex is licensed under MIT. Unless you explicitly state otherwise, any contributions intentionally submitted for inclusion in this project shall be licensed under MIT without any additional terms or conditions.
Made with ❤️ by Silitics.