75 releases (7 breaking)

0.15.1 Apr 11, 2024
0.15.0-alpha.5 Mar 29, 2024
0.12.0-alpha.2 Dec 26, 2023
0.11.0 Nov 28, 2023
0.8.0 Jul 27, 2023

#507 in Development tools

18,096 downloads per month
Used in 36 crates (via re_types)

MIT/Apache

640KB
14K SLoC

re_types_builder

Part of the rerun family of crates.

This crate implements Rerun's code generation tools.

These tools translate language-agnostic IDL definitions (flatbuffers) into code.

You can generate the code by running the command just codegen (a recipe for the just command runner).


lib.rs:

This crate implements Rerun's code generation tools.

These tools translate language-agnostic IDL definitions (flatbuffers) into code. They are invoked by re_types's build script (build.rs).

Organization

The code generation process happens in 4 phases.

1. Generate binary reflection data from flatbuffers definitions.

All this does is invoke the flatbuffers compiler (flatc) with the right flags in order to generate the binary dumps.

Look for compile_binary_schemas in the code.
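As a rough illustration of this phase, the sketch below builds a flatc command line that asks for a binary schema dump. The helper name flatc_reflection_command and the exact set of flags are assumptions made for this example; the flags passed by compile_binary_schemas may differ.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `flatc` invocation that would dump binary
/// reflection data for a single schema file.
/// NOTE: hypothetical helper, not the crate's actual API.
pub fn flatc_reflection_command(include_dir: &Path, output_dir: &Path, schema: &Path) -> Command {
    let mut cmd = Command::new("flatc");
    cmd.arg("-I")
        .arg(include_dir) // directory used to resolve `include` statements
        .arg("-o")
        .arg(output_dir) // directory that receives the binary dump
        .arg("--binary")
        .arg("--schema") // with --binary: serialize the schema itself
        .arg("--bfbs-comments") // keep doc comments in the reflection data
        .arg("--bfbs-builtins") // keep builtin attributes in the reflection data
        .arg(schema);
    cmd
}
```

Calling status() on the returned Command would run the compiler; a build script would typically treat a non-zero exit status as a fatal error.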

2. Run the semantic pass.

The semantic pass transforms the low-level raw reflection data generated by the first phase into higher-level objects that are much easier to inspect and manipulate.

Look for objects.rs.
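To give a feel for what such a pass does, here is a minimal sketch that resolves a raw entry into a friendlier object. All of the types below (RawObject, Object, ObjectField, ObjectKind), the resolve function, and the "#" separator in field names are hypothetical simplifications for illustration, not the definitions found in objects.rs.

```rust
/// Hypothetical stand-in for one raw reflection entry.
pub struct RawObject {
    pub fqname: String,                // e.g. "rerun.components.Point2D"
    pub fields: Vec<(String, String)>, // (field name, raw type name)
}

#[derive(Debug, PartialEq, Eq)]
pub enum ObjectKind {
    Archetype,
    Component,
    Datatype,
}

#[derive(Debug)]
pub struct ObjectField {
    pub fqname: String, // object fqname + '#' + field name (assumed convention)
    pub name: String,
    pub typ: String,
}

#[derive(Debug)]
pub struct Object {
    pub fqname: String,
    pub name: String,
    pub kind: ObjectKind,
    pub fields: Vec<ObjectField>,
}

/// Resolves a raw entry into a higher-level object.
/// Assumption: the object kind can be derived from the package path.
pub fn resolve(raw: &RawObject) -> Result<Object, String> {
    let kind = if raw.fqname.contains(".archetypes.") {
        ObjectKind::Archetype
    } else if raw.fqname.contains(".components.") {
        ObjectKind::Component
    } else if raw.fqname.contains(".datatypes.") {
        ObjectKind::Datatype
    } else {
        return Err(format!("cannot infer object kind for {}", raw.fqname));
    };

    // The short name is whatever follows the last '.' in the fqname.
    let name = raw.fqname.rsplit('.').next().unwrap_or(&raw.fqname).to_owned();

    let fields = raw
        .fields
        .iter()
        .map(|(field_name, typ)| ObjectField {
            fqname: format!("{}#{}", raw.fqname, field_name),
            name: field_name.clone(),
            typ: typ.clone(),
        })
        .collect();

    Ok(Object { fqname: raw.fqname.clone(), name, kind, fields })
}
```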

3. Fill the Arrow registry.

The Arrow registry keeps track of all type definitions and maps them to Arrow datatypes.

Look for arrow_registry.rs.
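Conceptually, such a registry is a lookup table from fully-qualified type names to datatypes. The sketch below uses a simplified, hypothetical DataType enum instead of a real Arrow implementation, and the Registry API shown is an assumption rather than the one in arrow_registry.rs.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical stand-in for Arrow datatypes.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Float32,
    Utf8,
    List(Box<DataType>),
    Struct(Vec<(String, DataType)>),
}

/// Maps fully-qualified type names to their datatype.
#[derive(Default)]
pub struct Registry {
    types: HashMap<String, DataType>,
}

impl Registry {
    /// Registers (or overwrites) the datatype for the given type name.
    pub fn register(&mut self, fqname: &str, datatype: DataType) {
        self.types.insert(fqname.to_owned(), datatype);
    }

    /// Looks up a previously registered datatype.
    pub fn get(&self, fqname: &str) -> Option<&DataType> {
        self.types.get(fqname)
    }
}
```

A codegen pass can then look up a field's type by name and emit the matching (de)serialization code without re-deriving the datatype each time.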

4. Run the actual codegen pass for a given language.

We currently have two codegen passes implemented: Python & Rust.

Codegen passes use the semantic objects from phase two and the registry from phase three in order to generate user-facing code for Rerun's SDKs.

These passes are intentionally implemented using a very low-tech, no-frills approach (stitch strings together, make liberal use of unimplemented!, etc.) that keeps them flexible in the face of ever-changing needs in the generated code.

Look for codegen/python.rs and codegen/rust.rs.
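The string-stitching style described above might look roughly like the following sketch. The function quote_struct and the tiny type mapping are hypothetical and far simpler than what the real passes handle.

```rust
/// Emits the source of a Rust struct by plain string concatenation.
/// `fields` holds (field name, IDL type name) pairs.
/// NOTE: hypothetical helper covering only two IDL types.
pub fn quote_struct(name: &str, fields: &[(&str, &str)]) -> String {
    let mut code = String::new();
    code.push_str("#[derive(Clone, Debug)]\n");
    code.push_str(&format!("pub struct {name} {{\n"));
    for (field_name, field_type) in fields {
        let rust_type = match *field_type {
            "float" => "f32",
            "string" => "String",
            // Anything not needed yet simply aborts the build.
            other => unimplemented!("no Rust mapping for IDL type {other:?}"),
        };
        code.push_str(&format!("    pub {field_name}: {rust_type},\n"));
    }
    code.push_str("}\n");
    code
}
```

Supporting a new case means adding one more match arm, which is the kind of flexibility this approach trades polish for.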

Error handling

Keep in mind: this is all build-time code that will never see the light of runtime. There is therefore no need for fancy error handling in this crate: all errors are fatal to the build anyway.

Crash as soon as possible when something goes wrong, and attach all available context using anyhow's with_context (e.g. always include the fully-qualified name of the faulty type or field).
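The idea of attaching context to a failure can be sketched as follows. To keep the example free of dependencies, it uses the standard library's map_err with a String error as a stand-in for anyhow's with_context; the function parse_field_order and the "order" attribute are hypothetical.

```rust
/// Parses a numeric attribute value, naming the offending field on failure.
/// NOTE: std-only stand-in; with anyhow, the `map_err` call below would be
/// a `with_context` call that builds the same message.
pub fn parse_field_order(field_fqname: &str, raw_attr: &str) -> Result<u32, String> {
    raw_attr.trim().parse::<u32>().map_err(|err| {
        format!("invalid `order` attribute {raw_attr:?} on field {field_fqname}: {err}")
    })
}
```

Because every error is fatal to the build, the caller can simply propagate the error (or unwrap it) and let the message point at the faulty definition.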

Testing

Same comment as with error handling: this code becomes irrelevant at runtime, and so testing it brings very little value.

Make sure to test the behavior of its output though: re_types!

Understanding the subtleties of affixes

So-called "affixes" are effects applied to objects defined in the Rerun IDL that change how these objects behave and interoperate with each other (so, yes, monads. shhh.).

There are 3 distinct and very common affixes used when working with Rerun's IDL: transparency, nullability and plurality.

Broadly, we can describe these affixes as follows:

  • Transparency allows for bypassing a single layer of typing (e.g. to "extract" a field out of a struct).
  • Nullability specifies whether a piece of data is allowed to be left unspecified at runtime.
  • Plurality specifies whether a piece of data is actually a collection of that same type.

We say "broadly" here because the way these affixes ultimately affect objects in practice will actually depend on the kind of object that they are applied to, of which there are 3: archetypes, components and datatypes.

Not only that, but objects defined in Rerun's IDL are materialized into 3 distinct environments: IDL definitions, Arrow datatypes and native code (e.g. Rust & Python).
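In native Rust code, the three affixes could plausibly surface as shown in the sketch below. The type names are hypothetical and the mapping (newtype for transparency, Option for nullability, Vec for plurality) is an illustrative assumption, not a description of the code the generator actually emits.

```rust
/// A plain datatype.
#[derive(Debug, Clone, PartialEq)]
pub struct Vec2D {
    pub x: f32,
    pub y: f32,
}

/// Transparency: the component is a thin wrapper whose single layer of
/// typing can be bypassed to reach the inner datatype.
#[derive(Debug, Clone, PartialEq)]
pub struct Position2D(pub Vec2D);

impl From<Vec2D> for Position2D {
    fn from(v: Vec2D) -> Self {
        Self(v)
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct Label(pub String);

/// An archetype combining the other two affixes.
#[derive(Debug, Clone, PartialEq)]
pub struct Points2D {
    /// Plurality: a collection of the same component type.
    pub positions: Vec<Position2D>,
    /// Nullability (on top of plurality): may be left unspecified.
    pub labels: Option<Vec<Label>>,
}
```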

These environments have vastly different characteristics, quirks, pitfalls and limitations, which once again lead to these affixes having different, sometimes surprising behavior depending on the environment we're interested in. Also keep in mind that Flatbuffers and native code are generally designed around arrays of structures, while Arrow is all about structures of arrays!
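The difference between the two layouts, and what it does to a nullable field, can be sketched as follows. The types Point and PointColumns and the function to_columns are hypothetical, and the Vec of booleans is only a loose approximation of the validity bitmaps Arrow uses.

```rust
/// Array-of-structures element: nullability lives on each value.
#[derive(Debug, Clone, PartialEq)]
pub struct Point {
    pub x: f32,
    pub y: f32,
    pub label: Option<String>,
}

/// Structure-of-arrays layout: one column per field, with nullability
/// tracked in a separate validity column.
#[derive(Debug, Default, PartialEq)]
pub struct PointColumns {
    pub x: Vec<f32>,
    pub y: Vec<f32>,
    pub label: Vec<String>,        // placeholder (empty string) where null
    pub label_validity: Vec<bool>, // false marks a null label
}

/// Converts a slice of points into the columnar layout.
pub fn to_columns(points: &[Point]) -> PointColumns {
    let mut cols = PointColumns::default();
    for p in points {
        cols.x.push(p.x);
        cols.y.push(p.y);
        cols.label_validity.push(p.label.is_some());
        cols.label.push(p.label.clone().unwrap_or_default());
    }
    cols
}
```

Note how the Option disappears from the value column and reappears as a sibling column, which is one reason (de)serialization code has to treat the same affix differently in each environment.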

All in all, these interactions between affixes, object kinds and environments lead to a combinatorial explosion of edge cases that can be very confusing when it comes to (de)serialization code, and even API design.

When in doubt, check out the rerun.testing.archetypes.AffixFuzzer IDL definitions, generated code and test suites for definitive answers.
