#protobuf #prost #grpc #proc-macro #tonic

prost-unwrap

A procedural macro for prost-generated structs validation and type-casting

3 stable releases

1.1.0 May 1, 2024
1.0.1 Apr 28, 2024
0.1.5 Apr 12, 2024

Apache-2.0

12KB

NOTE: This crate is in the early stages of development and is not yet intended for use in production applications. The crate API is subject to breaking changes.

This crate is designed to bridge the gap between gRPC's design principles and the idiomatic Rust approach to data structures. It automates the generation of "mirror" data structures that unwrap all necessary fields from Option<T> and provides TryFrom implementations for converting from the original autogenerated structs.

Why This Matters

With the evolution of protobuf to version 3, the notion of required fields was phased out. In the context of gRPC, this means every nested field in your messages becomes optional by default. The rationale is to shift field validation responsibilities from the protocol to the application level, avoiding the complexities required fields introduce in evolving data contracts.

Prost and Tonic, adhering to these updated protobuf and gRPC conventions, generate Rust data structures where non-primitive nested fields are encapsulated in Option<T>. While this aligns with Rust's safety and nullability features, it can be cumbersome, especially when certain fields are inherently required for your data structures to be logically coherent. In Rust, the preferred paradigm is to prevent invalid states at compile time, a goal not fully met by Prost's autogenerated structs, which often lead to excessive unwrapping and referencing.
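To make the friction concrete, here is a minimal sketch that uses hand-written stand-ins for prost output. The `Request` and `User` types and the `user_id` function are hypothetical and are not taken from this crate or from any real proto file; the prost derives and attributes are left out so that the snippet compiles on its own.

```rust
// Hypothetical stand-ins for prost-generated structs (prost attributes omitted).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct User {
    pub id: i32,
}

#[derive(Clone, Debug, Default, PartialEq)]
pub struct Request {
    // A singular nested message field is generated as Option<T>.
    pub user: Option<User>,
}

// Every consumer has to handle the None case, even when the application
// treats `user` as mandatory.
pub fn user_id(req: &Request) -> Result<i32, String> {
    let user = req
        .user
        .as_ref()
        .ok_or_else(|| String::from("missing field: user"))?;
    Ok(user.id)
}
```

Each additional level of nesting repeats the same `as_ref` and `ok_or_else` dance, which is the noise a mirror struct is meant to remove.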

The proposed solution is to create "sanitized" mirror structures where all essential fields are directly accessible, not wrapped in Option. By implementing TryFrom<OriginalMessage> for MirrorMessage, these structures adhere to Rust's design principles, ensuring data integrity and enhancing code clarity.

The primary challenge with this approach is the extensive boilerplate code required to create and maintain these mirror structures. This crate introduces a procedural macro to eliminate this boilerplate, automatically generating the necessary code, simplifying maintenance, and enabling you to focus on your application's logic.

Quick start guide

Suppose your crate has a bar.proto file like this, located in the ./proto/foo directory:

syntax = "proto3";

package foo.bar;

message MsgA {
    int32 f1 = 1;
}

message MsgB {
    MsgA f1 = 1;
    MsgA f2 = 2;
    repeated MsgA f3 = 3;
}

First, generate Rust source code with prost or tonic. Please refer to the documentation of those crates for a full explanation of the build process.

This quick start guide will use prost.

To use prost-unwrap, you need to specify an explicit output directory for prost. This is necessary because prost-unwrap has to read the generated files to produce the mirror structs, and the OUT_DIR environment variable is not available when the procedural macro is expanded. To avoid committing the generated code, add a .gitignore file with a *.rs entry to the output directory.

Add the out_dir call to the prost-build config in your build.rs. tonic-build offers a similar option.

use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let inner_proto = Path::new("proto/foo/bar.proto");
    let include_dir = Path::new("proto");

    prost_build::Config::new()
        // Fixed output directory; the snippets below refer to it as `.proto`.
        .out_dir(".proto")
        .compile_protos(&[inner_proto], &[include_dir])?;

    Ok(())
}

After running cargo build, you should see the generated code (the foo.bar.rs file) in the output directory.

Prost-generated code has to be organized into nested modules that mirror your protobuf package structure. For the example above, wrap the generated Rust code as follows; the additional outer generated module isolates all package modules from the rest of the crate.

pub mod generated {
    pub mod foo {
        pub mod bar {
            include!(".proto/foo.bar.rs");
        }
    }
}

For the prost-unwrap-generated code, add a separate module tree with a similar structure.

pub mod unwrapped {
    pub mod foo {
        pub mod bar {
            // insert the prost_unwrap::include! macro call here
        }
    }
}

prost-unwrap provides the include! macro to include the generated code into your source. The macro takes an argument list in the form of a method call chain.

prost_unwrap::include!(
    with_original_mod(crate::generated)
    .with_this_mod(crate::unwrapped)
    .from_source(foo::bar, ".proto/foo.bar.rs")
    .with_struct(MsgB, [f1])
);

This configuration instructs prost-unwrap to:

  • Extract structs and enums from the file ".proto/foo.bar.rs".
  • Copy the MsgB struct from crate::generated::foo::bar, converting the f1 field from Option<T> to T.
  • Generate TryFrom and Into implementations for all copied structs and enums.
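Conceptually, the configuration above yields code along the lines of the following hand-written sketch. It is an approximation for illustration, not the macro's actual output: the prost attributes are omitted, the `Error` type with its `missing_field` member is a simplified stand-in, and the `foo::bar` module nesting is flattened into `generated` and `unwrapped`.

```rust
pub mod generated {
    // Approximate shape of the prost output for bar.proto (attributes omitted).
    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct MsgA {
        pub f1: i32,
    }

    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct MsgB {
        pub f1: Option<MsgA>,
        pub f2: Option<MsgA>,
        pub f3: Vec<MsgA>,
    }
}

pub mod unwrapped {
    use super::generated;
    use std::convert::TryFrom;
    use std::fmt;

    // Simplified stand-in for the error type; the real one may differ.
    #[derive(Debug)]
    pub struct Error {
        pub missing_field: &'static str,
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "required field `{}` is not set", self.missing_field)
        }
    }

    impl std::error::Error for Error {}

    pub struct MsgA {
        pub f1: i32,
    }

    pub struct MsgB {
        pub f1: MsgA,         // listed in with_struct(MsgB, [f1]): unwrapped
        pub f2: Option<MsgA>, // not listed: stays optional
        pub f3: Vec<MsgA>,    // repeated: stays a Vec
    }

    impl TryFrom<generated::MsgA> for MsgA {
        type Error = Error;

        fn try_from(m: generated::MsgA) -> Result<Self, Self::Error> {
            Ok(MsgA { f1: m.f1 })
        }
    }

    impl TryFrom<generated::MsgB> for MsgB {
        type Error = Error;

        fn try_from(m: generated::MsgB) -> Result<Self, Self::Error> {
            Ok(MsgB {
                // The only fallible step: f1 must be present.
                f1: MsgA::try_from(m.f1.ok_or(Error { missing_field: "f1" })?)?,
                f2: m.f2.map(MsgA::try_from).transpose()?,
                f3: m
                    .f3
                    .into_iter()
                    .map(MsgA::try_from)
                    .collect::<Result<Vec<_>, _>>()?,
            })
        }
    }

    // From gives the reverse Into conversion for free.
    impl From<MsgA> for generated::MsgA {
        fn from(m: MsgA) -> Self {
            generated::MsgA { f1: m.f1 }
        }
    }

    impl From<MsgB> for generated::MsgB {
        fn from(m: MsgB) -> Self {
            generated::MsgB {
                f1: Some(m.f1.into()),
                f2: m.f2.map(Into::into),
                f3: m.f3.into_iter().map(Into::into).collect(),
            }
        }
    }
}
```

Only `f1` changes type; `f2` keeps its `Option` because it was not listed in `with_struct`, and the repeated field `f3` remains a `Vec`.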

With the generated and unwrapped code, you can perform conversions like these:

use std::error::Error;

fn a(msg: crate::unwrapped::foo::bar::MsgB)
-> crate::generated::foo::bar::MsgB
{
    msg.into() // Converts unwrapped struct back to the original form.
}

fn b(msg: crate::generated::foo::bar::MsgB)
-> Result<crate::unwrapped::foo::bar::MsgB, Box<dyn Error>>
{
    // Attempts the conversion, returning an error if the
    // 'msg.f1' field is 'None'.
    Ok(msg.try_into()?)
}

Take a look at the integration tests directory for more use cases. Note: the ui directory contains negative (failing) tests, so do not treat those as examples! :)

include! macro breakdown

The include! macro takes pseudocode in the form of a method call chain as its argument. The pseudo-method calls may be arranged in any order.

The call chain must contain one call each of with_original_mod, with_this_mod and from_source. At least one with_struct or with_enum call must be present as well.
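These rules can be restated as a small check. The sketch below only illustrates the rules; it is not how the crate parses its input. `validate_chain` and its string-based input are hypothetical, and the code assumes that "one call" means exactly one.

```rust
// Illustration only: restates the documented call-chain rules.
// `validate_chain` is a hypothetical helper, not part of prost-unwrap.
pub fn validate_chain(calls: &[&str]) -> Result<(), String> {
    let count = |name: &str| calls.iter().filter(|c| **c == name).count();

    // Assumption: each of these must appear exactly once, in any position.
    for required in ["with_original_mod", "with_this_mod", "from_source"] {
        match count(required) {
            1 => {}
            n => return Err(format!("expected one `{}`, found {}", required, n)),
        }
    }

    // At least one struct or enum must be selected.
    if count("with_struct") + count("with_enum") == 0 {
        return Err(String::from(
            "expected at least one `with_struct` or `with_enum`",
        ));
    }

    Ok(())
}
```

Because the check counts names rather than positions, a chain that starts with `with_struct` and ends with `with_original_mod` passes just like one written in the order used in the quick start guide.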

with_original_mod

Specifies the absolute path (starting with crate::) of the wrapper module that contains the original generated source code. The path is required because of scoping limitations of Rust procedural macros.

Example:

prost_unwrap::include!(
    with_original_mod(crate::generated)
);

with_this_mod

Specifies the absolute path of the wrapper module, containing the prost-unwrap-generated source code.

prost_unwrap::include!(
    with_this_mod(crate::unwrapped)
);

from_source

Specifies the location of the generated source file along with the relative module path that this code is wrapped in. The relative path must be the same within the original code wrapper and the unwrapped code wrapper.

In most cases this path will match the package name inside your proto file (foo.bar in our case) and the file name (foo.bar.rs in our case).

prost_unwrap::include!(
    from_source(com::acme, ".proto/com.acme.rs")
);

with_struct

Specifies the relative path of a struct together with the list of fields that need to be unwrapped from Option<T> into T.

prost_unwrap::include!(
    with_struct(AcmeMessage, [field1, field2, field3])
);

with_enum

Specifies the relative path of an enum that also needs to be included in the generated code (see the "Known issues" section).

prost_unwrap::include!(
    with_enum(AcmeEnum)
);

Generated code

prost_unwrap::include! generates the following pieces of code (along with the copied structs and enums):

  • The Error struct, implementing the Debug, Display and std::error::Error traits.
  • Helper functions for converting original structs into copied structs, if wrapped into Option<T> (optional fields) or Vec<T> (repeated fields).
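As a rough picture of what such support code could look like, here is a hand-written sketch. The field name `reason` and the helper names `convert_option` and `convert_vec` are assumptions made for illustration; the identifiers the macro actually emits may differ.

```rust
use std::convert::TryFrom;
use std::fmt;

// Hypothetical error type implementing Debug, Display and std::error::Error.
#[derive(Debug)]
pub struct Error {
    pub reason: String,
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "conversion failed: {}", self.reason)
    }
}

impl std::error::Error for Error {}

// Converts an optional field that stays optional: None passes through,
// Some(value) is converted and may fail.
pub fn convert_option<T, U>(value: Option<T>) -> Result<Option<U>, U::Error>
where
    U: TryFrom<T>,
{
    value.map(U::try_from).transpose()
}

// Converts a repeated field element by element, stopping at the first error.
pub fn convert_vec<T, U>(values: Vec<T>) -> Result<Vec<U>, U::Error>
where
    U: TryFrom<T>,
{
    values.into_iter().map(U::try_from).collect()
}
```

Both helpers are generic over any pair of types connected by `TryFrom`, so the same two functions would serve every optional and repeated message field.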

The copied structs and enums have prost-related attributes stripped:

  • Message derive for structs;
  • Enumeration derive for enums;
  • field-specific attributes for structs and enums.

You can always inspect the generated code using cargo-expand.

Features to be implemented

  • Partial copying: for now, prost-unwrap copies all the structs and enums it finds in the linked source code. It should be possible to copy only a subset of data structures, provided the subset is self-contained, meaning that its members reference only other members of the subset.

  • Item suffix: since the original and copied structs have the same name, they need to be aliased somehow to have both of them in the same scope. Once suffixes are implemented, it will be possible to rename copied structs automatically.
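Until suffixes exist, a plain `use ... as` alias is one way to bring both variants into the same scope. The sketch below uses simplified stand-in structs in place of real generated code; the alias names `RawMsgB` and `CleanMsgB` and the `describe` function are arbitrary choices for this example.

```rust
pub mod generated {
    pub mod foo {
        pub mod bar {
            // Simplified stand-in for the prost-generated struct.
            pub struct MsgB {
                pub f1: Option<i32>,
            }
        }
    }
}

pub mod unwrapped {
    pub mod foo {
        pub mod bar {
            // Simplified stand-in for the prost-unwrap copy of the same struct.
            pub struct MsgB {
                pub f1: i32,
            }
        }
    }
}

// Both structs are called MsgB, so each import gets a distinct local name.
use generated::foo::bar::MsgB as RawMsgB;
use unwrapped::foo::bar::MsgB as CleanMsgB;

pub fn describe(raw: &RawMsgB, clean: &CleanMsgB) -> String {
    format!("raw.f1 = {:?}, clean.f1 = {}", raw.f1, clean.f1)
}
```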

Known issues

  • Useless with_enum options: since prost-unwrap copies all structs and enums, this option is useless until partial copying is implemented.
  • Copied structs and enums lack Debug and Default trait implementations (these are provided by prost Message and Enumeration derives, which are stripped).
  • Tests do not cover all possible usage scenarios.
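Because the copied structs are defined inside your own crate, one possible stopgap for the missing traits is to implement them by hand. The sketch below uses a hand-written stand-in for a copied struct rather than real macro output, and it is an assumption that this approach stays compatible with future versions of the crate.

```rust
use std::fmt;

// Stand-in for a struct copied by prost-unwrap (no derives attached).
pub struct MsgA {
    pub f1: i32,
}

// Manual Debug implementation using the standard debug_struct builder.
impl fmt::Debug for MsgA {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("MsgA").field("f1", &self.f1).finish()
    }
}

// Manual Default implementation; zero is the proto3 default for int32.
impl Default for MsgA {
    fn default() -> Self {
        MsgA { f1: 0 }
    }
}
```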

Contributing

This crate is in the early stages of development. If you encounter any surprising behavior, unclear documentation, or other problems, please feel free to create an issue on GitHub.

If you find a bug, please create a minimal reproducible scenario (proto file plus Rust code) and include it in a GitHub issue.

If you feel proficient enough in Rust to criticize inefficiencies in the source code, please do the same.
