13 releases

0.1.1 Nov 8, 2024
0.1.0 Nov 6, 2024
0.0.11 Nov 29, 2022
0.0.9 Oct 13, 2022
0.0.1 Mar 30, 2022

#65 in #validator

294 downloads per month
Used in substrait-validator

Apache-2.0

16KB
293 lines

Procedural macro crate for substrait-validator

This crate defines some #[derive] macros for substrait-validator, specifically for the types generated by prost-build. This is needed because prost-build on its own doesn't generate any introspection-like information for the protobuf structures, such as message type names as strings, which we want to be able to use in our parse tree.


lib.rs:

Procedural macro crate for substrait-validator-core.

The derive macros defined here are essentially an ugly workaround for the lack of any protobuf introspection functionality in prost. Basically, they take (the AST of) the code generated by prost and try to recover the needed protobuf message metadata from there. Things would have been a LOT simpler and a LOT less brittle if prost simply provided this information via traits of its own, but alas, there doesn't seem to be a way to do this without forking prost, and introspection seems to be a non-goal of that project.
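
To illustrate the general idea of recovering metadata from generated code, here is a minimal sketch. It is not this crate's implementation: it uses a declarative macro_rules! macro so that it compiles without extra dependencies, whereas the crate uses procedural #[derive] macros. The trait ProtoMeta, the macro impl_proto_meta, and the module my_package are hypothetical names introduced only for this example.

```rust
/// Hypothetical introspection trait; the real crate's trait and method
/// names are not shown in this document and may differ.
trait ProtoMeta {
    /// Protobuf-style dotted type name recovered from the Rust type path.
    fn proto_type_name() -> String;
}

/// Declarative stand-in for a derive macro: it only sees the tokens of the
/// type it is given, much like a procedural macro only sees an AST.
macro_rules! impl_proto_meta {
    ($ty:ty) => {
        impl ProtoMeta for $ty {
            fn proto_type_name() -> String {
                // Drop any whitespace stringify! may emit, then map the
                // Rust path separator to the protobuf one.
                stringify!($ty).replace(' ', "").replace("::", ".")
            }
        }
    };
}

// Hypothetical stand-in for code that prost-build would generate.
mod my_package {
    pub struct MyMessage;
}

impl_proto_meta!(my_package::MyMessage);

fn main() {
    let name = <my_package::MyMessage as ProtoMeta>::proto_type_name();
    println!("{name}");
}
```

The macro never sees the original .proto file, so everything it reports is inferred from Rust-side names.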

Besides being ugly, this method is rather brittle and imprecise when it comes to recovering field names, due to the various case conversions automatically done by protoc and prost. Some known issues are:

  • The recovered type name for messages defined within messages uses incorrect case conventions, as the procedural macros have no way of distinguishing packages from message definition scopes in the type path.
  • If the .proto source files use unexpected case conventions for various things, the resulting case conventions for types, field names, oneof variants, and enum variants will be wrong.
  • Whenever the .proto source files name a field using something that is a reserved word in Rust (notably type), prost will use a raw identifier to represent the name. This syntax is currently not filtered out from the recovered names, so a field named type becomes r#type. This is probably not a fundamental problem, though.
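
Two of the issues above can be sketched with plain string handling. The functions strip_raw_prefix and naive_proto_name are hypothetical helpers written for this example, not part of the crate (which, as noted, currently does not filter the raw-identifier prefix). The path pkg::outer::Inner is likewise made up; it assumes prost's convention of placing messages nested in Outer into a module named outer.

```rust
/// Strips the raw-identifier prefix that prost uses for Rust keywords,
/// so that `r#type` becomes `type`. Other identifiers pass through.
fn strip_raw_prefix(ident: &str) -> &str {
    ident.strip_prefix("r#").unwrap_or(ident)
}

/// Naive recovery of a protobuf-style name from a Rust path: every `::`
/// becomes `.`, with raw-identifier prefixes removed from each segment.
fn naive_proto_name(rust_path: &str) -> String {
    rust_path
        .split("::")
        .map(strip_raw_prefix)
        .collect::<Vec<_>>()
        .join(".")
}

fn main() {
    // A field named `type` in the .proto file shows up as `r#type` in Rust.
    println!("{}", strip_raw_prefix("r#type"));

    // The path alone does not reveal whether `outer` is a package or the
    // scope of a message `Outer`, so the case of that segment may be wrong.
    println!("{}", naive_proto_name("pkg::outer::Inner"));
}
```

In the second call, if Inner were defined inside a message Outer, the protobuf name would be pkg.Outer.Inner, but the naive conversion cannot know that from the path.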

Ultimately, however, these names are only used for diagnostic messages and the like. In the worst case, the above inconsistencies may confuse the user, but they should not affect the valid/invalid/maybe-valid result of the validator or cause compile-time or runtime errors.

Dependencies

~240–680KB
~16K SLoC