12 releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.1.11 (new) | Oct 10, 2025 |
| 0.1.10 | Oct 9, 2025 |
| 0.1.9 | Aug 30, 2025 |
✅ Protocheck
Protocheck is a crate that allows you to leverage protovalidate annotations to automatically generate validation logic for the structs generated from your protobuf messages.
This allows you to define your validation schemas only once, directly in your protobuf files (or in Rust code, using protoschema), and then execute the validation logic with this crate or with other libraries, such as protovalidate-es in the TypeScript ecosystem.
➡️ Getting started
Visit the protocheck-build docs to learn how to set up protocheck in your build.rs script.
📓 Noteworthy features
1. It does not require reflection, except at build time.
This is a major benefit for two reasons.
First, it removes the need to include a reflection library in the consuming app's binary.
And, most importantly, it avoids the overhead that is introduced by using reflection to determine the structure of a message when validating it.
Rather than using reflection, this crate leverages the TryIntoCelValue derive macro to generate a method called try_into_cel_value which will directly convert any given struct (or field) into the appropriate cel Value (only failing in case of a Duration or Timestamp field being out of the allowed range for chrono types).
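To make the idea of reflection-free conversion more concrete, here is a minimal hand-written sketch of the kind of code a derive macro could emit. The CelValue enum, the trait definition, the Person struct and the range bound are all simplified stand-ins invented for this illustration; they are not the actual types of the cel crate or of protocheck, and the real generated code may look different.

```rust
use std::collections::HashMap;

// Simplified stand-in for a CEL value (hypothetical, not the `cel` crate's type).
#[derive(Debug, Clone, PartialEq)]
enum CelValue {
    Int(i64),
    String(String),
    Map(HashMap<String, CelValue>),
}

// Hypothetical trait that mirrors the idea behind `try_into_cel_value`.
trait TryIntoCelValue {
    fn try_into_cel_value(&self) -> Result<CelValue, String>;
}

// Hypothetical message struct used only for this sketch.
struct Person {
    name: String,
    age: i64,
    // Seconds since the Unix epoch; stands in for a Timestamp field.
    created_at_secs: i64,
}

// Hand-written equivalent of what a derive macro could generate: every field
// is known at compile time, so nothing is looked up through reflection.
impl TryIntoCelValue for Person {
    fn try_into_cel_value(&self) -> Result<CelValue, String> {
        // Arbitrary bound chosen for this sketch, imitating the
        // "out of the allowed range" failure case.
        const MAX_SECS: i64 = 253_402_300_799;
        if !(0..=MAX_SECS).contains(&self.created_at_secs) {
            return Err(format!(
                "created_at_secs out of range: {}",
                self.created_at_secs
            ));
        }

        let mut fields = HashMap::new();
        fields.insert("name".to_string(), CelValue::String(self.name.clone()));
        fields.insert("age".to_string(), CelValue::Int(self.age));
        fields.insert(
            "created_at_secs".to_string(),
            CelValue::Int(self.created_at_secs),
        );
        Ok(CelValue::Map(fields))
    }
}
```

Because the conversion is ordinary field access, the compiler can inline and optimize it like any other method, which is where the saving over runtime reflection comes from.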
2. It uses native rust code for validation except for custom Cel rules.
Unlike other similar libraries, all of the standard validators are implemented in Rust code. This means that the Cel interpreter (provided by the cel crate) is used only for custom rules explicitly defined in Cel, and can be disabled altogether if custom rules are not used.
3. Extra safety checks for rules definitions
Because of human error, some of these situations may arise:
- A list of allowed values and a list of forbidden values have some items in common
- A string field is trying to use BytesRules
- An "enum.in" rule (list of allowed values for an enum field) includes values that are not part of that enum
- An "lt" rule (meaning 'less than') specifies a value that is smaller than the "gt" (greater than) rule
- ... and other corner cases
This crate handles these situations by emitting a compilation error, which will report the specific field (and the specific values) involved in the error.
Example:
message Oopsie {
  string mystring = 1 [(buf.validate.field).string = {
    min_len: 10
    max_len: 2
  }];
}
Error message:
error: Error for field myapp.v1.Oopsie.mystring: min_len cannot be larger than max_len
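The following is a rough sketch of what such consistency checks could look like, not the crate's actual implementation. The function names and signatures are hypothetical, and the wording of the second error message is invented; only the first message copies the format shown above.

```rust
use std::collections::HashSet;

/// Rejects a string rule whose `min_len` exceeds its `max_len`
/// (hypothetical helper, reusing the error wording shown above).
fn check_len_bounds(
    field: &str,
    min_len: Option<u64>,
    max_len: Option<u64>,
) -> Result<(), String> {
    match (min_len, max_len) {
        (Some(min), Some(max)) if min > max => Err(format!(
            "Error for field {field}: min_len cannot be larger than max_len"
        )),
        // Either bound is missing or the bounds are consistent.
        _ => Ok(()),
    }
}

/// Rejects values that appear in both the allowed and the forbidden list
/// (hypothetical helper; the message wording is made up for this sketch).
fn check_in_not_in_overlap(
    field: &str,
    allowed: &[&str],
    forbidden: &[&str],
) -> Result<(), String> {
    let forbidden: HashSet<&str> = forbidden.iter().copied().collect();
    let overlap: Vec<&str> = allowed
        .iter()
        .copied()
        .filter(|value| forbidden.contains(value))
        .collect();

    if overlap.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "Error for field {field}: values {overlap:?} are both allowed and forbidden"
        ))
    }
}
```

In a proc macro, an Err returned by a check like this would typically be turned into a compile error attached to the offending field, which is what makes the mistake visible at build time rather than at runtime.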
4. Strengthened compile-time safety for Cel programs
When the protobuf_validate proc macro is being processed, it will attempt to create a test case for any given Cel expression being used, generating some default values for the given message or field and trying to execute a Cel program with those defaults.
This ensures that if a Cel expression is fundamentally invalid (for example, because of a type mismatch), the error will be caught at compile time and not at runtime (with some caveats explained below).
5. Lazy initialization
All Cel programs are generated using LazyLock so they are only initialized once. The same goes for other static elements used in the validators, such as regexes or lists of allowed/forbidden values.
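As an illustration of the lazy initialization pattern, here is a small sketch using std::sync::LazyLock for a list of allowed values. The static names, the list contents and the counter are hypothetical and exist only to show that the initializer runs once; the statics generated by the crate may be organized differently.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

// Counts how many times the initializer runs (for demonstration only).
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

// Built on first access, then shared by every later call.
static ALLOWED_COLORS: LazyLock<HashSet<&'static str>> = LazyLock::new(|| {
    INIT_COUNT.fetch_add(1, Ordering::SeqCst);
    HashSet::from(["red", "green", "blue"])
});

/// Checks a value against the lazily built set of allowed values.
fn is_allowed_color(value: &str) -> bool {
    ALLOWED_COLORS.contains(value)
}
```

However many times is_allowed_color is called, the set is built only on the first call, so repeated validations pay for a lookup and nothing else.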
☑️ How to validate messages
After the validate method has been added to a struct, validating it is as simple as calling my_struct.validate().
The validate method returns a Result<(), Violations>. The Violations struct contains a vector of individual Violation elements, each carrying the context behind a given validation error: the parent messages of the invalid field (if the field was part of a nested message), the error message, and the id of the rule that failed.
Both Violations and the individual Violation structs come with several utility methods, such as violation_by_rule_id, which selects a particular violation from the list, or field_path_str, which takes a list of FieldPathElement and turns it into a single string path such as person.friends.0.address.street_name.
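To show how helpers of this kind can work, here is a simplified sketch with stand-in types. PathSegment and the struct layouts below are hypothetical and much smaller than the crate's own Violation and FieldPathElement types, and returning an Option from field_path_str is an assumption; only the method names and the dotted output format come from the description above.

```rust
// Simplified stand-in for a field path element (hypothetical).
enum PathSegment {
    Field(String),
    Index(usize),
}

// Simplified stand-in for a single violation (hypothetical layout).
struct Violation {
    rule_id: String,
    message: String,
    path: Vec<PathSegment>,
}

impl Violation {
    /// Joins the path segments with dots; returns `None` for an empty path.
    fn field_path_str(&self) -> Option<String> {
        if self.path.is_empty() {
            return None;
        }
        let parts: Vec<String> = self
            .path
            .iter()
            .map(|segment| match segment {
                PathSegment::Field(name) => name.clone(),
                PathSegment::Index(index) => index.to_string(),
            })
            .collect();
        Some(parts.join("."))
    }
}

// Simplified stand-in for the collection of violations (hypothetical layout).
struct Violations {
    violations: Vec<Violation>,
}

impl Violations {
    /// Returns the first violation whose rule id matches, if any.
    fn violation_by_rule_id(&self, rule_id: &str) -> Option<&Violation> {
        self.violations.iter().find(|v| v.rule_id == rule_id)
    }
}
```

Looking a violation up by rule id is handy in tests, where asserting on a stable id is less brittle than asserting on the position of a violation in the list.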
The protocheck-proc-macro crate also adds a generic trait ProtoValidator that calls the validate method.
Example:
(The examples are taken from the testing crate. They show as untested only because that is a separate crate with its own build script.)
message JediFight {
  Anakin anakin = 1;
  ObiWan obi_wan = 2;
}
message ObiWan {
  bool has_high_ground = 1 [(buf.validate.field).bool.const = true];
}
message Anakin {
  bool has_high_ground = 1 [(buf.validate.field).bool.const = false];
}
#[test]
fn example() {
    let obi_wan = ObiWan {
        has_high_ground: false,
    };
    let anakin = Anakin {
        has_high_ground: true,
    };
    let jedi_fight = JediFight {
        anakin: Some(anakin),
        obi_wan: Some(obi_wan),
    };
    let Violations { violations } = jedi_fight.validate().unwrap_err();
    violations.iter().for_each(|v| {
        println!(
            "Field path: {}, Error message: {}",
            v.field_path_str().unwrap(),
            v.message()
        )
    });
}
Output:
Field path: anakin.has_high_ground, Error message: must be equal to false
Field path: obi_wan.has_high_ground, Error message: must be equal to true
The tests crate contains many other examples of validation schemas being implemented.
⚙️ Custom validation with Cel
With the cel feature (enabled by default), you can also define validation rules in Cel syntax and apply them to entire structs or to individual fields.
Example:
Let's change the above to this:
message ObiWan {
  // Message-level validation
  option (buf.validate.message).cel = {
    id: "obi-wan.high_ground"
    message: "obi-wan must have the high ground."
    expression: "this.has_high_ground == true"
  };
  // Field-level validation
  bool has_high_ground = 1 [(buf.validate.field).cel = {
    id: "obi-wan.high_ground"
    message: "obi-wan must have the high ground."
    expression: "this == true"
  }];
}
Now the output will be:
Field path: anakin.has_high_ground, Error message: must be equal to false
Field path: obi_wan, Error message: obi-wan must have the high ground.
Field path: obi_wan.has_high_ground, Error message: obi-wan must have the high ground.
This is particularly useful for cases when we need to apply message-wide validation which analyzes more than one field at a time.
The typical example would be:
message User {
  option (buf.validate.message).cel = {
    id: "passwords_match"
    message: "the two passwords do not match"
    expression: "this.password == this.confirm_password"
  };
  string password = 1;
  string confirm_password = 2;
}
let user = User {
    password: "abc".to_string(),
    confirm_password: "abcde".to_string(),
};
let Violations { violations } = user.validate().unwrap_err();
println!(
    "Field path: {}, Error message: {}",
    // Message-wide violations do not have a field path unless they are nested in another message
    // But they do have a rule id
    violations[0].rule_id(),
    violations[0].message()
);
Outcome:
Field path: passwords_match, Error message: the two passwords do not match
📘 Protoschema integration
If you are interested in composing your protobuf files programmatically, and with the benefits of type safety, reusable elements and LSP integration, with a particular focus on making the definition of validation rules a quick and type-safe process, you might want to check out my other crate, protoschema.
⚠️ Caveats and warnings
- The protovalidate rule buf.validate.message.oneof (the one used to make custom oneofs which allow repeated and map fields) is currently not supported.
- While the compile-time check for the validity of a Cel expression helps to catch most, if not all, errors related to Cel program compilation and execution, it is still strongly encouraged to have some tests that trigger the validation logic at runtime (it's just as easy as calling .validate() once again) to be absolutely sure that the Cel program is not causing any issues.
  This is because the Cel validation function cannot panic and crash the whole app if a Cel program fails to execute, so it just returns a generic error to the user while logging the actual error.
  This means that if such an error goes unnoticed, the validator would silently keep generating these generic and unhelpful error messages for users until the problem is reported or spotted in the logs.
  The good news is that the compile-time check prevents the majority of these situations, and adding a very simple test on top of that can eradicate the problem entirely.
- If your message has a reserved Rust keyword as a field name, your Cel expression should reflect that. So if a field is named type, the output struct will have a field named r#type, and your Cel expression should refer to it using this['r#type'], NOT this.type.
- For certain cases where the instructions are conflicting but could be intentional, such as using the "const" rule for a field while also having other validators, the other rules will simply be ignored and no error will be shown. This is to allow for cases when you want to have a temporary override for a field's validation without needing to remove the other validators.
- Validation for bytes fields only works when using bytes::Bytes as the Rust type for them.
- The types for the well-known protobuf messages must be imported from proto-types (re-exported in this crate in the types module). These are based on the prost-types implementation, with some extra helpers and methods that make validation smoother or even possible at all in some cases. compile_protos_with_validators automatically takes care of calling compile_well_known_types and assigning all of the .google.protobuf types to the ones defined in proto-types. The same goes for the types belonging to the protovalidate specification.
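The second caveat above describes a fallback in which a failing Cel program produces a generic error for the user while the real cause is only logged. A minimal sketch of that pattern follows; the function name, its signature and the GENERIC_ERROR text are hypothetical, and the crate's real logic and wording may differ.

```rust
// Hypothetical message returned to users when a rule cannot be evaluated.
const GENERIC_ERROR: &str = "internal validation error";

/// Runs a custom rule that may itself fail to execute.
///
/// - `Ok(true)` from the program means the value is valid.
/// - `Ok(false)` means the rule was violated, so the rule's own message is returned.
/// - `Err(_)` means the program could not run: the cause is logged and the
///   caller only sees a generic message instead of a panic.
fn run_custom_rule<F>(rule_id: &str, failure_message: &str, program: F) -> Result<(), String>
where
    F: FnOnce() -> Result<bool, String>,
{
    match program() {
        Ok(true) => Ok(()),
        Ok(false) => Err(failure_message.to_string()),
        Err(execution_error) => {
            // Keep the real cause in the logs, away from the end user.
            eprintln!("rule {rule_id} failed to execute: {execution_error}");
            Err(GENERIC_ERROR.to_string())
        }
    }
}
```

A runtime test that asserts on the expected, rule-specific message would fail as soon as the generic one shows up instead, which is why even a single such test per custom rule closes the gap left by the compile-time check.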