5 releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.3.3 (new) | Feb 5, 2026 |
| 0.3.2 | Feb 3, 2026 |
| 0.3.1 | Feb 3, 2026 |
| 0.1.1 | Dec 8, 2025 |
| 0.1.0 | Dec 8, 2025 |
protoc-gen-fake: Protocol Buffer Fake Data Generator
protoc-gen-fake is a protoc plugin that generates fake data based on your Protocol Buffer schema definitions and custom annotations.
The fake data it generates is binary data in exactly the same format as defined by that schema. It helps developers quickly create realistic mock data at scale for testing, development, and demonstrations.
Features
- Schema-driven Data Generation: Generates fake data directly from your .proto file definitions.
- Customizable Fake Data: Use (gen_fake.fake_data) options to provide specific data types (e.g., names, addresses, emails), and minimum and maximum counts for repeated or optional fields.
- Internationalization (i18n): Generate fake data in various languages and locales.
- Flexible Output: Choose between binary or Base64 encoded output, and control output paths.

(Note: Base64 encoding provides compatibility with protoc's string-based output mechanism, but for most use cases, direct binary file output (configured via --fake_opt output_path) is preferred. Base64 support may be re-evaluated for future versions if its utility does not justify the added complexity.)
Installation
protoc-gen-fake can be installed in several ways, depending on your preferences and development setup.
From GitHub Releases (Recommended for most users)
The easiest way to get protoc-gen-fake is to download the pre-compiled binary for your operating system from the GitHub Releases page.
- Download: Go to the Releases page and download the appropriate .zip or .tar.gz file for your system (e.g., protoc-gen-fake-darwin-arm64 for macOS Apple Silicon).
- Extract: Extract the downloaded archive.
- Add to PATH: Move the protoc-gen-fake executable to a directory in your system's $PATH (e.g., /usr/local/bin or ~/.cargo/bin).

# Example for macOS/Linux
mv /path/to/downloaded/protoc-gen-fake /usr/local/bin/

Ensure the chosen directory is in your $PATH. If not, you might need to add it to your shell's configuration file (e.g., .bashrc, .zshrc, or .profile):

export PATH="/usr/local/bin:$PATH"
Via Cargo (For Rust developers)
You can install protoc-gen-fake directly from the repository using Cargo:
cargo install --git https://github.com/mike-williamson/protoc-gen-fake.git
Or if you have cloned the repository locally:
cargo install --path .
This will compile the plugin from source and place the executable in ~/.cargo/bin/, which should already be in your $PATH.
From Source
If you prefer to build the plugin yourself:
- Clone the repository:

git clone https://github.com/mike-williamson/protoc-gen-fake.git
cd protoc-gen-fake

- Build in release mode:

cargo build --release

This will create an executable named protoc-gen-fake in the target/release/ directory.

- Copy the executable to your $PATH:

You can copy the binary to a directory that is already in your $PATH, such as ~/.cargo/bin/ (or /usr/local/bin):

cp target/release/protoc-gen-fake ~/.cargo/bin/
Via Homebrew (For macOS & Linux users)
You can install protoc-gen-fake directly from this repository by tapping it:
brew tap lazarillo/protoc-gen-fake https://github.com/lazarillo/protoc-gen-fake
brew install protoc-gen-fake
Once installed, you can invoke protoc-gen-fake directly via protoc.
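protoc resolves a plugin for --fake_out by looking for an executable named protoc-gen-fake on $PATH, so a quick check after any of the installation routes above is to confirm that both tools resolve. The following is a minimal sketch using only the Python standard library; the helper name find_tools is hypothetical and not part of this project.

```python
import shutil
from typing import Dict, Optional


def find_tools(names=("protoc", "protoc-gen-fake")) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, location in find_tools().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

If either entry is reported as missing, revisit the PATH step for the installation method you used.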
Usage
CLI Utility Flags
While protoc-gen-fake is primarily used as a plugin, the executable also supports standard CLI flags for version checking and help:
- --version / -V: Prints the current version of the plugin.
- --help / -h: Prints usage information.
protoc-gen-fake --version
protoc-gen-fake --help
Direct protoc Calls
When using protoc-gen-fake directly with protoc, you need to specify where the generated fake data is written. For binary output, protoc does not pass the output directory to the plugin, so you must provide the output file path explicitly using the --fake_opt output_path parameter.
protoc --proto_path=proto \
--fake_out=. \
--fake_opt output_path=mike_data/full_customer.bin \
proto/examples/full_customer.proto
- --proto_path=proto: Specifies the directory where your .proto files are located.
- --fake_out=.: This tells protoc to invoke protoc-gen-fake. The value here (.) is largely ignored by protoc-gen-fake for binary output, but it's required by protoc.
- --fake_opt output_path=mike_data/full_customer.bin: This is the crucial part for binary output. It tells protoc-gen-fake where to write the generated binary fake data file.
- proto/examples/full_customer.proto: The .proto file for which to generate fake data.
To generate Base64 encoded output directly in the CodeGeneratorResponse (which protoc then writes to stdout or a file if --fake_out points to a file), you can specify encoding=Base64:
protoc --proto_path=proto \
--fake_out=base64_output.txt \
--fake_opt encoding=Base64 \
proto/examples/full_customer.proto
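If you end up with Base64 text and need the raw protobuf bytes, the conversion is a single decode step. The sketch below assumes the text is one standard-alphabet Base64 payload with nothing but surrounding whitespace around it; that layout is an assumption about the plugin's output, and the function names and file paths are illustrative.

```python
import base64
from pathlib import Path


def decode_base64_text(text: str) -> bytes:
    """Decode Base64 text into raw bytes, ignoring surrounding whitespace."""
    # validate=True rejects characters outside the standard Base64 alphabet.
    return base64.b64decode(text.strip(), validate=True)


def convert_file(src: str, dst: str) -> int:
    """Read Base64 text from src, write the decoded bytes to dst, return the byte count."""
    raw = decode_base64_text(Path(src).read_text(encoding="ascii"))
    Path(dst).write_bytes(raw)
    return len(raw)
```

For example, convert_file("base64_output.txt", "full_customer.bin") would produce a file that can be parsed like any binary protobuf message, provided the assumption about the input layout holds.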
gen_fake Wrapper Script
To simplify the command-line usage and avoid specifying the output path twice for binary output, you can use the provided gen_fake.sh wrapper script. This script handles the internal translation of a single --out_path argument into the necessary protoc and --fake_opt parameters.
./gen_fake.sh --out_path=mike_data/full_customer.bin proto/examples/full_customer.proto
- --out_path=mike_data/full_customer.bin: Specifies the desired output path for the generated binary fake data.
- proto/examples/full_customer.proto: The .proto file for which to generate fake data.
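The translation the wrapper is described as performing can be pictured as expanding one output path into the protoc arguments from the Direct protoc Calls section. This is a rough Python equivalent, not the contents of gen_fake.sh; the function name build_protoc_args and the default proto_path of "proto" are assumptions taken from the examples above.

```python
import shlex
from typing import List


def build_protoc_args(out_path: str, proto_file: str, proto_path: str = "proto") -> List[str]:
    """Expand one output path into the protoc arguments shown in this section."""
    return [
        "protoc",
        f"--proto_path={proto_path}",
        "--fake_out=.",  # required by protoc; largely ignored for binary output
        "--fake_opt", f"output_path={out_path}",
        proto_file,
    ]


if __name__ == "__main__":
    args = build_protoc_args("mike_data/full_customer.bin", "proto/examples/full_customer.proto")
    # Print the command; pass `args` to subprocess.run(args, check=True) to execute it.
    print(shlex.join(args))
```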
Generating Language-Specific Protobuf Files
Before you can utilize the fake data in your application, you often need to generate language-specific protobuf classes from your .proto definitions. Here are examples for Python and Go.
Generating Python Code
protoc --proto_path=proto \
--python_out=examples \
proto/examples/full_customer.proto \
proto/gen_fake/fake_field.proto
--python_out=examples: Specifies the output directory for the generated Python protobuf files.
Generating Go Code
(Note: This requires protoc-gen-go to be installed and in your $PATH)
protoc --proto_path=proto \
--go_out=examples \
--go_opt=paths=source_relative \
proto/examples/full_customer.proto \
proto/gen_fake/fake_field.proto
- --go_out=examples: Specifies the output directory for the generated Go protobuf files.
- --go_opt=paths=source_relative: Ensures that generated Go files have correct import paths.
Utilizing Generated Fake Data
Once you have generated both the fake data (binary) and the language-specific protobuf classes, you can load and use the fake data in your application.
Using Fake Data in Python
Assuming you have generated full_customer_pb2.py into the examples directory and full_customer.bin into mike_data/:
import sys
sys.path.append('examples')  # Add the directory where protobuf files are generated

from full_customer_pb2 import FullCustomer


def load_fake_customer(file_path):
    # Read the binary fake data and parse it into a FullCustomer message.
    with open(file_path, 'rb') as f:
        data = f.read()
    customer = FullCustomer()
    customer.ParseFromString(data)
    return customer


if __name__ == '__main__':
    fake_customer = load_fake_customer('mike_data/full_customer.bin')
    print(fake_customer)
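Before generating language-specific classes, it can be handy to sanity-check that the binary file is well-formed protobuf. The sketch below walks the top-level fields of a message using only the standard protobuf wire format (varint keys, wire types 0, 1, 2, and 5) and no generated code. It is an illustration, not part of protoc-gen-fake; the function names are hypothetical, and the deprecated group wire types (3 and 4) are deliberately not handled.

```python
from typing import Iterator, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def iter_fields(buf: bytes) -> Iterator[Tuple[int, int, Union[int, bytes]]]:
    """Yield (field_number, wire_type, value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, bytes, nested messages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value
```

Reading mike_data/full_customer.bin in binary mode and iterating over iter_fields(data) lists the field numbers present, which can be compared against the field numbers in the .proto file.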
Using Fake Data in Go
Assuming you have generated full_customer.pb.go into the examples directory and full_customer.bin into mike_data/:
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"

	"google.golang.org/protobuf/proto"

	examples "your_module_path/examples" // Replace with your actual module path
)

func main() {
	filePath := filepath.Join("mike_data", "full_customer.bin")
	// os.ReadFile replaces the deprecated ioutil.ReadFile (Go 1.16+).
	data, err := os.ReadFile(filePath)
	if err != nil {
		log.Fatalf("Failed to read fake data file: %v", err)
	}
	customer := &examples.FullCustomer{} // Assuming FullCustomer is in your examples package
	if err := proto.Unmarshal(data, customer); err != nil {
		log.Fatalf("Failed to unmarshal fake data: %v", err)
	}
	fmt.Printf("Loaded Fake Customer: %+v\n", customer)
}
Configuration
protoc-gen-fake uses custom protobuf options to configure fake data generation for messages and fields.
Message-Level Options
- option (gen_fake.fake_msg).include = true;

  Purpose: By default, protoc-gen-fake will only generate data for the first top-level message defined in a .proto file. To explicitly include a specific message for fake data generation (especially useful if you have multiple top-level messages and want to target one, or if it's a nested message you want to ensure is generated as a root object), set this option to true. This effectively marks the message as an "entry point" for generation.

  message MyMessage {
    option (gen_fake.fake_msg).include = true;
    string id = 1 [(gen_fake.fake_data).data_type = "UUID"];
  }
Field-Level Options ((gen_fake.fake_data) = {...})
These options are applied to individual fields within a message to control the type and characteristics of the fake data generated.
- Enabling Default Fake Data: [(gen_fake.fake_data) = {}]

  Purpose: To opt a field in to default fake data generation. If no specific fake data type (like email, uuid, etc.) is provided, the plugin will attempt to generate a sensible default based on the field's protobuf type (e.g., a random string for string, a random number for int32).

  message User {
    string name = 1 [(gen_fake.fake_data) = {}]; // Generates a default fake string
  }
- Specifying Fake Data Types:

  You can specify a wide range of fake data types using the options within (gen_fake.fake_data).

  Example: Email and UUID

  message User {
    string email = 1 [(gen_fake.fake_data).data_type = "SafeEmail"];
    string id = 2 [(gen_fake.fake_data).data_type = "UUID"];
  }
- Controlling Optional Field Generation: [(gen_fake.fake_data).min_count = 1]

  Purpose: By default, protoc-gen-fake may or may not generate a value for an optional field. Setting min_count = 1 ensures that the optional field is always generated.

  message Product {
    optional string description = 1 [(gen_fake.fake_data).min_count = 1]; // Always generate a description
  }
- Nested Messages:

  protoc-gen-fake automatically recurses into nested messages. You apply field-level options to fields within the nested messages as usual.

  message Address {
    string street = 1 [(gen_fake.fake_data).data_type = "StreetName"];
    string city = 2 [(gen_fake.fake_data).data_type = "CityName"];
  }

  message Customer {
    string name = 1 [(gen_fake.fake_data).data_type = "Name"];
    Address home_address = 2 [(gen_fake.fake_data) = {}]; // Generates a fake Address
  }

  In the example above, home_address will have a fake Address generated for it, with its street and city fields populated according to their respective fake_data options.
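To make the field-level behavior concrete, here is a toy model of two of the rules described above: an optional field is sometimes skipped unless min_count = 1, and nested messages are filled in recursively. This is only a sketch under stated assumptions and not how protoc-gen-fake is implemented. The dict-based schema, the PROVIDERS table with its sample values, and the generate function are all hypothetical stand-ins.

```python
import random
import uuid

# Hypothetical stand-ins for data providers; the plugin's real providers differ.
PROVIDERS = {
    "UUID": lambda rng: str(uuid.UUID(int=rng.getrandbits(128), version=4)),
    "Name": lambda rng: rng.choice(["Ada Park", "Sam Ortiz", "Lee Novak"]),
    "CityName": lambda rng: rng.choice(["Springfield", "Riverton", "Lakeside"]),
}


def generate(schema: dict, rng: random.Random) -> dict:
    """Build one fake message (as a dict) from a toy schema description."""
    message = {}
    for name, field in schema.items():
        if field.get("optional"):
            # Treat presence as a count of 0 or 1; min_count = 1 forces presence.
            if rng.randint(field.get("min_count", 0), 1) == 0:
                continue
        if "message" in field:
            message[name] = generate(field["message"], rng)  # recurse into nested message
        else:
            message[name] = PROVIDERS[field["data_type"]](rng)
    return message


ADDRESS = {"city": {"data_type": "CityName"}}
CUSTOMER = {
    "id": {"data_type": "UUID"},
    "name": {"data_type": "Name"},
    "nickname": {"data_type": "Name", "optional": True, "min_count": 1},
    "home_address": {"message": ADDRESS},
}

if __name__ == "__main__":
    print(generate(CUSTOMER, random.Random(7)))
```

In this model, removing "min_count": 1 from nickname makes the field appear in only some of the generated messages, which mirrors the documented default for optional fields.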
Internationalization (i18n)
protoc-gen-fake supports generating localized fake data by specifying a language for specific fields using the language option.
message UserProfile {
  string given_name = 1 [(gen_fake.fake_data).data_type = "FirstName", (gen_fake.fake_data).language = "ZH_TW"];
  string address_street = 2 [(gen_fake.fake_data).data_type = "StreetName", (gen_fake.fake_data).language = "AR_SA"];
}

- language = "ZH_TW": Generates a given name in Traditional Chinese.
- language = "AR_SA": Generates a street name in Arabic (Saudi Arabia).
Refer to the fake crate's documentation for a full list of supported locales and data providers.
Contributing
Contributions are welcome! Please feel free to open issues or submit pull requests.