12 releases (5 breaking)
| Version | Released |
|---|---|
| 0.13.0 (new) | Mar 29, 2025 |
| 0.11.4 | Mar 26, 2025 |
| 0.10.2 | Mar 15, 2025 |
| 0.9.0 | Mar 2, 2025 |
| 0.1.2 | Jan 25, 2025 |
Lambda OTel Lite
The `lambda-otel-lite` crate provides a lightweight, efficient OpenTelemetry implementation designed specifically for AWS Lambda environments. It features a custom span processor and an internal extension mechanism that optimize telemetry collection for Lambda's unique execution model.
By leveraging Lambda's execution lifecycle and providing multiple processing modes, this crate enables efficient telemetry collection with minimal impact on function latency. By default, it uses the `otlp-stdout-span-exporter` to export spans to stdout for the serverless-otlp-forwarder project.
[!IMPORTANT] This crate is highly experimental and should not be used in production. Contributions are welcome.
Table of Contents
- Requirements
- Features
- Architecture and Modules
- Installation
- Quick Start
- Processing Modes
- Telemetry Configuration
- Event Extractors
- Environment Variables
- License
- See Also
Requirements
- Rust 1.70+
- AWS Lambda Rust Runtime
- OpenTelemetry packages (automatically included as dependencies)
Features
- Flexible Processing Modes: Support for synchronous, asynchronous, and custom export strategies
- Automatic Resource Detection: Automatic extraction of Lambda environment attributes
- Lambda Extension Integration: Built-in extension for efficient telemetry export
- Efficient Memory Usage: Fixed-size queue to prevent memory growth
- AWS Event Support: Automatic extraction of attributes from common AWS event types
- Flexible Context Propagation: Support for W3C Trace Context, AWS X-Ray, and custom propagators
Architecture and Modules
- `telemetry`: Core initialization and configuration
  - Main entry point via `init_telemetry`
  - Configures global tracer and span processors
  - Returns a `TelemetryCompletionHandler` for span lifecycle management
- `processor`: Lambda-optimized span processor
  - Fixed-size queue implementation
  - Multiple processing modes
  - Coordinates with extension for async export
- `extension`: Lambda Extension implementation
  - Manages extension lifecycle and registration
  - Handles span export coordination
  - Implements graceful shutdown
- `resource`: Resource attribute management
  - Automatic Lambda attribute detection
  - Environment-based configuration
  - Custom attribute support
- `constants`: Centralized configuration constants
  - Environment variable names
  - Default values
  - Resource attribute keys
  - Ensures consistency across the codebase
- `extractors`: Event processing
  - Built-in support for API Gateway and ALB events
  - Extensible trait system for custom events
  - W3C Trace Context and AWS X-Ray propagation
- `layer`: Tower middleware integration
  - Best for complex services with middleware chains
  - Integrates with Tower's service ecosystem
  - Standardized instrumentation across services
- `handler`: Direct function wrapper
  - Provides `create_traced_handler` function to wrap Lambda handlers
  - Automatically tracks cold starts using the `faas.cold_start` attribute
  - Extracts and propagates trace context from event carriers
  - Manages span lifecycle with automatic status handling for HTTP responses
  - Records exceptions in spans with appropriate status codes
  - Properly completes telemetry processing on handler completion
  - Supports reuse of handler functions with efficient boxing strategy
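To make the fixed-size queue idea from the `processor` module concrete, here is a minimal sketch. It is not the crate's implementation: `BoundedSpanQueue` and its methods are hypothetical names, plain strings stand in for span data, and the policy of dropping new spans when the queue is full is an assumption made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded queue; NOT the crate's actual type.
struct BoundedSpanQueue {
    spans: VecDeque<String>,
    capacity: usize,
    dropped: usize,
}

impl BoundedSpanQueue {
    fn new(capacity: usize) -> Self {
        Self {
            spans: VecDeque::with_capacity(capacity),
            capacity,
            dropped: 0,
        }
    }

    /// Queue a span. When the queue is full the span is dropped and counted
    /// (assumed policy), so memory use never grows past `capacity`.
    fn push(&mut self, span: String) -> bool {
        if self.spans.len() >= self.capacity {
            self.dropped += 1;
            return false;
        }
        self.spans.push_back(span);
        true
    }

    /// Remove up to `batch_size` spans in FIFO order for export.
    fn drain_batch(&mut self, batch_size: usize) -> Vec<String> {
        let n = batch_size.min(self.spans.len());
        self.spans.drain(..n).collect()
    }
}

fn main() {
    let mut queue = BoundedSpanQueue::new(2);
    queue.push("span-a".to_string());
    queue.push("span-b".to_string());
    let accepted = queue.push("span-c".to_string());
    println!("third span accepted: {accepted}, dropped: {}", queue.dropped);
    println!("batch: {:?}", queue.drain_batch(10));
}
```

The trade-off in this sketch is that a burst of spans larger than the capacity loses data, in exchange for a hard upper bound on memory.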
Installation
Add the crate to your project:
cargo add lambda-otel-lite
Quick Start
use aws_lambda_events::apigw::{ApiGatewayV2httpRequest, ApiGatewayV2httpResponse};
use http::header::HeaderMap;
use lambda_otel_lite::{create_traced_handler, init_telemetry, TelemetryConfig};
use lambda_runtime::{service_fn, Error, LambdaEvent, Runtime};
use serde_json::{json, Value};
use tracing_opentelemetry::OpenTelemetrySpanExt;

// Business logic function
async fn process_user(user_id: &str) -> Result<Value, Error> {
    // Your business logic here
    Ok(json!({
        "name": "User Name",
        "id": user_id
    }))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    // Initialize telemetry with default configuration
    let (_tracer, completion_handler) = init_telemetry(TelemetryConfig::default()).await?;

    // Create a traced handler function
    let handler = create_traced_handler(
        "my-api-handler",
        completion_handler,
        handler_function,
    );

    // Run the Lambda runtime with our handler
    Runtime::new(service_fn(handler)).run().await
}

// Define the handler function
async fn handler_function(
    event: LambdaEvent<ApiGatewayV2httpRequest>,
) -> Result<ApiGatewayV2httpResponse, Error> {
    // Get the current span and add custom attributes
    let span = tracing::Span::current();
    span.set_attribute("handler.version", "1.0");

    // Extract request information
    let request = event.payload;

    // Extract userId from path parameters
    let user_id = request
        .path_parameters
        .get("userId")
        .cloned()
        .unwrap_or_else(|| "unknown".to_string());

    // Add user ID to the same span
    span.set_attribute("user.id", user_id.clone());

    // Process the request
    let response = match process_user(&user_id).await {
        Ok(user) => {
            // Create success response
            let body = json!({
                "success": true,
                "data": user
            })
            .to_string();
            ApiGatewayV2httpResponse {
                status_code: 200,
                headers: HeaderMap::new(),
                body: Some(body.into()),
                ..Default::default()
            }
        }
        Err(_error) => {
            // Simple error handling
            let body = json!({
                "success": false,
                "error": "Internal server error"
            })
            .to_string();
            ApiGatewayV2httpResponse {
                status_code: 500,
                headers: HeaderMap::new(),
                body: Some(body.into()),
                ..Default::default()
            }
        }
    };

    Ok(response)
}
Processing Modes
The crate supports three processing modes for span export:
- Sync Mode (default):
  - Direct, synchronous export in handler thread
  - Recommended for:
    - low-volume telemetry
    - limited resources (memory, CPU)
    - when latency is not critical
  - Set via `LAMBDA_EXTENSION_SPAN_PROCESSOR_MODE=sync`
- Async Mode:
  - Export via Lambda extension using AWS Lambda Extensions API
  - Spans are queued and exported after handler completion
  - Uses channel-based communication between handler and extension
  - Registers specifically for Lambda INVOKE events
  - Implements graceful shutdown with SIGTERM handling
  - Error handling for:
    - Channel communication failures
    - Export failures
    - Extension registration issues
  - Best for production use with high telemetry volume
  - Set via `LAMBDA_EXTENSION_SPAN_PROCESSOR_MODE=async`
- Finalize Mode:
  - Registers extension with no events
  - Maintains SIGTERM handler for graceful shutdown
  - Ensures all spans are flushed during shutdown
  - Compatible with BatchSpanProcessor for custom export strategies
  - Best for specialized export requirements where you need full control
  - Set via `LAMBDA_EXTENSION_SPAN_PROCESSOR_MODE=finalize`
Async Processing Mode Architecture
The async mode leverages Lambda's extension API to optimize perceived latency by deferring span export until after the response is sent to the user:
sequenceDiagram
participant Lambda Runtime
participant Extension Thread
participant Handler
participant LambdaSpanProcessor
participant OTLPStdoutSpanExporter
Note over Extension Thread: Initialization
Extension Thread->>Lambda Runtime: Register extension (POST /register)
Lambda Runtime-->>Extension Thread: Extension ID
Extension Thread->>Lambda Runtime: Get next event (GET /next)
Note over Handler: Function Invocation
Handler->>LambdaSpanProcessor: Create & queue spans during execution
Note over LambdaSpanProcessor: Spans stored in fixed-size queue
Handler->>Extension Thread: Send completion signal
Note over Handler: Handler returns response
Extension Thread->>LambdaSpanProcessor: Flush spans
LambdaSpanProcessor->>OTLPStdoutSpanExporter: Export batched spans
Extension Thread->>Lambda Runtime: Get next event (GET /next)
Note over Extension Thread: On SIGTERM
Lambda Runtime->>Extension Thread: SHUTDOWN event
Extension Thread->>LambdaSpanProcessor: Force flush remaining spans
LambdaSpanProcessor->>OTLPStdoutSpanExporter: Export remaining spans
The diagram above shows the core coordination between components:
- Extension thread registers with the Lambda Runtime and awaits events
- Handler creates and queues spans during execution via LambdaSpanProcessor
- Handler signals completion to the extension thread before returning
- Extension thread processes and exports queued spans after handler completes
- Extension thread returns to waiting for the next event
- On shutdown (SIGTERM), remaining spans are flushed and exported
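The coordination steps above can be imitated with standard-library threads and channels. This is only a sketch of the pattern, not the crate's code: `run_invocations` is a hypothetical function, strings stand in for spans, a `Mutex<Vec<_>>` stands in for the span queue, and a second channel plays the role of the extension asking the Lambda Runtime for the next event.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Simulate `invocations` handler runs and return the batches that the
/// "extension" thread exported, one batch per invocation.
fn run_invocations(invocations: usize) -> Vec<Vec<String>> {
    let queue: Arc<Mutex<Vec<String>>> = Arc::new(Mutex::new(Vec::new()));
    let (done_tx, done_rx) = mpsc::channel::<()>(); // handler -> extension
    let (ready_tx, ready_rx) = mpsc::channel::<()>(); // extension -> "runtime"

    let ext_queue = Arc::clone(&queue);
    let extension = thread::spawn(move || {
        let mut exported = Vec::new();
        // recv() fails once the sender is dropped, which models shutdown.
        while done_rx.recv().is_ok() {
            // Flush everything the handler queued during this invocation.
            let batch: Vec<String> = ext_queue.lock().unwrap().drain(..).collect();
            exported.push(batch);
            // Signal readiness for the next event.
            let _ = ready_tx.send(());
        }
        exported
    });

    for i in 0..invocations {
        {
            // Handler: create and queue spans while doing its work.
            let mut q = queue.lock().unwrap();
            q.push(format!("invocation-{i}/handler"));
            q.push(format!("invocation-{i}/db-call"));
        }
        // Handler: signal completion, then return its response.
        done_tx.send(()).unwrap();
        // The next invocation starts only after the extension is ready again.
        ready_rx.recv().unwrap();
    }

    drop(done_tx); // shutdown
    extension.join().unwrap()
}

fn main() {
    for (i, batch) in run_invocations(2).iter().enumerate() {
        println!("export after invocation {i}: {batch:?}");
    }
}
```

Because the export happens on the other thread after the completion signal, the handler's own work never waits on the exporter, which is the latency benefit the async mode aims for.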
Telemetry Configuration
The crate provides several ways to configure the OpenTelemetry tracing pipeline, which is a required first step to instrument your Lambda function:
Custom configuration with custom resource attributes:
use lambda_otel_lite::{init_telemetry, TelemetryConfig};
use lambda_runtime::Error;
use opentelemetry::KeyValue;
use opentelemetry_sdk::Resource;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let resource = Resource::builder()
        .with_attributes(vec![
            KeyValue::new("service.version", "1.0.0"),
            KeyValue::new("deployment.environment", "production"),
        ])
        .build();

    let config = TelemetryConfig::builder()
        .resource(resource)
        .build();

    let (_, completion_handler) = init_telemetry(config).await?;
    // Use the tracer and completion handler as usual
    Ok(())
}
Custom configuration with custom span processors:
use lambda_otel_lite::{init_telemetry, TelemetryConfig};
use lambda_runtime::Error;
use opentelemetry_sdk::trace::SimpleSpanProcessor;
use otlp_stdout_span_exporter::OtlpStdoutSpanExporter;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = TelemetryConfig::builder()
        .with_span_processor(SimpleSpanProcessor::new(
            Box::new(OtlpStdoutSpanExporter::default())
        ))
        .enable_fmt_layer(true)
        .build();

    let (_, completion_handler) = init_telemetry(config).await?;
    Ok(())
}
Note that the `.with_span_processor` method accepts any type that implements the `SpanProcessor` trait and can be called multiple times. Processors are registered in the order of the calls to `.with_span_processor`.
Custom configuration with context propagators:
use lambda_otel_lite::{init_telemetry, TelemetryConfig, propagation::LambdaXrayPropagator};
use lambda_runtime::Error;
use opentelemetry_aws::trace::XrayPropagator;
use opentelemetry_sdk::propagation::{BaggagePropagator, TraceContextPropagator};

#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = TelemetryConfig::builder()
        // Add W3C Trace Context propagator (default)
        .with_propagator(TraceContextPropagator::new())
        // Add AWS X-Ray propagator
        .with_propagator(XrayPropagator::new())
        // Add Lambda-enhanced X-Ray propagator (with _X_AMZN_TRACE_ID environment variable support)
        .with_propagator(LambdaXrayPropagator::new())
        // Add W3C Baggage propagator
        .with_propagator(BaggagePropagator::new())
        .build();

    let (_, completion_handler) = init_telemetry(config).await?;
    // Use the tracer and completion handler as usual
    Ok(())
}
By default, the crate combines two propagators: W3C Trace Context (`TraceContextPropagator`) and the Lambda-specific X-Ray propagator (`LambdaXrayPropagator`), providing out-of-the-box support for both industry-standard tracing and AWS-specific tracing. You can add additional propagators using the `with_propagator` method, or use `with_named_propagator` with the following options:
- "tracecontext": W3C Trace Context propagator
- "xray": Standard AWS X-Ray propagator
- "xray-lambda": Enhanced X-Ray propagator with Lambda environment variable support
- "none": No propagation (disables context propagation)
Multiple propagators are combined into a composite propagator that can handle various trace context formats.
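As a rough illustration of what handling several trace context formats means, the sketch below parses a W3C `traceparent` value and an X-Ray `x-amzn-trace-id` value from the same kind of carrier map. Everything here is hypothetical (`TraceIds`, `from_traceparent`, `from_xray`, `extract_composite`); it does not use or mirror the OpenTelemetry propagator API, and stopping at the first format that parses is a simplification that may differ from how a real composite propagator resolves multiple headers.

```rust
use std::collections::HashMap;

/// Trace identifiers pulled out of a carrier (hypothetical type).
#[derive(Debug, PartialEq)]
struct TraceIds {
    trace_id: String,
    parent_span_id: String,
}

/// W3C format: <version>-<32 hex trace id>-<16 hex parent id>-<flags>
fn from_traceparent(carrier: &HashMap<String, String>) -> Option<TraceIds> {
    let value = carrier.get("traceparent")?;
    let parts: Vec<&str> = value.split('-').collect();
    if parts.len() != 4 || parts[1].len() != 32 || parts[2].len() != 16 {
        return None;
    }
    Some(TraceIds {
        trace_id: parts[1].to_string(),
        parent_span_id: parts[2].to_string(),
    })
}

/// X-Ray format: Root=1-<8 hex>-<24 hex>;Parent=<16 hex>;Sampled=<0|1>
fn from_xray(carrier: &HashMap<String, String>) -> Option<TraceIds> {
    let value = carrier.get("x-amzn-trace-id")?;
    let mut root = None;
    let mut parent = None;
    for field in value.split(';') {
        match field.trim().split_once('=') {
            Some(("Root", v)) => root = Some(v),
            Some(("Parent", v)) => parent = Some(v),
            _ => {}
        }
    }
    // Joining the epoch and random parts of Root gives a 32-hex trace id.
    let mut root_parts = root?.splitn(3, '-');
    let _version = root_parts.next()?;
    let epoch = root_parts.next()?;
    let random = root_parts.next()?;
    Some(TraceIds {
        trace_id: format!("{epoch}{random}"),
        parent_span_id: parent?.to_string(),
    })
}

/// Try each format in order and return the first one that parses
/// (a simplification chosen for this sketch).
fn extract_composite(carrier: &HashMap<String, String>) -> Option<TraceIds> {
    let extractors: [fn(&HashMap<String, String>) -> Option<TraceIds>; 2] =
        [from_traceparent, from_xray];
    extractors.iter().find_map(|extract| extract(carrier))
}

fn main() {
    let mut carrier = HashMap::new();
    carrier.insert(
        "x-amzn-trace-id".to_string(),
        "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1".to_string(),
    );
    println!("{:?}", extract_composite(&carrier));
}
```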
Custom configuration with ID generator:
use lambda_otel_lite::{init_telemetry, TelemetryConfig};
use lambda_runtime::Error;
use opentelemetry_aws::trace::XrayIdGenerator;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = TelemetryConfig::builder()
        // Use AWS X-Ray compatible ID generator for trace and span IDs
        .with_id_generator(XrayIdGenerator::default())
        .build();

    let (_, completion_handler) = init_telemetry(config).await?;
    // Use the tracer and completion handler as usual
    Ok(())
}
By default, OpenTelemetry uses a random ID generator that creates W3C-compatible trace and span IDs. The `with_id_generator` method allows you to customize the ID generation strategy. This is particularly useful when integrating with AWS X-Ray, which requires a specific ID format.
To use the X-Ray ID generator, you'll need to add the `opentelemetry-aws` crate to your dependencies:
[dependencies]
opentelemetry-aws = "0.16.0"
The XrayIdGenerator formats trace IDs in a way that's compatible with AWS X-Ray, using a timestamp in the first part of the trace ID. This allows X-Ray to display and organize traces correctly, and enables correlation between OpenTelemetry traces and traces from other services that use X-Ray.
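The timestamp-first layout can be shown with plain integer arithmetic. The sketch below is not the `XrayIdGenerator` implementation: `xray_style_trace_id` and `to_xray_format` are hypothetical helpers, and a fixed constant stands in for the random part so the example needs no extra dependencies.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Build a 128-bit trace ID whose top 32 bits are epoch seconds.
/// Only the low 96 bits of `random_96` are used.
fn xray_style_trace_id(epoch_secs: u32, random_96: u128) -> u128 {
    let mask = (1u128 << 96) - 1;
    ((epoch_secs as u128) << 96) | (random_96 & mask)
}

/// Render the ID in the X-Ray textual form: 1-<8 hex>-<24 hex>.
fn to_xray_format(trace_id: u128) -> String {
    let hex = format!("{trace_id:032x}");
    format!("1-{}-{}", &hex[..8], &hex[8..])
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_secs() as u32;
    // A fixed value stands in for a real random source (assumption for the sketch).
    let id = xray_style_trace_id(now, 0xbd862e3fe1be46a994272793);
    println!("{}", to_xray_format(id));
}
```

The same 32 hex characters remain a valid W3C trace ID, which is why one ID can serve both systems.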
Using the Tower Layer
You can "wrap" your handler in the `OtelTracingLayer` using the `ServiceBuilder` from the `tower` crate:
use aws_lambda_events::event::apigw::ApiGatewayV2httpRequest;
use lambda_otel_lite::{init_telemetry, TelemetryConfig, OtelTracingLayer};
use lambda_runtime::tower::ServiceBuilder;
use lambda_runtime::{Error, LambdaEvent, Runtime};
use serde_json::Value;

async fn handler(event: LambdaEvent<ApiGatewayV2httpRequest>) -> Result<Value, Error> {
    Ok(serde_json::json!({
        "statusCode": 200,
        "body": format!("Hello from request {}", event.context.request_id)
    }))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    // Initialize telemetry with default configuration
    let (_, completion_handler) = init_telemetry(TelemetryConfig::default()).await?;

    // Build service with OpenTelemetry tracing middleware
    let service = ServiceBuilder::new()
        .layer(OtelTracingLayer::new(completion_handler).with_name("tower-handler"))
        .service_fn(handler);

    // Create and run the Lambda runtime
    Runtime::new(service).run().await
}
Using the handler wrapper function
Or, you can use the `create_traced_handler` function to wrap your handler:
use aws_lambda_events::event::apigw::ApiGatewayV2httpRequest;
use lambda_otel_lite::{init_telemetry, TelemetryConfig, create_traced_handler};
use lambda_runtime::{service_fn, Error, LambdaEvent, Runtime};
use serde_json::Value;

async fn handler(event: LambdaEvent<ApiGatewayV2httpRequest>) -> Result<Value, Error> {
    Ok(serde_json::json!({ "statusCode": 200 }))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    let (_, completion_handler) = init_telemetry(TelemetryConfig::default()).await?;

    let handler = create_traced_handler(
        "my-handler",
        completion_handler,
        handler,
    );

    Runtime::new(service_fn(handler)).run().await
}
Library specific Resource Attributes
The crate adds several resource attributes under the `lambda_otel_lite` namespace to provide configuration visibility:
- `lambda_otel_lite.extension.span_processor_mode`: Current processing mode (`sync`, `async`, or `finalize`)
- `lambda_otel_lite.lambda_span_processor.queue_size`: Maximum number of spans that can be queued
- `lambda_otel_lite.lambda_span_processor.batch_size`: Maximum batch size for span export
- `lambda_otel_lite.otlp_stdout_span_exporter.compression_level`: GZIP compression level used for span export
These attributes are automatically added to the resource and can be used to understand the telemetry configuration in your observability backend.
Event Extractors
Event extractors are responsible for extracting span attributes and context from Lambda event and context objects. The crate provides built-in extractors for common Lambda triggers.
Automatic Attributes extraction
The crate automatically sets relevant FaaS attributes based on the Lambda context and event:
- Resource Attributes (set at initialization):
  - `cloud.provider`: "aws"
  - `cloud.region`: from `AWS_REGION`
  - `faas.name`: from `AWS_LAMBDA_FUNCTION_NAME`
  - `faas.version`: from `AWS_LAMBDA_FUNCTION_VERSION`
  - `faas.instance`: from `AWS_LAMBDA_LOG_STREAM_NAME`
  - `faas.max_memory`: from `AWS_LAMBDA_FUNCTION_MEMORY_SIZE`
  - `service.name`: from `OTEL_SERVICE_NAME` (defaults to function name)
  - Additional attributes from `OTEL_RESOURCE_ATTRIBUTES`
- Span Attributes (set per invocation):
  - `faas.cold_start`: true on first invocation
  - `cloud.account.id`: extracted from context's invokedFunctionArn
  - `faas.invocation_id`: from awsRequestId
  - `cloud.resource_id`: from context's invokedFunctionArn
- HTTP Attributes (set for API Gateway events):
  - `faas.trigger`: "http"
  - `http.status_code`: from handler response
  - `http.route`: from routeKey (v2) or resource (v1)
  - `http.method`: from requestContext (v2) or httpMethod (v1)
  - `http.target`: from path
  - `http.scheme`: from protocol
The crate automatically detects API Gateway v1 and v2 events and sets the appropriate HTTP attributes. For HTTP responses, the status code is automatically extracted from the handler's response and set as `http.status_code`. For 5xx responses, the span status is set to ERROR.
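Two of the derivations above are easy to picture in code: pulling the account ID out of the invoked function ARN, and mapping an HTTP status code to a span status. The helpers below are hypothetical and only sketch the idea; `SpanStatus` is a stand-in enum, and leaving non-5xx responses as `Unset` is an assumption, since the text only specifies the 5xx case.

```rust
/// Extract the account ID from a Lambda function ARN of the form
/// arn:aws:lambda:<region>:<account-id>:function:<name>
fn account_id_from_arn(arn: &str) -> Option<&str> {
    // The account ID is the fifth colon-separated field.
    let account = arn.split(':').nth(4)?;
    if account.is_empty() {
        None
    } else {
        Some(account)
    }
}

/// Stand-in for a span status (hypothetical, not an OpenTelemetry type).
#[derive(Debug, PartialEq)]
enum SpanStatus {
    Unset,
    Error,
}

/// 5xx responses mark the span as an error; everything else is left
/// unset (assumed behaviour for this sketch).
fn status_for_http(status_code: u16) -> SpanStatus {
    if (500..600).contains(&status_code) {
        SpanStatus::Error
    } else {
        SpanStatus::Unset
    }
}

fn main() {
    let arn = "arn:aws:lambda:us-east-1:123456789012:function:my-fn";
    println!("cloud.account.id = {:?}", account_id_from_arn(arn));
    println!("status for 503 = {:?}", status_for_http(503));
}
```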
Built-in Extractors
The crate provides built-in support for extracting span attributes from common AWS event types:
- API Gateway REST API (v1)
- API Gateway HTTP API (v2)
- Application Load Balancer (ALB)
Each extractor is designed to handle a specific event type and extract relevant attributes, including trace context propagation from HTTP headers (both W3C Trace Context and AWS X-Ray formats).
Custom Extractors
For events other than those directly supported by the crate, you can implement the `SpanAttributesExtractor` trait for your own event types:
use lambda_otel_lite::{init_telemetry, TelemetryConfig, create_traced_handler, SpanAttributes, SpanAttributesExtractor};
use lambda_runtime::{service_fn, Error, LambdaEvent, Runtime};
use opentelemetry::Value;
use serde::{Deserialize, Serialize};
use serde_json::Value as JsonValue;
use std::collections::HashMap;

// Define a custom event type
#[derive(Clone, Deserialize, Serialize)]
struct MyEvent {
    user_id: String,
    trace_parent: Option<String>,
    xray_trace_id: Option<String>,
}

// Implement SpanAttributesExtractor for the custom event
impl SpanAttributesExtractor for MyEvent {
    fn extract_span_attributes(&self) -> SpanAttributes {
        let mut attributes = HashMap::new();
        attributes.insert("user.id".to_string(), Value::String(self.user_id.clone().into()));

        // Add trace context if available
        let mut carrier = HashMap::new();

        // Add W3C Trace Context header
        if let Some(header) = &self.trace_parent {
            carrier.insert("traceparent".to_string(), header.clone());
        }

        // Add X-Ray trace header
        if let Some(header) = &self.xray_trace_id {
            carrier.insert("x-amzn-trace-id".to_string(), header.clone());
        }

        SpanAttributes::builder()
            .attributes(attributes)
            .carrier(carrier)
            .build()
    }
}

async fn handler(event: LambdaEvent<MyEvent>) -> Result<JsonValue, Error> {
    Ok(serde_json::json!({
        "statusCode": 200,
        "body": format!("Hello, user {}", event.payload.user_id)
    }))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = TelemetryConfig::default();
    let (_, completion_handler) = init_telemetry(config).await?;

    let handler = create_traced_handler(
        "my-handler",
        completion_handler,
        handler,
    );

    Runtime::new(service_fn(handler)).run().await
}
The `SpanAttributes` object returned by the extractor contains:
- `attributes`: A map of attributes to add to the span
- `carrier`: Optional map containing trace context headers for propagation (supports both W3C and X-Ray formats)
- `span_name`: Optional custom name for the span (defaults to handler name)
Handling Standard AWS Lambda Events
For standard AWS Lambda event types provided by the `aws-lambda-events` crate that don't have built-in extractors, you can create a newtype wrapper and implement the `SpanAttributesExtractor` trait for it. This approach is necessary due to Rust's orphan rule, which prevents implementing external traits for external types directly.
Here's an example for Kinesis events:
use aws_lambda_events::event::kinesis::KinesisEvent;
use lambda_otel_lite::{init_telemetry, TelemetryConfig, create_traced_handler, SpanAttributes, SpanAttributesExtractor};
use lambda_runtime::{service_fn, Error, LambdaEvent, Runtime};
use opentelemetry::Value;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;

// Create a newtype wrapper around KinesisEvent
#[derive(Clone, Debug, Serialize, Deserialize)]
struct KinesisEventWrapper(pub KinesisEvent);

// Implement SpanAttributesExtractor for the wrapper
impl SpanAttributesExtractor for KinesisEventWrapper {
    fn extract_span_attributes(&self) -> SpanAttributes {
        let mut attributes: HashMap<String, Value> = HashMap::new();
        let records = &self.0.records;

        // Add attributes from the Kinesis event
        attributes.insert(
            "forwarder.events.count".to_string(),
            Value::I64(records.len() as i64),
        );

        // Extract stream name from the first record if available
        if let Some(first_record) = records.first() {
            if let Some(event_source) = &first_record.event_source {
                attributes.insert(
                    "forwarder.stream.name".to_string(),
                    Value::String(event_source.clone().into()),
                );
            }
        }

        SpanAttributes::builder()
            .span_name("kinesis-processor".to_string())
            .attributes(attributes)
            .build()
    }
}

// Handler function that uses the wrapper
async fn function_handler(
    event: LambdaEvent<KinesisEventWrapper>,
) -> Result<(), Error> {
    // Process Kinesis records
    let records = &event.payload.0.records;
    // Your processing logic here
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    // Initialize telemetry
    let (_, completion_handler) = init_telemetry(TelemetryConfig::default()).await?;

    // Create traced handler with the wrapper
    let handler = create_traced_handler(
        "kinesis-processor",
        completion_handler,
        function_handler,
    );

    // Run the Lambda runtime
    Runtime::new(service_fn(handler)).run().await
}
This pattern can be applied to any event type from the `aws-lambda-events` crate, such as:
- SQS events
- SNS events
- DynamoDB events
- S3 events
- CloudWatch events
- And more
By creating a newtype wrapper, you can add custom span attributes specific to each event type while maintaining type safety and satisfying Rust's orphan rule.
Environment Variables
The library uses environment variables for configuration, with a clear precedence order:
- Environment variables (highest precedence)
- Constructor parameters
- Default values (lowest precedence)
Processing Configuration
- `LAMBDA_EXTENSION_SPAN_PROCESSOR_MODE`: Controls processing mode
  - "sync" for Sync mode (default)
  - "async" for Async mode
  - "finalize" for Finalize mode
- `LAMBDA_SPAN_PROCESSOR_QUEUE_SIZE`: Maximum spans to queue (default: 2048)
- `LAMBDA_SPAN_PROCESSOR_BATCH_SIZE`: Maximum batch size (default: 512)
You can also set the processor mode programmatically through the `TelemetryConfig`:
use lambda_otel_lite::{init_telemetry, TelemetryConfig, ProcessorMode};
use lambda_runtime::Error;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let config = TelemetryConfig::builder()
        .processor_mode(ProcessorMode::Async)
        .build();

    let (_, completion_handler) = init_telemetry(config).await?;
    // Use the tracer and completion handler as usual
    Ok(())
}
Note that the environment variable `LAMBDA_EXTENSION_SPAN_PROCESSOR_MODE` will always take precedence over the programmatic setting if both are specified.
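The precedence order (environment variable, then programmatic setting, then default) can be sketched as a small resolver. This is illustrative only: `Mode`, `parse_mode`, and `resolve_mode` are hypothetical names rather than the crate's API, and both the case-insensitive parsing and the choice to ignore an unrecognised value are assumptions of the sketch.

```rust
/// Hypothetical stand-in for the crate's processor mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Sync,
    Async,
    Finalize,
}

/// Parse a mode name; unknown values yield None (assumed behaviour).
fn parse_mode(raw: &str) -> Option<Mode> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "sync" => Some(Mode::Sync),
        "async" => Some(Mode::Async),
        "finalize" => Some(Mode::Finalize),
        _ => None,
    }
}

/// Environment value first, then the value set in code, then the default.
fn resolve_mode(env_value: Option<&str>, configured: Option<Mode>) -> Mode {
    env_value
        .and_then(parse_mode)
        .or(configured)
        .unwrap_or(Mode::Sync)
}

fn main() {
    let env_value = std::env::var("LAMBDA_EXTENSION_SPAN_PROCESSOR_MODE").ok();
    let mode = resolve_mode(env_value.as_deref(), Some(Mode::Async));
    println!("effective mode: {mode:?}");
}
```

Passing the environment value in as a parameter, rather than reading it inside `resolve_mode`, keeps the precedence logic easy to test.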
Resource Configuration
- `OTEL_SERVICE_NAME`: Service name for spans (falls back to `AWS_LAMBDA_FUNCTION_NAME`)
- `OTEL_RESOURCE_ATTRIBUTES`: Additional resource attributes in the format `key=value,key2=value2`
Resource attributes from environment variables are only included in the resource when the environment variable is explicitly set. This ensures that the reported resource attributes accurately reflect the actual configuration used.
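For reference, the `key=value,key2=value2` format can be split with a few lines of standard-library code. `parse_resource_attributes` is a hypothetical helper, not a function from this crate or from the OpenTelemetry SDK; it skips malformed entries and does not decode percent-encoded values, both of which are simplifications.

```rust
/// Parse "key=value,key2=value2" into ordered pairs.
/// Entries without '=' or with an empty key are skipped (assumed policy).
fn parse_resource_attributes(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|pair| {
            let (key, value) = pair.split_once('=')?;
            let key = key.trim();
            if key.is_empty() {
                return None;
            }
            Some((key.to_string(), value.trim().to_string()))
        })
        .collect()
}

fn main() {
    // Read the variable if present; an unset variable yields no attributes.
    let raw = std::env::var("OTEL_RESOURCE_ATTRIBUTES").unwrap_or_default();
    for (key, value) in parse_resource_attributes(&raw) {
        println!("{key} = {value}");
    }
}
```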
Export Configuration
- `OTLP_STDOUT_SPAN_EXPORTER_COMPRESSION_LEVEL`: GZIP compression level (0-9, default: 6)
Logging and Debug
- `RUST_LOG` or `AWS_LAMBDA_LOG_LEVEL`: Configure log levels
- `AWS_LAMBDA_LOG_FORMAT`: Set to "JSON" for JSON formatted logs
- `LAMBDA_TRACING_ENABLE_FMT_LAYER`: Enable console output of spans for debugging (default: false)
  - Takes precedence over code configuration when set
  - Setting to "true" enables console output even if disabled in code
  - Setting to "false" disables console output even if enabled in code
  - Only accepts exact string values "true" or "false" (case-insensitive)
  - Invalid values will log a warning and fall back to code configuration
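The rules for `LAMBDA_TRACING_ENABLE_FMT_LAYER` amount to a three-way decision, sketched below. `resolve_fmt_layer` is a hypothetical function written for illustration, and printing the warning with `eprintln!` is an assumption; the crate's own logging mechanism is not shown here.

```rust
/// Decide whether the fmt layer is enabled, given the raw environment
/// value (if set) and the value configured in code.
fn resolve_fmt_layer(env_value: Option<&str>, code_setting: bool) -> bool {
    let lowered = env_value.map(str::to_ascii_lowercase);
    match lowered.as_deref() {
        // The environment variable wins when it holds a valid value.
        Some("true") => true,
        Some("false") => false,
        // Anything else: warn and fall back to the code configuration.
        Some(other) => {
            eprintln!("warning: ignoring invalid LAMBDA_TRACING_ENABLE_FMT_LAYER value {other:?}");
            code_setting
        }
        // Variable not set: use the code configuration.
        None => code_setting,
    }
}

fn main() {
    let env_value = std::env::var("LAMBDA_TRACING_ENABLE_FMT_LAYER").ok();
    let enabled = resolve_fmt_layer(env_value.as_deref(), false);
    println!("fmt layer enabled: {enabled}");
}
```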
License
This project is licensed under the MIT License - see the LICENSE file for details.
See Also