4 releases
| Version | Released |
|---|---|
| 0.1.25 | Jan 1, 2026 |
| 0.1.23 | Dec 30, 2025 |
| 0.1.22 | Dec 30, 2025 |
| 0.1.0 | Nov 25, 2025 |
mecha10-diagnostics
Topic-based diagnostics and performance monitoring service for Mecha10.
Overview
The diagnostics service provides comprehensive performance monitoring through the framework's pub/sub system. All diagnostics are published to topics under the /diagnostics namespace, allowing real-time monitoring via CLI, dashboard, or custom subscribers.
Features
- Topic-based architecture: Seamlessly integrates with Mecha10's pub/sub system
- Streaming pipeline metrics: Frame pipeline, latency, encoding performance, bandwidth
- WebRTC metrics: Connection stats, quality metrics (RTT, packet loss, jitter)
- WebSocket metrics: Connection tracking, message rates
- Redis metrics: Connection pool, operation latency and throughput
- Docker metrics: Container CPU, memory, network, and I/O stats
- System metrics: Host CPU, memory, disk, and network usage
- Low overhead: Atomic counters on hot paths, background aggregation
Diagnostic Topics
All diagnostics use the /diagnostics namespace:
/diagnostics/streaming/pipeline # Frame pipeline metrics
/diagnostics/streaming/latency # Latency measurements
/diagnostics/streaming/encoding # Encoding performance
/diagnostics/streaming/bandwidth # Bandwidth usage
/diagnostics/webrtc/connections # WebRTC peer connections
/diagnostics/webrtc/quality # RTT, packet loss, jitter
/diagnostics/websocket/connections # WebSocket connections
/diagnostics/websocket/messages # WebSocket message stats
/diagnostics/redis/pool # Redis connection pool
/diagnostics/redis/operations # Redis operation metrics
/diagnostics/docker/containers # Docker container stats
/diagnostics/godot/performance # Godot FPS, frame time, physics
/diagnostics/godot/scene # Godot node count, memory
/diagnostics/godot/connection # Godot WebSocket health
/diagnostics/system/resources # System-wide resources
/diagnostics/node/{id}/health # Per-node health
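The topic layout above is regular enough to build and inspect with plain string handling. The following is a minimal sketch, assuming only the paths listed here; `node_health_topic` and `topic_category` are hypothetical helpers for illustration and are not part of the crate's API.

```rust
/// Root namespace shared by every diagnostic topic in the list above.
const DIAGNOSTICS_ROOT: &str = "/diagnostics";

/// Build the per-node health topic for a node id.
/// Hypothetical helper, not a function exported by mecha10-diagnostics.
fn node_health_topic(node_id: &str) -> String {
    format!("{}/node/{}/health", DIAGNOSTICS_ROOT, node_id)
}

/// Return the category segment of a diagnostics topic, for example
/// "streaming" for "/diagnostics/streaming/pipeline".
/// Returns None for topics outside the /diagnostics namespace.
fn topic_category(topic: &str) -> Option<&str> {
    let rest = topic.strip_prefix(DIAGNOSTICS_ROOT)?.strip_prefix('/')?;
    rest.split('/').next().filter(|segment| !segment.is_empty())
}

fn main() {
    let topic = node_health_topic("camera-1");
    println!("{topic}");
    println!("{:?}", topic_category("/diagnostics/streaming/pipeline"));
    println!("{:?}", topic_category("/telemetry/other"));
}
```

A custom subscriber could use a helper like `topic_category` to route messages by category without hard-coding every topic name.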
Usage
Publishing Streaming Diagnostics
use mecha10_diagnostics::prelude::*;
use mecha10_core::prelude::*;
// Create collector
let collector = StreamingCollector::new("simulation-bridge");
// Record metrics on hot path (minimal overhead)
collector.record_frame_received();
collector.record_frame_encoded(5000); // 5 ms encode time (5000 µs)
collector.record_frame_sent(10000); // 10 KB frame (10000 bytes)
// Publish aggregated metrics periodically (every 1-5 seconds)
collector.publish_all(&ctx, 400_000).await?;
Subscribing to Diagnostics
use mecha10_diagnostics::prelude::*;
use mecha10_core::prelude::*;
// Subscribe to streaming pipeline metrics
let mut rx = ctx.subscribe(
    Topic::<DiagnosticMessage<StreamingPipelineMetrics>>::new(
        TOPIC_DIAGNOSTICS_STREAMING_PIPELINE
    )
).await?;
while let Some(msg) = rx.recv().await {
    println!("FPS: {}, Frames dropped: {}",
        msg.payload.fps,
        msg.payload.frames_dropped
    );
}
Docker Metrics
// Create Docker collector
let docker = DockerCollector::new("diagnostics-service").await;
// Collect metrics for all containers
docker.collect_all_containers(&ctx).await?;
System Metrics
// Create system collector
let system = SystemCollector::new("diagnostics-service");
// Collect and publish system metrics
system.collect_metrics(&ctx).await?;
Godot Metrics
// Create Godot collector
let godot = GodotCollector::new(
    "simulation-bridge",
    "ws://localhost:11008".to_string(), // Control URL
    "ws://localhost:11009".to_string(), // Camera URL
);
// Track connection events
godot.set_control_connected(true);
godot.set_camera_connected(true);
// Record messages
godot.record_control_message(now_micros());
godot.record_camera_frame(now_micros());
// Publish connection health
godot.publish_connection_metrics(&ctx).await?;
// Publish performance metrics (data from Godot)
godot.publish_performance_metrics(
    &ctx,
    60.0, // fps
    60.0, // target_fps
    16.6, // frame_time_ms
    2.5, // physics_time_ms
    10.0, // render_time_ms
    4.1, // idle_time_ms
).await?;
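The sample values passed to publish_performance_metrics are internally consistent: the three phase timings add up to the reported frame time, which sits just under the 60 FPS budget. A small sketch of that arithmetic follows; `frame_budget_ms` and `phase_total_ms` are hypothetical helpers, not crate functions.

```rust
/// Frame budget in milliseconds for a target frame rate.
fn frame_budget_ms(target_fps: f64) -> f64 {
    1000.0 / target_fps
}

/// Sum of the per-phase timings; this should roughly match frame_time_ms.
fn phase_total_ms(physics_ms: f64, render_ms: f64, idle_ms: f64) -> f64 {
    physics_ms + render_ms + idle_ms
}

fn main() {
    // Values taken from the example call above.
    let budget = frame_budget_ms(60.0);
    let total = phase_total_ms(2.5, 10.0, 4.1);
    println!("budget: {budget:.2} ms, measured phases: {total:.1} ms");
    println!("within budget: {}", total <= budget);
}
```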
CLI Integration
Monitor diagnostics in real-time:
# View all diagnostics
mecha10 diagnostics
# Filter by category
mecha10 diagnostics --category streaming
mecha10 diagnostics --category docker
# Live streaming metrics only
mecha10 diagnostics --streaming
# Historical query
mecha10 diagnostics --from "2024-01-01 12:00" --to "2024-01-01 13:00"
# Export to file
mecha10 diagnostics --export metrics.json
Dashboard Integration
The diagnostics service integrates with the dashboard via WebSocket:
- Real-time charts and visualizations
- Bottleneck detection
- Alerting on anomalies
- Historical analysis
Access at: http://localhost:3000/dashboard/diagnostics
Metric Types
Counter
Monotonically increasing value (e.g., frames received)
let counter = Counter::new();
counter.inc();
counter.add(5);
let total = counter.get();
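To illustrate why a counter like this is cheap to update from several threads, here is a minimal sketch built on a standard-library atomic. `SketchCounter` and `count_across_threads` are hypothetical names; the crate's own `Counter` may be implemented differently.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical stand-in for a monotonic counter; not the crate's `Counter`.
#[derive(Default)]
struct SketchCounter(AtomicU64);

impl SketchCounter {
    fn new() -> Self {
        Self::default()
    }

    /// Increment by one.
    fn inc(&self) {
        self.add(1);
    }

    /// Add `n` with a single relaxed atomic operation (no lock taken).
    fn add(&self, n: u64) {
        self.0.fetch_add(n, Ordering::Relaxed);
    }

    /// Read the current total.
    fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Increment one shared counter from several threads and return the total.
fn count_across_threads(threads: usize, per_thread: u64) -> u64 {
    let counter = SketchCounter::new();
    // Scoped threads may borrow `counter`; all are joined when the scope ends.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    counter.inc();
                }
            });
        }
    });
    counter.get()
}

fn main() {
    let counter = SketchCounter::new();
    counter.inc();
    counter.add(5);
    println!("total = {}", counter.get());
    println!("threaded total = {}", count_across_threads(4, 1_000));
}
```

Relaxed ordering is enough here because the counter is only ever summed, never used to synchronize other data.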
Gauge
Current value that can go up or down (e.g., queue depth)
let gauge = Gauge::new();
gauge.set(10);
gauge.inc();
gauge.dec();
Histogram
Distribution tracking with percentiles (e.g., encoding latency)
let mut hist = Histogram::for_latency()?;
hist.record(5000); // Record 5 ms (5000 µs)
let p95 = hist.p95();
let p99 = hist.p99();
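For readers unfamiliar with percentile queries, the sketch below computes p95 and p99 with the nearest-rank method over raw samples. This is an illustration of the concept only: `SketchHistogram` is a hypothetical type, and the crate's `Histogram` most likely uses a bucketed representation rather than storing every sample.

```rust
/// Hypothetical latency recorder that keeps raw samples and sorts on query.
struct SketchHistogram {
    samples: Vec<u64>,
}

impl SketchHistogram {
    fn new() -> Self {
        Self { samples: Vec::new() }
    }

    /// Record one latency sample in microseconds.
    fn record(&mut self, micros: u64) {
        self.samples.push(micros);
    }

    /// Nearest-rank percentile: the smallest sample such that at least
    /// `p` percent of all samples are less than or equal to it.
    fn percentile(&self, p: f64) -> Option<u64> {
        if self.samples.is_empty() {
            return None;
        }
        let mut sorted = self.samples.clone();
        sorted.sort_unstable();
        let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
        // Ranks are 1-based; clamp so p = 0 and p = 100 stay in range.
        Some(sorted[rank.clamp(1, sorted.len()) - 1])
    }
}

fn main() {
    let mut hist = SketchHistogram::new();
    // 100 samples: 100 µs, 200 µs, ..., 10000 µs.
    for micros in (1..=100u64).map(|i| i * 100) {
        hist.record(micros);
    }
    println!("p95 = {:?}", hist.percentile(95.0));
    println!("p99 = {:?}", hist.percentile(99.0));
}
```

Sorting on every query is the expensive part, which is why percentile work belongs off the hot path.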
Performance
The diagnostics system is designed for minimal overhead:
- Atomic counters: Lock-free increments on hot paths
- Background aggregation: Expensive operations (percentiles) run off hot path
- Rate limiting: Automatic 1-second intervals for publishing
- Zero-copy: Efficient Arc-based data sharing
Overhead on streaming pipeline: < 1% CPU, < 1MB memory
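The split between hot-path recording and background aggregation can be sketched as follows. This is an assumption about the general pattern, not the crate's actual code: `HotPathStats`, `Snapshot`, and `drain` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical counters shared between the pipeline and the aggregator.
#[derive(Default)]
struct HotPathStats {
    frames: AtomicU64,
    bytes: AtomicU64,
}

/// Aggregated rates for one publish interval.
#[derive(Debug, PartialEq)]
struct Snapshot {
    fps: f64,
    bytes_per_sec: f64,
}

impl HotPathStats {
    /// Hot path: two relaxed atomic adds, no locks and no allocation.
    fn record_frame_sent(&self, size_bytes: u64) {
        self.frames.fetch_add(1, Ordering::Relaxed);
        self.bytes.fetch_add(size_bytes, Ordering::Relaxed);
    }

    /// Off the hot path: reset the counters and convert totals to rates.
    fn drain(&self, interval: Duration) -> Snapshot {
        let secs = interval.as_secs_f64();
        let frames = self.frames.swap(0, Ordering::Relaxed);
        let bytes = self.bytes.swap(0, Ordering::Relaxed);
        Snapshot {
            fps: frames as f64 / secs,
            bytes_per_sec: bytes as f64 / secs,
        }
    }
}

fn main() {
    let stats = Arc::new(HotPathStats::default());

    // Simulated pipeline thread sending 30 frames of 10 KB each.
    let producer = {
        let stats = Arc::clone(&stats);
        thread::spawn(move || {
            for _ in 0..30 {
                stats.record_frame_sent(10_000);
            }
        })
    };
    producer.join().unwrap();

    // The aggregator would call this once per publish interval.
    let snapshot = stats.drain(Duration::from_secs(1));
    println!("{snapshot:?}");
}
```

In a real service the drain step would run on a timer (the 1-second interval mentioned above) and publish the snapshot to the matching /diagnostics topic.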
Architecture
┌─────────────────────────────────────────┐
│          Diagnostic Publishers          │
│  (Simulation Bridge, Nodes, Services)   │
└────────────────┬────────────────────────┘
                 │
                 ▼
         ┌───────────────┐
         │ Redis Pub/Sub │
         └───────────────┘
                 │
        ┌────────┴────────┐
        ▼                 ▼
    ┌────────┐       ┌──────────┐
    │  CLI   │       │Dashboard │
    └────────┘       └──────────┘
        │                 │
        ▼                 ▼
┌──────────────────────────────┐
│      Telemetry Service       │
│     (Historical Archive)     │
└──────────────────────────────┘
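The fan-out step in the middle of the diagram, where one published message reaches both the CLI and the dashboard, can be illustrated in-process with standard-library channels. This is only a sketch of the pattern: `SketchBus` is a hypothetical stand-in for the Redis pub/sub hop and does not talk to Redis.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical in-process stand-in for the Redis pub/sub hop.
#[derive(Default)]
struct SketchBus {
    subscribers: Vec<Sender<String>>,
}

impl SketchBus {
    /// Register a new subscriber and hand back its receiving end.
    fn subscribe(&mut self) -> Receiver<String> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Send a copy of `msg` to every live subscriber, dropping any that
    /// have hung up. Returns the number of subscribers reached.
    fn publish(&mut self, msg: &str) -> usize {
        self.subscribers
            .retain(|tx| tx.send(msg.to_string()).is_ok());
        self.subscribers.len()
    }
}

fn main() {
    let mut bus = SketchBus::default();
    let cli = bus.subscribe();
    let dashboard = bus.subscribe();

    bus.publish("fps=60");
    println!("cli got {:?}", cli.try_recv());
    println!("dashboard got {:?}", dashboard.try_recv());
}
```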
Future Enhancements
- Redis instrumentation: Add detailed operation tracking to mecha10-core::Context
- WebRTC stats: Extract metrics from WebRTC peer connections
- WebSocket tracking: Instrument WebSocket connections
- Custom metrics: Allow nodes to publish custom diagnostic metrics
- Alerting: Automatic alerts on threshold violations
- Profiling integration: Integration with performance profiling tools
License
MIT