Golden Signals
The four signals Google's SRE book recommends measuring first for any user-facing service: latency, traffic, errors, and saturation.
Latency (p99)
99th-percentile response time: 99% of requests complete faster than this value, so it marks the boundary of the slowest 1%. High p99 often signals queue buildup, GC pauses, or slow-path code that doesn't show up in averages.
histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
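To make the estimate less opaque, here is a minimal Python sketch of how a quantile can be derived from cumulative histogram buckets by linear interpolation, which is the general approach histogram_quantile takes. The function name, the bucket layout, and the sample counts are illustrative assumptions, not taken from any real service.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    `buckets` is assumed to be sorted by upper bound and to end with a
    (math.inf, total_count) entry, mirroring the `le` buckets of a histogram.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank and count > prev_count:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: the best available
                # answer is the highest finite bucket boundary.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative counts for buckets at 0.1 s, 0.5 s, 1.0 s and +Inf.
sample = [(0.1, 900), (0.5, 980), (1.0, 995), (math.inf, 1000)]
p99 = estimate_quantile(0.99, sample)
```

Because the result is interpolated inside a bucket, its accuracy depends on how finely the bucket boundaries are spaced around the latency of interest.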
Traffic
Requests per second the service is handling, averaged over the last 5 minutes. Rising traffic can predict latency increases before they appear.
sum(rate(http_requests_total[5m]))
Errors
Percentage of requests that returned a 5xx error. Even a small sustained error rate burns error budget quickly.
sum(rate(http_requests_total{status=~"5.."}[5m]))
/ sum(rate(http_requests_total[5m])) * 100
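The claim about error budgets can be checked with a little arithmetic. The sketch below assumes a hypothetical 99.9% availability SLO measured over a 30-day window; both numbers, and the function names, are illustrative rather than anything this document prescribes. Note that it takes the error rate as a ratio (0.005), not as a percentage (0.5).

```python
def burn_rate(error_ratio, slo):
    """How many times faster than 'exactly on budget' errors are occurring.

    error_ratio: fraction of failed requests, e.g. 0.005 for 0.5%.
    slo: target success fraction, e.g. 0.999 for 99.9%.
    """
    budget = 1.0 - slo  # fraction of requests allowed to fail
    return error_ratio / budget


def days_until_budget_exhausted(error_ratio, slo, window_days=30):
    """Days until the whole budget is spent if the error ratio is sustained."""
    rate = burn_rate(error_ratio, slo)
    if rate == 0:
        return float("inf")
    return window_days / rate


# A sustained 0.5% error rate against an assumed 99.9% SLO.
rate = burn_rate(0.005, 0.999)
days = days_until_budget_exhausted(0.005, 0.999)
```

Under these assumptions a 0.5% error rate consumes budget about five times faster than allowed, so a 30-day budget would last roughly six days.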
Saturation
How loaded is the resource (CPU, memory, queue depth)? High saturation predicts latency degradation before errors appear. A service at 90% CPU is fragile — any traffic spike will push it over.
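# Traefik preset: container CPU usage from dockerstats.
# The metric is a cumulative counter, so take its rate to get CPU cores in use;
# divide by the container's CPU limit to express the result as a ratio.
rate(container_cpu_usage_seconds_total[5m])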
Rule of thumb: alert when saturation exceeds 80%, page when it exceeds 90%.
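As a small illustration of the rule of thumb, the following Python sketch maps a saturation ratio to an action. The function name and the return labels are hypothetical; the 80% and 90% thresholds come from the rule above, and "exceeds" is read as strictly greater than.

```python
def saturation_action(saturation, alert_at=0.80, page_at=0.90):
    """Classify a saturation ratio in the range 0.0 to 1.0.

    Returns "page" above page_at, "alert" above alert_at, otherwise "ok".
    """
    if saturation > page_at:
        return "page"
    if saturation > alert_at:
        return "alert"
    return "ok"


# Hypothetical readings for three containers.
actions = [saturation_action(s) for s in (0.50, 0.85, 0.95)]
```

Keeping the thresholds as parameters makes it easy to tune them per resource, since a queue and a CPU rarely degrade at the same utilization.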