- Updated: March 18, 2026
- 3 min read
Alerting, Thresholds, and Incident‑Response for the OpenClaw Rating API Edge Token‑Bucket Rate Limiter
Self‑hosting OpenClaw on UBOS gives you powerful AI‑assistant capabilities, but reliable operation depends on robust monitoring and rapid response to rate‑limiting events. This guide provides concrete alerting rules, recommended threshold values, and step‑by‑step incident‑response procedures for the OpenClaw Rating API Edge token‑bucket rate limiter.
Why Monitor the Token‑Bucket Rate Limiter?
The token‑bucket algorithm protects the Rating API from overload by capping each client's sustained request rate while still allowing short bursts. When the bucket empties, requests are throttled, which can cause latency spikes or service degradation if not detected early. Proactive alerts help ops teams intervene before users experience failures.
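OpenClaw's limiter internals are not shown in this guide, but the mechanics are easy to see in a minimal sketch. The class and parameter names below are illustrative, not OpenClaw's:

```python
import time

class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/s, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # refill rate (tokens per second)
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # bucket empty: the request is throttled

bucket = TokenBucket(rate=100, capacity=100)  # the 100 req/s example used below
if not bucket.allow():
    print("429 Too Many Requests")
```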
Key Metrics to Track
- Bucket Fill Rate (tokens/second) – Should match the configured limit (e.g., 100 req/s).
- Current Bucket Level – Number of tokens currently available.
- Throttle Count – Number of requests rejected due to an empty bucket.
- Average Request Latency – Increases when throttling occurs.
All these metrics are visualised in the OpenClaw Metrics Dashboard you created earlier.
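If you instrument a service of your own, these metrics map naturally onto Prometheus primitives. Below is a minimal `prometheus_client` sketch; the metric names are chosen to line up with the alert rules in the next section, so check what your OpenClaw build actually exports before relying on them:

```python
import time
from prometheus_client import Counter, Gauge, start_http_server

# Requests rejected because the bucket was empty. Note: prometheus_client
# exposes Counters with a `_total` suffix, so the scraped series is
# rate_limiter_throttle_count_total; align the rules or the name accordingly.
THROTTLE_COUNT = Counter(
    "rate_limiter_throttle_count", "Requests rejected by the rate limiter", ["service"]
)
# Tokens currently available in the bucket.
BUCKET_LEVEL = Gauge(
    "rate_limiter_bucket_level", "Tokens currently in the bucket", ["service"]
)
# Configured refill rate, so dashboards can spot drift from the intended limit.
FILL_RATE = Gauge(
    "rate_limiter_fill_rate", "Configured refill rate (tokens/s)", ["service"]
)
# Latency as a gauge to match avg_over_time() in the rules below; a Histogram
# is the more common choice if you also want percentiles.
LATENCY = Gauge(
    "openclaw_request_latency_seconds", "Request latency (s)", ["endpoint"]
)

start_http_server(9100)  # expose /metrics for Prometheus to scrape
FILL_RATE.labels(service="openclaw").set(100)  # example 100 tokens/s limit
time.sleep(3600)  # keep the exporter alive for the demo
```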
Alerting Rules
These examples use the Prometheus 2.x YAML rule format; adjust metric names and label selectors to whatever your deployment actually exports.

```yaml
groups:
  - name: openclaw-rate-limiter
    rules:
      # Alert if the throttle count spikes.
      # Assumes rate_limiter_throttle_count is a monotonic counter.
      - alert: TokenBucketThrottleHigh
        expr: increase(rate_limiter_throttle_count{service="openclaw"}[5m]) > 50
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High throttle count on OpenClaw Rating API"
          description: "More than 50 requests were throttled in the last 5 minutes. Check load and consider scaling."

      # Alert if the bucket level stays low (warning threshold from the table below).
      - alert: TokenBucketLowLevel
        expr: rate_limiter_bucket_level{service="openclaw"} < 30
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "Token bucket near depletion"
          description: "Bucket level has stayed below 30 tokens for 3 minutes. Monitor for traffic spikes."

      # Alert on rising latency correlated with throttling.
      # on() joins the two sides despite their different label sets
      # (this assumes a single series on each side).
      - alert: RatingAPILatencyHigh
        expr: |
          avg_over_time(openclaw_request_latency_seconds{endpoint="/rating"}[5m]) > 2
          and on()
          increase(rate_limiter_throttle_count{service="openclaw"}[5m]) > 10
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Latency increase with throttling"
          description: "Request latency is above 2s while throttling is observed. Investigate capacity or adjust limits."
```
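To activate these rules, save them to a file and reference it from your Prometheus configuration. The path below is only an example; use whatever layout your deployment follows:

```yaml
# prometheus.yml (excerpt)
rule_files:
  - /etc/prometheus/rules/openclaw-rate-limiter.yml
```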
Recommended Threshold Settings
| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| Throttle Count (5‑min window) | 30 | 50 |
| Bucket Level | < 30 tokens | < 20 tokens |
| Request Latency | 1.5 s | 2 s |
Incident‑Response Procedure
- Acknowledge the alert in your incident‑management tool (e.g., PagerDuty, Opsgenie).
- Gather context – Review the Metrics Dashboard and the Observability guide for recent spikes, recent deployments, or configuration changes.
- Identify the cause:
  - Sudden traffic surge? Check upstream services or external API usage.
  - Mis‑configured limit? Verify the token‑bucket parameters in `openclaw-config.yaml` (a sample excerpt follows this procedure).
  - Resource contention? Look at CPU/memory usage on the rating service pod.
- Mitigation steps:
  - Temporarily increase the bucket size or refill rate if the traffic is legitimate.
  - Apply back‑pressure to callers (e.g., HTTP 429 with a `Retry-After` header; see the sketch after this procedure).
  - Scale the rating service horizontally via UBOS: `ubos scale openclaw-rating --replicas=3`.
- Post‑mortem – Document the root cause, actions taken, and any changes to limits or alert rules. Update the Metrics Dashboard or Observability guide if needed.
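For the "mis‑configured limit" check and the temporary limit increase above, the relevant settings live in `openclaw-config.yaml`. The excerpt below is hypothetical; the key names are illustrative, so verify them against your actual configuration schema:

```yaml
# openclaw-config.yaml (hypothetical excerpt -- key names are illustrative)
rating_api:
  rate_limiter:
    algorithm: token_bucket
    refill_rate: 100      # tokens added per second (the sustained limit)
    bucket_capacity: 100  # maximum burst size
```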
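Back‑pressure is just an HTTP 429 plus a `Retry-After` hint. A minimal standard‑library sketch (the one‑second hint and the hard‑coded throttle decision are placeholders):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class RatingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A real service would consult the token bucket here.
        throttled = True  # placeholder decision
        if throttled:
            self.send_response(429)                # Too Many Requests
            self.send_header("Retry-After", "1")   # seconds until retry
            self.end_headers()
            self.wfile.write(b"rate limit exceeded\n")
        else:
            self.send_response(200)
            self.end_headers()

HTTPServer(("", 8080), RatingHandler).serve_forever()
```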
For a complete self‑hosting walkthrough, see the OpenClaw hosting guide.
By implementing these alerts and response steps, developers and operations teams can ensure a smooth, reliable experience for users of the OpenClaw Rating API Edge token‑bucket rate limiter.