End-User Rate Limit Policy
The end-user-rate-limit policy category enforces per-end-user (and per-group) request-rate caps. Each WaxellUser can carry a rate_limit_override_rpm; each WaxellUserGroup can carry one too. The tightest cap wins when a user belongs to multiple groups.
The handler counts the user's activity rows within the configured window. The counted kinds are signal_received, run_started, domain_call, and mcp_call. Use it to protect against runaway sub-users, noisy customers, or abusive third-party agent integrations.
Rules
| Rule | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Toggle enforcement on/off |
| action_on_exceed | string | throttle | Action when over cap: throttle, block, warn |
| window_seconds | integer | 60 | Counting window for activity, in seconds |
The actual cap value comes from WaxellUser.rate_limit_override_rpm / WaxellUserGroup.rate_limit_override_rpm, not from the policy rules. The policy rules only configure what to do when caps are hit.
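As a rough illustration of how the three rules and their defaults fit together, the sketch below overlays a policy's rules dict on the defaults from the table and rejects unknown actions. The names RULE_DEFAULTS, VALID_ACTIONS, and resolve_rules are hypothetical and not part of the Waxell SDK; the positivity check on window_seconds is likewise an assumption.

```python
# Hypothetical helper for illustration; not part of the Waxell SDK.
RULE_DEFAULTS = {
    "enabled": True,
    "action_on_exceed": "throttle",
    "window_seconds": 60,
}
VALID_ACTIONS = {"throttle", "block", "warn"}


def resolve_rules(rules: dict) -> dict:
    """Overlay a policy's rules on the documented defaults."""
    resolved = {**RULE_DEFAULTS, **rules}
    if resolved["action_on_exceed"] not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {resolved['action_on_exceed']!r}")
    # Assumption: a zero or negative window is treated as a configuration error.
    if resolved["window_seconds"] <= 0:
        raise ValueError("window_seconds must be positive")
    return resolved


print(resolve_rules({"action_on_exceed": "block"}))
```

Note that nothing here carries a cap value: the rules only describe how to react once a cap set on a user or group is hit.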
How It Works
The handler runs at before_workflow, mid_execution, and before_domain_call. after_workflow is a no-op.
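The phase behavior can be sketched as a small decision function. This is a minimal illustration, not the actual handler: the function name decide, its parameters, the lowercase "allow" return value, and the strict greater-than comparison are all assumptions; only the phase names and action names come from this page.

```python
# Hypothetical sketch; only the phase and action names come from the documentation.
ENFORCED_PHASES = {"before_workflow", "mid_execution", "before_domain_call"}


def decide(phase: str, count: int, allowed: int, action_on_exceed: str = "throttle") -> str:
    """Return "allow" or the configured action for one handler invocation."""
    if phase not in ENFORCED_PHASES:
        return "allow"  # e.g. after_workflow is a no-op
    # Assumption: the cap is exceeded only when count is strictly greater than allowed.
    if count > allowed:
        return action_on_exceed
    return "allow"


print(decide("before_workflow", 12, 10))  # throttle
print(decide("after_workflow", 12, 10))   # allow
```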
Cap Resolution
caps = []
if user.rate_limit_override_rpm:
    caps.append(user.rate_limit_override_rpm)
for group in user.groups:
    if group.rate_limit_override_rpm:
        caps.append(group.rate_limit_override_rpm)
# Tightest wins; guard the empty case because min([]) raises ValueError.
effective_cap = min(caps) if caps else None
If no caps are set on either the user or any of their groups, the handler returns ALLOW (no enforcement).
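For a version that runs on its own, the sketch below uses two stand-in dataclasses in place of WaxellUser and WaxellUserGroup. The class names User and Group and the function effective_cap are hypothetical; the real models presumably have more fields and a different way of reaching a user's groups.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Group:  # stand-in for WaxellUserGroup
    rate_limit_override_rpm: Optional[int] = None


@dataclass
class User:  # stand-in for WaxellUser
    rate_limit_override_rpm: Optional[int] = None
    groups: List[Group] = field(default_factory=list)


def effective_cap(user: User) -> Optional[int]:
    """Return the tightest cap, or None when no cap is set anywhere."""
    candidates = [user.rate_limit_override_rpm]
    candidates += [g.rate_limit_override_rpm for g in user.groups]
    # Truthiness check mirrors the pseudocode: unset (None) caps are skipped.
    caps = [cap for cap in candidates if cap]
    return min(caps) if caps else None


print(effective_cap(User(groups=[Group(100), Group(30)])))  # 30
print(effective_cap(User()))                                # None
```

A None result corresponds to the ALLOW (no enforcement) case.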
Cap is RPM, Window is Configurable
cap_rpm=60 with window_seconds=60 means 60 requests/minute. With window_seconds=30 it scales linearly to 30 requests/30s.
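The linear scaling can be written as a one-line helper. The function name allowed_in_window is hypothetical, and rounding down for fractional results is an assumption, since the rounding behavior is not stated here.

```python
def allowed_in_window(cap_rpm: int, window_seconds: int) -> int:
    """Scale a per-minute cap linearly to the configured window."""
    # Assumption: fractional results round down.
    return (cap_rpm * window_seconds) // 60


print(allowed_in_window(60, 60))   # 60
print(allowed_in_window(60, 30))   # 30
print(allowed_in_window(10, 120))  # 20
```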
Context Attributes Read
| Attribute | Phase | Purpose |
|---|---|---|
| context.sub_user_identity | all | Resolve sub_user_id |
| context.user_id | all | Fallback resolver |
| context.metadata["tenant_id"] / context.tenant_id | all | Scope the user lookup |
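The fallback order in the table might look like the following. This is a sketch under stated assumptions: the function resolve_lookup_keys is hypothetical, the identity object is assumed to expose the id as a sub_user_id attribute, and metadata["tenant_id"] is assumed to take precedence over context.tenant_id. The tenant value "acme" is made up for the example.

```python
from types import SimpleNamespace
from typing import Optional, Tuple


def resolve_lookup_keys(context) -> Tuple[Optional[str], Optional[str]]:
    """Return (sub_user_id, tenant_id) using the fallbacks from the table."""
    identity = getattr(context, "sub_user_identity", None)
    # Assumption: the identity object exposes the id as `.sub_user_id`.
    sub_user_id = getattr(identity, "sub_user_id", None) or getattr(context, "user_id", None)
    metadata = getattr(context, "metadata", None) or {}
    # Assumption: metadata["tenant_id"] is preferred over context.tenant_id.
    tenant_id = metadata.get("tenant_id") or getattr(context, "tenant_id", None)
    return sub_user_id, tenant_id


ctx = SimpleNamespace(sub_user_identity=None, user_id="cust-9912", metadata={}, tenant_id="acme")
print(resolve_lookup_keys(ctx))  # ('cust-9912', 'acme')
```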
Data Sources
- WaxellUser.rate_limit_override_rpm
- WaxellUserGroupMembership -> WaxellUserGroup.rate_limit_override_rpm
- WaxellUserActivity rows where kind in (signal_received, run_started, domain_call, mcp_call)
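To make the activity count concrete, here is an in-memory sketch of a sliding-window count over the four counted kinds. It does not use the real ORM or Redis path: the (kind, created_at) tuple shape, the function count_recent_activity, and the inclusive lower bound are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

COUNTED_KINDS = {"signal_received", "run_started", "domain_call", "mcp_call"}


def count_recent_activity(rows, window_seconds, now=None):
    """Count rows of a counted kind whose timestamp falls inside the window.

    `rows` is an iterable of (kind, created_at) tuples, a stand-in for
    WaxellUserActivity rows.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=window_seconds)
    # Assumption: rows exactly at the cutoff are included.
    return sum(1 for kind, created_at in rows if kind in COUNTED_KINDS and created_at >= cutoff)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
rows = [
    ("run_started", now - timedelta(seconds=5)),
    ("domain_call", now - timedelta(seconds=59)),
    ("mcp_call", now - timedelta(seconds=61)),  # outside a 60s window
    ("updated", now - timedelta(seconds=1)),    # not one of the counted kinds
]
print(count_recent_activity(rows, 60, now=now))  # 2
```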
Example Policy
{
  "name": "Per-seat 60 RPM throttle",
  "category": "end-user-rate-limit",
  "rules": {
    "enabled": true,
    "action_on_exceed": "throttle",
    "window_seconds": 60
  },
  "scope": {
    "agents": ["*"]
  },
  "enabled": true
}
SDK Integration
import waxell_observe as waxell

waxell.init()


@waxell.observe(
    agent_name="support-bot",
    user_id="cust-9912",
    user_group="free-tier",
    enforce_policy=True,
)
async def handle_message(text: str) -> str:
    return await respond(text)
Set caps via CLI:
wax end-users update cust-9912 --rate-limit-rpm 30
wax groups update free-tier --rate-limit-rpm 10
Observability
| Field | Example (THROTTLE) |
|---|---|
| Category | end-user-rate-limit |
| Action | throttle |
| Reason | End-user 'cust-9912' rate-limited (12/10 in last 60s, cap=10/min). |
| Metadata | {"sub_user_id": "cust-9912", "count": 12, "cap_rpm": 10, "window_seconds": 60} |
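The example row can be reproduced from four inputs. The sketch below is only a guess at how such a record might be assembled: the function build_decision and the dict layout are hypothetical, and the allowed count assumes linear scaling of the RPM cap to the window, rounded down.

```python
def build_decision(sub_user_id: str, count: int, cap_rpm: int, window_seconds: int,
                   action: str = "throttle") -> dict:
    """Assemble a decision record shaped like the example table."""
    # Assumption: the RPM cap scales linearly to the window, rounded down.
    allowed = (cap_rpm * window_seconds) // 60
    return {
        "category": "end-user-rate-limit",
        "action": action,
        "reason": (
            f"End-user '{sub_user_id}' rate-limited "
            f"({count}/{allowed} in last {window_seconds}s, cap={cap_rpm}/min)."
        ),
        "metadata": {
            "sub_user_id": sub_user_id,
            "count": count,
            "cap_rpm": cap_rpm,
            "window_seconds": window_seconds,
        },
    }


print(build_decision("cust-9912", 12, 10, 60)["reason"])
```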
Common Gotchas
- Tightest cap wins across user + ALL groups. A user in two groups with caps of 100 and 30 gets the 30 cap.
- No override set on user OR groups = ALLOW. This category needs at least one row with a non-null rate_limit_override_rpm to do anything.
- Counts user-initiated activity only. Bookkeeping events (created, updated, suspended, hard_deleted) are excluded.
- Default action is throttle, not block. Throttle lets the runtime back off and retry; block kills the run.
- Window is a sliding window computed from timezone.now() - window_seconds. Not a fixed rolling bucket.
- Observe plane uses Django ORM. Runtime plane wires a Redis-backed aggregator via the _usage_query_fn seam.
Next Steps
- Rate Limit Policy -- Tenant-wide rate caps
- End-User Budget -- Per-user monthly spend cap
- End-User Suspension -- Block runs for suspended end-users
- User Tracking -- How sub-user identity flows
- Policy Categories & Templates -- All categories