
End-User Rate Limit Policy

The end-user-rate-limit policy category enforces per-end-user (and per-group) request-rate caps. Each WaxellUser can carry a rate_limit_override_rpm; each WaxellUserGroup can carry one too. The tightest cap wins when a user belongs to multiple groups.

The handler counts the user's activity rows in the configured window. Counted kinds are signal_received, run_started, domain_call, and mcp_call. Use it to protect against runaway sub-users, noisy customers, or abusive third-party agent integrations.

Rules

Rule | Type | Default | Description
enabled | boolean | true | Toggle enforcement on/off
action_on_exceed | string | throttle | Action when over cap: throttle, block, warn
window_seconds | integer | 60 | Counting window for activity, in seconds

The actual cap value comes from WaxellUser.rate_limit_override_rpm / WaxellUserGroup.rate_limit_override_rpm, not from the policy rules. The policy rules only configure what to do when caps are hit.

How It Works

The handler runs at before_workflow, mid_execution, and before_domain_call. after_workflow is a no-op.

Cap Resolution

caps = []
if user.rate_limit_override_rpm:
    caps.append(user.rate_limit_override_rpm)
for group in user.groups:
    if group.rate_limit_override_rpm:
        caps.append(group.rate_limit_override_rpm)
# tightest wins; None (no caps anywhere) means no enforcement
effective_cap = min(caps) if caps else None

If no caps are set on either the user or any of their groups, the handler returns ALLOW (no enforcement).
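
The resolution logic above can be sketched as a self-contained function. The User and Group dataclasses here are hypothetical stand-ins for WaxellUser and WaxellUserGroup, and resolve_effective_cap is an illustrative name, not part of the Waxell API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Group:
    # Hypothetical stand-in for WaxellUserGroup
    name: str
    rate_limit_override_rpm: Optional[int] = None


@dataclass
class User:
    # Hypothetical stand-in for WaxellUser
    user_id: str
    rate_limit_override_rpm: Optional[int] = None
    groups: List[Group] = field(default_factory=list)


def resolve_effective_cap(user: User) -> Optional[int]:
    """Return the tightest RPM cap, or None when no cap is set anywhere."""
    caps = []
    # Truthiness check mirrors the pseudocode: None (and 0) count as "unset".
    if user.rate_limit_override_rpm:
        caps.append(user.rate_limit_override_rpm)
    for group in user.groups:
        if group.rate_limit_override_rpm:
            caps.append(group.rate_limit_override_rpm)
    # min() raises on an empty list, so guard the "no caps" case explicitly.
    return min(caps) if caps else None


user = User("cust-9912", groups=[Group("pro", 100), Group("free-tier", 30)])
print(resolve_effective_cap(user))       # 30
print(resolve_effective_cap(User("x")))  # None -> handler returns ALLOW
```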

Cap is RPM, Window is Configurable

cap_rpm=60 with window_seconds=60 means 60 requests/minute. With window_seconds=30 it scales linearly to 30 requests/30s.
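
As a rough illustration of the linear scaling, the per-window limit can be derived from the RPM cap as below. The function name is hypothetical, and the source does not say how fractional results are rounded, so this sketch returns the unrounded value.

```python
def limit_for_window(cap_rpm: int, window_seconds: int) -> float:
    """Scale a requests-per-minute cap linearly to an arbitrary window."""
    # Assumption: no rounding is applied; the real handler may round or floor.
    return cap_rpm * window_seconds / 60


print(limit_for_window(60, 60))  # 60.0
print(limit_for_window(60, 30))  # 30.0
print(limit_for_window(10, 90))  # 15.0
```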

Context Attributes Read

Attribute | Phase | Purpose
context.sub_user_identity | all | Resolve sub_user_id
context.user_id | all | Fallback resolver
context.metadata["tenant_id"] / context.tenant_id | all | Scope the user lookup

Data Sources

  • WaxellUser.rate_limit_override_rpm
  • WaxellUserGroupMembership -> WaxellUserGroup.rate_limit_override_rpm
  • WaxellUserActivity rows where kind in (signal_received, run_started, domain_call, mcp_call)
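
To make the counting step concrete, here is a minimal in-memory sketch of filtering activity rows by kind and by a sliding window. It does not use the Django ORM or the Redis aggregator; the Activity tuple, its created_at field name, and the inclusive cutoff comparison are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, NamedTuple, Optional

# Kinds that count toward the cap, as listed in the data sources above.
COUNTED_KINDS = {"signal_received", "run_started", "domain_call", "mcp_call"}


class Activity(NamedTuple):
    # Hypothetical stand-in for a WaxellUserActivity row
    kind: str
    created_at: datetime


def count_recent_activity(
    rows: Iterable[Activity],
    window_seconds: int,
    now: Optional[datetime] = None,
) -> int:
    """Count rows of a counted kind inside the sliding window ending at `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=window_seconds)
    # Assumption: rows exactly at the cutoff are included.
    return sum(
        1 for row in rows
        if row.kind in COUNTED_KINDS and row.created_at >= cutoff
    )


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
rows = [
    Activity("run_started", now - timedelta(seconds=10)),
    Activity("mcp_call", now - timedelta(seconds=59)),
    Activity("domain_call", now - timedelta(seconds=61)),  # outside a 60s window
    Activity("updated", now - timedelta(seconds=5)),       # bookkeeping, excluded
]
print(count_recent_activity(rows, 60, now=now))  # 2
```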

Example Policy

{
  "name": "Per-seat 60 RPM throttle",
  "category": "end-user-rate-limit",
  "rules": {
    "enabled": true,
    "action_on_exceed": "throttle",
    "window_seconds": 60
  },
  "scope": {
    "agents": ["*"]
  },
  "enabled": true
}
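
Before submitting a policy like the one above, a client-side pre-check of the rules block can catch typos early. The sketch below applies the defaults and allowed values from the Rules table; normalize_rules is a hypothetical helper, and the requirement that window_seconds be positive is an assumption, since the platform's own validation is not described here.

```python
RULE_DEFAULTS = {
    "enabled": True,
    "action_on_exceed": "throttle",
    "window_seconds": 60,
}
VALID_ACTIONS = {"throttle", "block", "warn"}


def normalize_rules(rules: dict) -> dict:
    """Fill in defaults and reject values the Rules table does not allow."""
    unknown = set(rules) - set(RULE_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown rule(s): {sorted(unknown)}")
    merged = {**RULE_DEFAULTS, **rules}
    if not isinstance(merged["enabled"], bool):
        raise ValueError("enabled must be a boolean")
    if merged["action_on_exceed"] not in VALID_ACTIONS:
        raise ValueError("action_on_exceed must be throttle, block, or warn")
    window = merged["window_seconds"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(window, bool) or not isinstance(window, int) or window <= 0:
        raise ValueError("window_seconds must be a positive integer (assumed)")
    return merged


print(normalize_rules({"action_on_exceed": "block"}))
# {'enabled': True, 'action_on_exceed': 'block', 'window_seconds': 60}
```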

SDK Integration

import waxell_observe as waxell

waxell.init()

@waxell.observe(
    agent_name="support-bot",
    user_id="cust-9912",
    user_group="free-tier",
    enforce_policy=True,
)
async def handle_message(text: str) -> str:
    return await respond(text)

Set caps via CLI:

wax end-users update cust-9912 --rate-limit-rpm 30
wax groups update free-tier --rate-limit-rpm 10

Observability

Field | Example (THROTTLE)
Category | end-user-rate-limit
Action | throttle
Reason | End-user 'cust-9912' rate-limited (12/10 in last 60s, cap=10/min).
Metadata | {"sub_user_id": "cust-9912", "count": 12, "cap_rpm": 10, "window_seconds": 60}
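
For log parsing or tests, it can help to see how such a record might be assembled. The sketch below reproduces the example row; build_rate_limit_record is a hypothetical helper, not the handler's actual code, and it assumes the "count/limit" figure in the reason equals cap_rpm, which the source only shows for window_seconds=60.

```python
def build_rate_limit_record(
    sub_user_id: str,
    count: int,
    cap_rpm: int,
    window_seconds: int,
    action: str = "throttle",
) -> dict:
    """Assemble a record shaped like the observability example above."""
    reason = (
        f"End-user '{sub_user_id}' rate-limited "
        f"({count}/{cap_rpm} in last {window_seconds}s, cap={cap_rpm}/min)."
    )
    return {
        "category": "end-user-rate-limit",
        "action": action,
        "reason": reason,
        "metadata": {
            "sub_user_id": sub_user_id,
            "count": count,
            "cap_rpm": cap_rpm,
            "window_seconds": window_seconds,
        },
    }


record = build_rate_limit_record("cust-9912", count=12, cap_rpm=10, window_seconds=60)
print(record["reason"])
```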

Common Gotchas

  • Tightest cap wins across user + ALL groups. A user in two groups with caps of 100 and 30 gets the 30 cap.
  • No override set on user OR groups = ALLOW. This category needs at least one row with a non-null rate_limit_override_rpm to do anything.
  • Counts user-initiated activity only. Bookkeeping events (created, updated, suspended, hard_deleted) are excluded.
  • Default action is throttle, not block. Throttle lets the runtime back off and retry; block kills the run.
  • Window is a sliding window that starts at timezone.now() - window_seconds and ends at the current time. It is not a fixed bucket.
  • Observe plane uses Django ORM. Runtime plane wires a Redis-backed aggregator via the _usage_query_fn seam.

Next Steps