End-User Rate Limit Policy

The end-user-rate-limit policy category enforces per-end-user (and per-group) request-rate caps. Each WaxellUser can carry a rate_limit_override_rpm; each WaxellUserGroup can carry one too. The tightest cap wins across the user's own override and the overrides of every group the user belongs to.

The handler counts the user's activity rows within the configured window. The counted kinds are signal_received, run_started, domain_call, and mcp_call. Use it to protect against runaway sub-users, noisy customers, or abusive third-party agent integrations.

Rules

Rule	Type	Default	Description
`enabled`	boolean	`true`	Toggle enforcement on/off
`action_on_exceed`	string	`throttle`	Action when over cap: `throttle`, `block`, `warn`
`window_seconds`	integer	`60`	Counting window for activity

The actual cap value comes from WaxellUser.rate_limit_override_rpm / WaxellUserGroup.rate_limit_override_rpm, not from the policy rules. The policy rules only configure what to do when caps are hit.

How It Works

The handler runs at before_workflow, mid_execution, and before_domain_call. after_workflow is a no-op.

Cap Resolution

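# Collect every override that applies to this user.
caps = []
if user.rate_limit_override_rpm:
    caps.append(user.rate_limit_override_rpm)
for group in user.groups:
    if group.rate_limit_override_rpm:
        caps.append(group.rate_limit_override_rpm)
# Tightest cap wins. None means no cap is set anywhere (min() of an empty list would raise).
effective_cap = min(caps) if caps else None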

If no caps are set on either the user or any of their groups, the handler returns ALLOW (no enforcement).

Cap is RPM, Window is Configurable

cap_rpm=60 with window_seconds=60 means 60 requests/minute. With window_seconds=30 it scales linearly to 30 requests/30s.
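The linear scaling above can be sketched as a small helper. The function name `allowed_in_window` is hypothetical, and this page does not say how the handler rounds fractional results, so the sketch simply returns a float.

```python
def allowed_in_window(cap_rpm: int, window_seconds: int) -> float:
    """Scale a per-minute cap linearly to the configured counting window.

    Hypothetical helper for illustration; not part of the Waxell API.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return cap_rpm * window_seconds / 60


print(allowed_in_window(60, 60))  # 60.0
print(allowed_in_window(60, 30))  # 30.0
```

A cap is always stored per minute; only the number of requests tolerated inside the window changes with window_seconds.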

Context Attributes Read

Attribute	Phase	Purpose
`context.sub_user_identity`	all	Resolve `sub_user_id`
`context.user_id`	all	Fallback resolver
`context.metadata["tenant_id"]` / `context.tenant_id`	all	Scope the user lookup

Data Sources

WaxellUser.rate_limit_override_rpm
WaxellUserGroupMembership -> WaxellUserGroup.rate_limit_override_rpm
WaxellUserActivity rows where kind in (signal_received, run_started, domain_call, mcp_call)

Example Policy

{
  "name": "Per-seat 60 RPM throttle",
  "category": "end-user-rate-limit",
  "rules": {
    "enabled": true,
    "action_on_exceed": "throttle",
    "window_seconds": 60
  },
  "scope": {
    "agents": ["*"]
  },
  "enabled": true
}

SDK Integration

import waxell_observe as waxell
waxell.init()

@waxell.observe(
    agent_name="support-bot",
    user_id="cust-9912",
    user_group="free-tier",
    enforce_policy=True,
)
async def handle_message(text: str) -> str:
    return await respond(text)

Set caps via CLI:

wax end-users update cust-9912 --rate-limit-rpm 30
wax groups update free-tier --rate-limit-rpm 10

Observability

Field	Example (THROTTLE)
Category	`end-user-rate-limit`
Action	`throttle`
Reason	`End-user 'cust-9912' rate-limited (12/10 in last 60s, cap=10/min).`
Metadata	`{"sub_user_id": "cust-9912", "count": 12, "cap_rpm": 10, "window_seconds": 60}`
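As a rough illustration of how the Reason string relates to the Metadata fields, the sketch below rebuilds the example message from the example metadata. The function `format_reason` is hypothetical, and it assumes the denominator is the cap scaled to the window (with integer division); the real handler may format or round differently.

```python
def format_reason(meta: dict) -> str:
    """Rebuild a human-readable reason from decision metadata.

    Hypothetical helper; assumes the denominator is cap_rpm scaled to the window.
    """
    allowed = meta["cap_rpm"] * meta["window_seconds"] // 60
    return (
        f"End-user '{meta['sub_user_id']}' rate-limited "
        f"({meta['count']}/{allowed} in last {meta['window_seconds']}s, "
        f"cap={meta['cap_rpm']}/min)."
    )


meta = {"sub_user_id": "cust-9912", "count": 12, "cap_rpm": 10, "window_seconds": 60}
print(format_reason(meta))
```

With the example metadata this prints the same text as the Reason row in the table.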

Common Gotchas

Tightest cap wins across user + ALL groups. A user in two groups with caps of 100 and 30 gets the 30 cap.
No override set on user OR groups = ALLOW. This category needs at least one row with a non-null rate_limit_override_rpm to do anything.
Counts user-initiated activity only. Bookkeeping events (created, updated, suspended, hard_deleted) are excluded.
Default action is throttle, not block. Throttle lets the runtime back off and retry; block kills the run.
Window is a sliding window that starts at timezone.now() - window_seconds and ends at the current time. It is not a fixed bucket aligned to clock boundaries.
Observe plane uses Django ORM. Runtime plane wires a Redis-backed aggregator via the _usage_query_fn seam.
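The counting behavior described in the gotchas above (sliding window, bookkeeping events excluded) can be sketched in plain Python. This is an illustration only: the real handler queries WaxellUserActivity through the Django ORM or a Redis-backed aggregator, and the `count_in_window` function, the tuple row shape, and the inclusive cutoff are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Activity kinds that count toward the cap, as listed under Data Sources.
COUNTED_KINDS = {"signal_received", "run_started", "domain_call", "mcp_call"}


def count_in_window(rows, now, window_seconds):
    """Count user-initiated rows newer than (now - window_seconds).

    rows is an iterable of (kind, created_at) tuples (hypothetical shape).
    """
    cutoff = now - timedelta(seconds=window_seconds)
    return sum(1 for kind, ts in rows if kind in COUNTED_KINDS and ts >= cutoff)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
rows = [
    ("run_started", now - timedelta(seconds=5)),
    ("domain_call", now - timedelta(seconds=59)),
    ("mcp_call", now - timedelta(seconds=61)),  # older than the 60s window
    ("updated", now - timedelta(seconds=1)),    # bookkeeping event, excluded
]
print(count_in_window(rows, now, 60))  # 2
```

Because the cutoff is recomputed from the current time on every check, a burst of requests ages out gradually rather than resetting at a bucket boundary.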

Next Steps

Rate Limit Policy -- Tenant-wide rate caps
End-User Budget -- Per-user monthly spend cap
End-User Suspension -- Block runs for suspended end-users
User Tracking -- How sub-user identity flows
Policy Categories & Templates -- All categories
