Spawn Limit Policy
The spawn-limit policy puts a tenant-level ceiling on concurrent ctx.spawn children. This is the operator's rail. It complements the agent-level cap, which the author declares in guards: [{type: spawn_concurrent, limit: N}] and which the BudgetLedger enforces per parent tree. The spawn-limit handler is policy-driven, scoped via the standard policy scope filter (tenant / agent / workflow), and halts the spawn before any children are dispatched.
Runs only on the runtime plane -- there is no observe-plane before_spawn call site.
Rules
| Rule | Type | Default | Description |
|---|---|---|---|
| concurrent_spawn_limit | integer | 50 | Maximum concurrent spawned children for this scope |
| action | string | "block" | Either block or warn when the cap would be breached |
How It Works
The spawn-limit handler fires from the before_spawn hook on PolicyGovernanceHook. The dispatcher (ProductionSpawnDispatcher) is responsible for incrementing the counter on enqueue_children and decrementing on on_child_complete. This handler only reads the counter and rejects on breach.
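The division of labor described above can be sketched as follows. This is a minimal illustration, not the real implementation: `InMemoryCounter` stands in for the Redis counter, and `SketchDispatcher` and `would_breach` are hypothetical names; only `enqueue_children` and `on_child_complete` come from the text.

```python
from collections import defaultdict


class InMemoryCounter:
    """Hypothetical stand-in for the Redis counter, for illustration only."""

    def __init__(self) -> None:
        self._values: dict[str, int] = defaultdict(int)

    def incr(self, key: str, amount: int = 1) -> int:
        self._values[key] += amount
        return self._values[key]

    def decr(self, key: str, amount: int = 1) -> int:
        self._values[key] -= amount
        return self._values[key]

    def get(self, key: str) -> int:
        # Missing keys read as zero children in flight.
        return self._values.get(key, 0)


class SketchDispatcher:
    """Owns every write to the counter (assumed shape, not the real dispatcher)."""

    def __init__(self, counter: InMemoryCounter) -> None:
        self.counter = counter

    def enqueue_children(self, key: str, children: list[str]) -> None:
        # Increment once per batch, by the number of children enqueued.
        self.counter.incr(key, len(children))

    def on_child_complete(self, key: str) -> None:
        # Decrement by one each time a child finishes.
        self.counter.decr(key)


def would_breach(counter: InMemoryCounter, key: str,
                 total_children: int, limit: int) -> bool:
    """Handler side: read the counter and compare, never write."""
    return counter.get(key) + total_children > limit
```

Keeping all writes in the dispatcher means the handler can be evaluated any number of times without disturbing the count.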
| Phase | What It Checks | Actions |
|---|---|---|
| before_workflow | No-op (ALLOW) | ALLOW |
| mid_execution | If context._pending_spawn is set, delegates to before_spawn; otherwise ALLOW | BLOCK / WARN on cap breach |
| before_spawn | Reads Redis counter for (agent, workflow), checks current + total_children > limit | BLOCK or WARN |
| after_workflow | No-op (ALLOW) | ALLOW |
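As a rough illustration of the before_spawn comparison in the table, the sketch below applies the documented defaults (50 and "block") to a rules dict. The names `Decision` and `evaluate_spawn` are hypothetical, and the sketch deliberately leaves out the non-positive-limit and Redis-failure paths described under Common Gotchas.

```python
from enum import Enum


class Decision(str, Enum):
    ALLOW = "ALLOW"
    WARN = "WARN"
    BLOCK = "BLOCK"


def evaluate_spawn(current_in_flight: int, total_children: int,
                   rules: dict) -> Decision:
    """Apply the `current + total_children > limit` check from the table."""
    limit = rules.get("concurrent_spawn_limit", 50)  # documented default
    action = rules.get("action", "block")            # documented default

    if current_in_flight + total_children > limit:
        return Decision.WARN if action == "warn" else Decision.BLOCK
    return Decision.ALLOW
```

Because the comparison is strict, a request that lands exactly on the limit is allowed; only a request that would exceed it is blocked or warned.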
Context Attributes Read
| Attribute | Phase | Purpose |
|---|---|---|
| context.agent_name | before_spawn | Counter key narrowing |
| context.workflow_name | before_spawn | Counter key narrowing |
| context._pending_spawn | mid_execution | Set by PolicyGovernanceHook.before_spawn with {child_agent, total_children} |
| context.run_id | before_spawn | Recorded on durable EnforcementEvent |
Counter Key
spawn_concurrent:{agent_name}:{workflow_name}
Auto-prefixed with tenant by TenantAwareRedis. The SpawnLimitHandler.build_counter_key() staticmethod is the canonical builder -- the dispatcher uses the same function to ensure key parity.
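A key builder matching the format above might look like the sketch below. The class name `SpawnLimitHandlerSketch` is hypothetical, and the tenant prefix added by TenantAwareRedis is not modeled because its format is not given here.

```python
class SpawnLimitHandlerSketch:
    """Illustrative only: mirrors the documented key format, nothing more."""

    @staticmethod
    def build_counter_key(agent_name: str, workflow_name: str) -> str:
        # Format taken from the Counter Key section; tenant prefix not included.
        return f"spawn_concurrent:{agent_name}:{workflow_name}"


# Dispatcher and handler both call the same builder, so the key they
# write to and the key they read from cannot drift apart.
dispatcher_key = SpawnLimitHandlerSketch.build_counter_key(
    "research-agent", "batch-research")
handler_key = SpawnLimitHandlerSketch.build_counter_key(
    "research-agent", "batch-research")
```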
Example Policy
{
  "name": "Research Fleet Concurrency Cap",
  "category": "spawn-limit",
  "rules": {
    "concurrent_spawn_limit": 25,
    "action": "block"
  },
  "scope": {
    "agents": ["research-agent"],
    "workflows": ["batch-research"]
  },
  "enabled": true
}
SDK Integration
import waxell_observe as waxell

waxell.init()

@waxell.observe(agent_name="research-agent", enforce_policy=True)
async def batch_research(topics: list[str]) -> list[str]:
    # ctx.spawn calls trigger before_spawn: the policy reads the
    # Redis counter and halts if this batch + in-flight would exceed
    # concurrent_spawn_limit for (research-agent, batch-research).
    return await ctx.spawn_many("worker-agent", topics)
Observability
| Field | Example |
|---|---|
| Category | spawn-limit (recorded as concurrency in EnforcementEvent) |
| Action | block |
| Reason | "concurrent spawn limit breached: 23 in flight + 5 requested would reach 28 (cap 25) for scope agent=research-agent workflow=batch-research" |
| Metadata | {"cap": 25, "current_in_flight": 23, "requested": 5, "would_be": 28, "counter_key": "spawn_concurrent:research-agent:batch-research", "workflow_name": "batch-research"} |
Every block / warn writes a durable EnforcementEvent (handler="spawn-limit", category="concurrency") so blocks-in-last-24h-by-agent is a cheap query in the admin surface.
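To make the relationship between the Reason and Metadata fields concrete, here is a small sketch that derives both from the same four inputs. The function name `build_breach_details` is hypothetical; the field names and message wording are copied from the example table above, and how the real handler assembles them is an assumption.

```python
from typing import Any


def build_breach_details(cap: int, current_in_flight: int, requested: int,
                         agent_name: str, workflow_name: str
                         ) -> tuple[str, dict[str, Any]]:
    """Return (reason, metadata) shaped like the Observability example."""
    would_be = current_in_flight + requested
    counter_key = f"spawn_concurrent:{agent_name}:{workflow_name}"

    reason = (
        f"concurrent spawn limit breached: {current_in_flight} in flight + "
        f"{requested} requested would reach {would_be} (cap {cap}) "
        f"for scope agent={agent_name} workflow={workflow_name}"
    )
    metadata = {
        "cap": cap,
        "current_in_flight": current_in_flight,
        "requested": requested,
        "would_be": would_be,
        "counter_key": counter_key,
        "workflow_name": workflow_name,
    }
    return reason, metadata
```

Calling it with cap 25, 23 in flight, and 5 requested for research-agent / batch-research reproduces the values shown in the table.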
Common Gotchas
- supported_planes = ["runtime"]. This policy never fires on the observe plane. Agents instrumented purely via waxell-observe (no governed runtime) cannot enforce it. The before_spawn call site only exists in the runtime dispatcher.
- The handler does NOT increment / decrement the counter. That is the dispatcher's job. If you swap dispatchers, ensure the new one also calls inc on enqueue and dec on completion using the same key builder.
- Redis read failure fails open. If the Redis read raises, the handler logs a warning and returns ALLOW. This is intentional (a Redis outage should not block spawns) but means a degraded Redis cluster will silently disable the cap.
- concurrent_spawn_limit <= 0 disables the check. Setting it to 0 or negative is a no-op ALLOW, not "block all spawns". The tightest enforceable cap is 1: under the current + total_children > limit check, that still admits a single-child spawn when nothing is in flight and blocks any request that would bring the total above 1.
- Scope narrowing happens upstream. By the time the handler runs, DynamicPolicyManager has already filtered to in-scope policies via applies_to. The counter still reads the per-(agent, workflow) key, so a tenant-wide policy with no scope filter still counts per-agent-per-workflow buckets, not a single tenant-total counter.
- mid_execution is the entry point in Phase 1.0. The hook stamps context._pending_spawn then calls mid_execution_detailed. Non-spawn mid_execution invocations (tool calls, LLM calls) return ALLOW immediately.
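Two of the gotchas above, the non-positive limit and the fail-open read, are easy to get wrong when writing a replacement handler. The sketch below shows one way they could fit together. `check_spawn`, `unavailable_counter`, and the logger name are hypothetical, and the counter read is passed in as a plain callable rather than a real Redis client.

```python
import logging
from typing import Callable

logger = logging.getLogger("spawn_limit_sketch")  # hypothetical logger name


def unavailable_counter(key: str) -> int:
    """Simulates a counter read during a Redis outage."""
    raise ConnectionError("counter backend unavailable")


def check_spawn(read_counter: Callable[[str], int], key: str,
                total_children: int, limit: int,
                action: str = "block") -> str:
    # A limit of 0 or below disables the check entirely.
    if limit <= 0:
        return "ALLOW"

    try:
        current = read_counter(key)
    except Exception as exc:
        # Fail open: log a warning and let the spawn proceed.
        logger.warning("counter read failed for %s, failing open: %s", key, exc)
        return "ALLOW"

    if current + total_children > limit:
        return "WARN" if action == "warn" else "BLOCK"
    return "ALLOW"
```

With a limit of 1, `check_spawn(lambda k: 0, key, 1, 1)` returns ALLOW and `check_spawn(lambda k: 1, key, 1, 1)` returns BLOCK. Since a failed read looks the same as a healthy ALLOW to callers, the warning log is the only signal that the cap is not being enforced.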
Next Steps
- Rate-Limit Policy -- Requests/second, complementary to concurrent-spawn caps
- Budget Policy -- Cap fan-out cost, not just concurrency
- Operations Policy -- Other runtime guardrails
- Policy Categories