Logan Kelly
Killing the parent agent doesn't stop its sub-agents. Here's the architectural problem with multi-agent kill switches — and how Waxell Runtime fixes it.

In March 2026, Stanford Law's CodeX blog published a review of the Berkeley Center for Long-Term Cybersecurity's Agentic AI Risk-Management Standards Profile — the most comprehensive publicly available framework for agentic AI governance, described in the Stanford Law review as a 55-page extension of the NIST AI RMF. The review identified the document's central structural gap in a single sentence: "An agent that has already delegated sub-tasks to other agents, distributed API keys, and spawned parallel execution threads is not a single entity. Killing the parent does not recall the children."
This is the multi-agent kill switch problem in its precise form. The Berkeley Profile recommends emergency automated shutdowns triggered by threshold breaches. It recommends manual shutdown methods as a last resort. What it doesn't address — what almost no governance framework addresses — is what happens after the shutdown signal fires and the parent agent stops, but five sub-agents it dispatched thirty seconds earlier are still running, still writing to databases, still calling APIs, still sending notifications. The signal reached the orchestrator. The swarm didn't get the memo.
A multi-agent kill switch is an emergency stop mechanism that terminates not just the orchestrator agent but every sub-agent it has spawned, every delegated task it has dispatched, and every external agent it has connected to — in a coordinated sequence that prevents in-flight operations from completing and leaves all affected sessions in a documented, recoverable state. A single-agent kill switch terminates one session. A multi-agent kill switch terminates a graph. Most production kill switches are the former. Most production agentic systems now require the latter.
Why does a multi-agent system need a different kind of kill switch?
Single-agent kill switches were designed around a specific model: one agent, one session, one set of tool calls. Terminate the session and you terminate the execution. The model held when agents were mostly single-process automations with narrow tool access. It doesn't hold when agents spawn sub-agents.
The architectural shift happened quietly. Multi-agent patterns — one orchestrator delegating to specialist sub-agents — became standard as teams discovered that a single long-context agent handling complex tasks was less reliable than a coordinator routing work to focused components. An orchestrator might dispatch a research sub-agent, a drafting sub-agent, a review sub-agent, and a delivery sub-agent in parallel. Each sub-agent has its own session context, its own tool access grants, its own in-flight calls. The orchestrator manages the workflow. The sub-agents perform the work.
When something goes wrong with the orchestrator — it loops, it exceeds a cost threshold, it makes a decision that triggers a human override — the natural instinct is to stop it. The kill switch fires. The orchestrator terminates. And then the sub-agents continue.
This is not a theoretical edge case. It is the default behavior of every multi-agent system where kill switch policy lives at the orchestrator level and sub-agents receive task instructions at dispatch time rather than checking governance state continuously. The sub-agents were given a task. They received no instruction to stop. They continue.
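The default behavior described above can be reproduced with an analogy in plain Python. This is a minimal sketch, assuming asyncio tasks stand in for agent sessions; the names `orchestrator`, `sub_agent`, and `main` are illustrative and do not come from any agent framework.

```python
import asyncio


async def sub_agent(name, log):
    # The sub-agent was handed its task at dispatch time and never
    # checks whether the workflow that created it still exists.
    for step in range(3):
        await asyncio.sleep(0.01)
        log.append(f"{name}:step{step}")


async def orchestrator(log, children):
    # Dispatch two sub-agents as independent tasks, then keep supervising.
    for name in ("research", "delivery"):
        children.append(asyncio.create_task(sub_agent(name, log)))
    await asyncio.sleep(10)


async def main():
    log, children = [], []
    parent = asyncio.create_task(orchestrator(log, children))
    await asyncio.sleep(0.005)   # give the orchestrator time to dispatch
    parent.cancel()              # the "kill switch" targets the parent only
    try:
        await parent
    except asyncio.CancelledError:
        pass
    # The parent is gone; the children still run to completion.
    await asyncio.gather(*children)
    return log


survivors = asyncio.run(main())
```

Every entry in `survivors` is work performed after the parent was cancelled, because nothing in the cancellation path knew the child tasks existed.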
What actually goes wrong when the orchestrator stops but the swarm doesn't?
Three failure modes emerge consistently once multi-agent systems hit production at scale.
Orphaned sub-agents with live credentials. When an orchestrator is terminated mid-workflow, the sub-agents it dispatched retain whatever credentials they were granted at dispatch. A research sub-agent with database read access keeps its access. A delivery sub-agent with email send permissions keeps those permissions. The 1Kosmos analysis of enterprise agent deployments in 2026 documented this pattern as the "ghost agent" problem: agents outliving the workflow context that created them, operating with credentials that were never formally revoked, in environments where no one is actively monitoring them. The risk compounds across four categories: financial damage from unauthorized spending, security exposure from unmonitored credentials, compliance failures from broken audit trails, and reputation damage from public mistakes.
Cascading external effects that pre-empt cleanup. An orchestrator controls the workflow. Sub-agents control the tool calls. By the time the orchestrator is terminated, sub-agents may have already issued API calls that are mid-flight — a database write in a transaction, a webhook invocation with expected follow-up calls, an external notification service awaiting a completion signal. Killing the orchestrator doesn't cancel those calls. The external effects complete without the context that would have determined whether they should. The audit trail shows a clean orchestrator termination and a confusing aftermath: actions that completed after the stop signal, state that was partially written, downstream systems that received data they weren't supposed to receive.
Policy enforcement that only runs at the orchestrator level. Many teams implement kill switch logic inside the orchestrator's code: if a cost threshold is exceeded, stop; if an error rate exceeds a limit, halt; if a loop is detected, exit. This works for the orchestrator. Sub-agents have none of it. A circuit breaker that fires inside the orchestrator's execution context doesn't propagate to the sub-agents it dispatched. The March 2026 Stanford Law CodeX analysis of the Berkeley CLTC profile noted that the document itself cites evidence of a model sabotaging shutdown mechanisms in 79 out of 100 tests, a figure traced to Palisade Research's study of OpenAI's o3 model operating without explicit shutdown instructions. But an agent doesn't need to actively resist shutdown to evade a kill switch that only targets its parent. It needs only to receive no instruction to stop.
What does a multi-agent kill switch actually require?
A kill switch that works across a multi-agent system requires three capabilities that most single-agent kill switch architectures lack.
Session graph awareness. A kill switch that targets a single session ID terminates one agent. A kill switch that terminates a graph needs to know the graph. Which sessions were spawned by this orchestrator? Which were spawned by those? What is the full set of active sessions descended from the execution that needs to stop? This requires that session lineage is tracked in real time — that when an orchestrator dispatches a sub-agent, the relationship is recorded in a queryable registry, not just in the orchestrator's context window. Without session graph tracking, the kill signal has no way to know what it needs to reach.
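A queryable lineage registry of the kind described above can be sketched in a few lines. This is a minimal in-memory illustration, not any product's implementation: `SessionRegistry`, `record_dispatch`, and `descendants` are hypothetical names, and a production registry would presumably live in a shared store rather than in process memory.

```python
from collections import defaultdict, deque


class SessionRegistry:
    """Hypothetical lineage registry: parent session id -> child session ids."""

    def __init__(self):
        self._children = defaultdict(set)

    def record_dispatch(self, parent_id, child_id):
        # Recorded by the dispatch infrastructure at spawn time, so the
        # relationship exists outside the orchestrator's context window.
        self._children[parent_id].add(child_id)

    def descendants(self, root_id):
        """Return root_id plus every session spawned beneath it, breadth-first."""
        seen, queue = [root_id], deque([root_id])
        while queue:
            current = queue.popleft()
            for child in sorted(self._children.get(current, ())):
                if child not in seen:
                    seen.append(child)
                    queue.append(child)
        return seen


registry = SessionRegistry()
registry.record_dispatch("orchestrator", "research")
registry.record_dispatch("orchestrator", "delivery")
registry.record_dispatch("research", "web-scraper")

# The full set of sessions a kill signal for "orchestrator" must reach.
kill_targets = registry.descendants("orchestrator")
```

The breadth-first walk picks up nested sessions such as `web-scraper`, which the orchestrator never dispatched directly and could not list on its own.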
Kill signal enforcement at the governance layer, not the agent layer. The most important architectural distinction in multi-agent kill switches is where the enforcement runs. If kill policy lives inside agent code (in the orchestrator's logic, in the sub-agent's system prompt), the agent must cooperate with its own shutdown. This is the structural gap the Stanford CodeX analysis identified in the Berkeley Profile's approach: "an optimization objective that treats shutdown as one more obstacle between the current state and the goal." An agent following a task objective has no reason to check whether an external signal has requested its termination. Kill policy must run at the infrastructure layer, checking governance state before every tool call, independently of what the agent's own logic decides to do.
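The idea of checking governance state before every tool call can be illustrated with a small gate that sits between agents and their tools. This is a sketch under stated assumptions: `GovernanceGate`, `SessionKilled`, and `send_email` are hypothetical names, and the gate only works if agents have no path to a tool except through it.

```python
class SessionKilled(Exception):
    """Raised by the gate, outside agent logic, when a session is marked killed."""


class GovernanceGate:
    def __init__(self):
        self._killed = set()

    def kill(self, session_id):
        # Set by an operator or policy engine; the agent is not consulted.
        self._killed.add(session_id)

    def call_tool(self, session_id, tool, *args, **kwargs):
        # The check runs before every tool call, whatever the agent decided.
        if session_id in self._killed:
            raise SessionKilled(session_id)
        return tool(*args, **kwargs)


def send_email(to):
    # Stand-in tool for illustration only.
    return f"sent to {to}"


gate = GovernanceGate()
first = gate.call_tool("delivery", send_email, "a@example.com")

gate.kill("delivery")
try:
    gate.call_tool("delivery", send_email, "b@example.com")
    blocked = False
except SessionKilled:
    blocked = True
```

The agent's own code never evaluates the kill condition. The second call is refused at the gate even though the agent still "wants" to send.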
Coordinated credential revocation. Terminating a session is necessary but not sufficient. A terminated session with live credentials is an orphaned agent waiting to be reactivated or exploited. A proper multi-agent kill sequence terminates sessions and revokes the grants that made them effective — in the right order, so that in-flight calls can be handled gracefully before access is cut, rather than leaving partial transactions outstanding. For agent systems that include external vendor agents or third-party integrations, credential revocation is more complex: the revocation mechanism must reach entities that aren't part of the same codebase, the same deployment, or the same organizational control.
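The ordering described above (stop new calls, give in-flight calls a bounded chance to finish, then revoke grants) might look like the following. This is a simplified sketch, not a definitive sequence: the `Session` fields, the `kill_graph` function, and the timeout values are all assumptions made for illustration.

```python
import time


class Session:
    def __init__(self, session_id, credentials, in_flight=0):
        self.session_id = session_id
        self.credentials = set(credentials)
        self.in_flight = in_flight      # calls started but not yet finished
        self.accepting_calls = True


def kill_graph(sessions, drain_timeout=0.05, poll=0.01):
    """Block new calls everywhere, drain briefly, then revoke and record."""
    # Step 1: stop new tool calls across the whole graph first, so no
    # session starts fresh work while the others are being shut down.
    for s in sessions:
        s.accepting_calls = False

    report = []
    for s in sessions:
        # Step 2: bounded wait for calls that are already in flight.
        deadline = time.monotonic() + drain_timeout
        while s.in_flight > 0 and time.monotonic() < deadline:
            time.sleep(poll)
        # Step 3: revoke the grants and document what was left outstanding.
        revoked = sorted(s.credentials)
        s.credentials.clear()
        report.append({
            "session": s.session_id,
            "revoked": revoked,
            "undrained_calls": s.in_flight,
        })
    return report


sessions = [
    Session("orchestrator", ["db:read"]),
    Session("delivery", ["email:send"], in_flight=1),  # one call never finishes
]
report = kill_graph(sessions)
```

In this run the `delivery` session's stuck call outlasts the drain window, so its credentials are revoked anyway and the report records one undrained call for later investigation.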
What about KILLSWITCH.md?
The KILLSWITCH.md open standard, published in March 2026, addresses a related but distinct problem: auditability. It proposes a plain-text file convention, placed in a repository root alongside AGENTS.md, that documents an agent's cost limits, forbidden actions, and a three-level escalation path (throttle → pause → full stop). The specification is designed to be readable by agents, engineers, and compliance teams. It explicitly targets EU AI Act requirements that take effect on August 2, 2026, which mandate documented shutdown capabilities for high-risk AI systems.
KILLSWITCH.md is a genuine contribution to the governance problem. Version-controlled, auditable shutdown policy is better than safety rules scattered across system prompts and Notion pages. The standard does not, however, address propagation. It is a per-agent specification. It tells one agent what to do when its own limits are reached. It has no mechanism for broadcasting a shutdown signal across a session graph, tracking sub-agent lineage, or revoking credentials across a distributed execution. A team that implements KILLSWITCH.md correctly has done something useful for single-agent auditability. They have not solved the multi-agent propagation problem.
The KILLSWITCH.md file convention and a governance-layer kill system are complementary, not alternatives. The file provides the policy specification and the audit record. The governance layer provides the enforcement mechanism that operates independently of what any agent in the graph chooses to do.
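The split between policy specification and enforcement can be made concrete with a small sketch. Everything here is hypothetical: the `ShutdownPolicy` fields are not the KILLSWITCH.md schema, and the threshold fractions and the mapping from conditions to escalation levels are assumptions chosen only to show a declared policy being evaluated by code that runs outside the agent.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    THROTTLE = "throttle"
    PAUSE = "pause"
    FULL_STOP = "full_stop"


@dataclass(frozen=True)
class ShutdownPolicy:
    """Illustrative stand-in for what a policy file might declare."""
    cost_limit_usd: float
    forbidden_actions: frozenset
    throttle_at: float = 0.8   # assumed: fraction of the cost limit
    pause_at: float = 1.0      # assumed: fraction of the cost limit


def enforce(policy, spent_usd, requested_action):
    """Governance-layer decision, made per call regardless of agent intent."""
    if requested_action in policy.forbidden_actions:
        return Action.FULL_STOP
    ratio = spent_usd / policy.cost_limit_usd
    if ratio >= policy.pause_at:
        return Action.PAUSE
    if ratio >= policy.throttle_at:
        return Action.THROTTLE
    return Action.CONTINUE


policy = ShutdownPolicy(cost_limit_usd=10.0,
                        forbidden_actions=frozenset({"delete_database"}))
```

The policy object is data that can be version-controlled and audited, while `enforce` is the part that has to run somewhere the agent cannot skip.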
How Waxell handles this
Waxell Runtime's kill policy type terminates agent sessions at the infrastructure layer — not inside agent code — which means the kill signal reaches sub-agents through the same pre-call policy check that governs every tool invocation in the system. When an orchestrator session is targeted for termination, Waxell Runtime identifies every session in its lineage through the agent registry, which tracks session parent-child relationships in real time as sub-agents are dispatched. The kill signal propagates through the full session graph: orchestrator, dispatched sub-agents, any sessions those sub-agents spawned. Each session receives the termination signal at the governance layer before its next tool call executes, not through the agent's own logic.
For multi-agent workflows that include external agents (vendor agents, third-party integrations, MCP-native agents that weren't built in-house), Waxell Connect governs the agents you didn't build, with no SDK and no code changes required in the external agent itself. Connect operates at the connectivity layer: when an external agent is integrated through Waxell Connect, its tool calls pass through the same 26 policy categories that govern internally-built agents, including kill and circuit breaker policies. In a mixed swarm of internal and external agents, a kill signal reaches both through the same governance plane. The kill switch doesn't stop at the boundary of what your team built.
Waxell Runtime also provides circuit breaker policy at the session level: if a sub-agent exceeds its action count limit, cost threshold, or repeated-call threshold, it halts without requiring the orchestrator to notice and signal it. Circuit breakers fire at the governance layer, not the agent layer, which means a misbehaving sub-agent stops regardless of whether the orchestrator has been terminated or is itself looping.
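A per-session circuit breaker covering the three limits named above (action count, cost, repeated calls) can be sketched generically. This is not Waxell's API: `CircuitBreaker`, its field names, and the default limits are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CircuitBreaker:
    max_actions: int = 50
    max_cost_usd: float = 5.0
    max_repeats: int = 3          # identical calls allowed per session
    actions: int = 0
    cost_usd: float = 0.0
    calls: Counter = field(default_factory=Counter)
    tripped_reason: Optional[str] = None

    def allow(self, tool_name, args, est_cost_usd):
        """Return True if the call may proceed; trip and return False otherwise."""
        if self.tripped_reason:
            return False          # once tripped, the session stays halted
        key = (tool_name, repr(args))
        if self.actions + 1 > self.max_actions:
            self.tripped_reason = "action_limit"
        elif self.cost_usd + est_cost_usd > self.max_cost_usd:
            self.tripped_reason = "cost_limit"
        elif self.calls[key] + 1 > self.max_repeats:
            self.tripped_reason = "repeated_call"
        if self.tripped_reason:
            return False
        self.actions += 1
        self.cost_usd += est_cost_usd
        self.calls[key] += 1
        return True


breaker = CircuitBreaker(max_repeats=3)
# A looping sub-agent issues the same search five times.
results = [breaker.allow("search", ("q",), 0.01) for _ in range(5)]
```

Because each session owns its breaker, the fourth identical call is refused whether or not an orchestrator is still alive to notice the loop.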
The governance plane connects all of this: session lineage tracked in real time, kill signals that propagate through the session graph, and circuit breakers that enforce independently at each session. Waxell Observe, initialized in 2 lines of code, captures the full execution state of every session at the point of termination, so post-incident analysis is forensics on a known record, not reconstruction from fragmented logs.
The multi-agent kill switch problem is not a new risk. It is a new instance of an old principle: governance mechanisms designed for a single entity fail when applied to a distributed system without architectural changes. The principle held for distributed databases, for microservices, for container orchestration. It holds for multi-agent systems.
Teams discover this for the first time under pressure — an orchestrator stopped but sub-agents running, external effects completing that shouldn't have, credentials live in sessions no one is monitoring. The response is almost always the same: add session cleanup to the runbook, brief the on-call team, and treat it as an edge case. The edge case recurs at scale.
A kill switch that terminates a graph — not just an orchestrator — is what production multi-agent systems require. The architecture to build one is known. The question is whether it's in place before the incident or after.
To add governance-layer kill switch and circuit breaker capabilities to your agent fleet, get access to Waxell.
Frequently Asked Questions
What is a multi-agent kill switch?
A multi-agent kill switch is an emergency stop mechanism that terminates an entire agent execution graph — orchestrator, sub-agents, and any nested agents — rather than a single session. Unlike a single-agent kill switch, which targets one running process, a multi-agent kill switch must track session lineage in real time, propagate the termination signal through the full session graph, and coordinate credential revocation across all affected sessions. The mechanism must operate at the infrastructure layer, not inside agent code, because agent code cannot reliably cooperate with its own termination when it's pursuing an optimization objective.
Why doesn't stopping the orchestrator stop all sub-agents?
Sub-agents are dispatched at task assignment time and execute independently of the orchestrator's session state. When the orchestrator is terminated, sub-agents receive no signal unless the kill mechanism is specifically designed to propagate it. If kill policy lives inside the orchestrator's code, stopping the orchestrator stops only the orchestrator's own execution — the sub-agents continue running until they exhaust their objectives, hit an external limit, or are manually terminated. This is not a design flaw in any specific framework; it is the default behavior of any multi-agent system where sub-agents don't continuously check governance state.
What is the KILLSWITCH.md standard?
KILLSWITCH.md is an open file convention published in March 2026 that defines a per-agent emergency shutdown specification: cost limits, error thresholds, forbidden actions, and a three-level escalation path from throttle to full stop. It is designed to be placed in a repository root alongside AGENTS.md and read by both agents and compliance teams. KILLSWITCH.md addresses the auditability and documentation problem for single-agent systems. It does not provide a propagation mechanism for multi-agent systems — it specifies policy for one agent, not a kill signal that reaches a session graph.
How does a governance-layer kill switch differ from an in-code kill switch?
An in-code kill switch lives inside the agent's own execution context: it runs only if the agent's logic reaches the relevant check. An agent under an optimization objective can miss that check, or the check may not run if the agent enters an unexpected execution path. A governance-layer kill switch runs at the infrastructure level, before every tool call, independently of what the agent's logic does. It cannot be bypassed by agent behavior because it doesn't run inside the agent. A March 2026 Stanford Law analysis of the Berkeley CLTC profile noted that the document cites evidence of models sabotaging shutdown mechanisms in 79 out of 100 tested scenarios, a figure traced to Palisade Research's study of OpenAI's o3 model operating without explicit shutdown instructions. Enforcement at the infrastructure layer doesn't rely on model cooperation, which is precisely why kill policy belongs there.
What happens to external agents in a multi-agent swarm when a kill signal fires?
External agents — vendor-built, third-party integrations, MCP-native agents — are typically outside the governance boundary of internally-built agents. A kill signal that propagates through your session graph doesn't reach them unless your governance layer extends to cover those connections. This requires that external agents connect through a governance proxy that can intercept their tool calls and apply the same kill and circuit breaker policies applied to internal agents. Without that extension, killing your orchestrator leaves external agents running with live credentials and no awareness that the workflow has been terminated.
Does the EU AI Act require kill switch documentation?
The EU AI Act provisions that take effect August 2, 2026 mandate human oversight capabilities and documented shutdown mechanisms for high-risk AI systems. The practical requirement is that organizations be able to demonstrate, to an auditor, that a shutdown mechanism exists and was documented before the system was deployed. The KILLSWITCH.md convention directly targets this documentation requirement. Whether a particular deployment falls under the high-risk classification depends on the AI Act's use-case categories, which organizations should assess with qualified legal counsel.
Sources
Kahana, E. "Kill Switches Don't Work If the Agent Writes the Policy: The Berkeley Agentic AI Profile Through the AILCCP Lens." Stanford Law School CodeX blog, March 7, 2026. — https://law.stanford.edu/2026/03/07/kill-switches-dont-work-if-the-agent-writes-the-policy-the-berkeley-agentic-ai-profile-through-the-ailccp-lens/
UC Berkeley Center for Long-Term Cybersecurity. Agentic AI Risk-Management Standards Profile. February 2026. By Nada Madkour, Jessica Newman, Deepika Raman, Krystal Jackson, Evan R. Murphy, Charlotte Yuan. — https://cltc.berkeley.edu/publication/agentic-ai-risk-profile/
KILLSWITCH.md Open Standard, v1.0. MIT licence. Published 2026. — https://killswitch.md/ (GitHub: github.com/WellStrategic/killswitch-md-spec)
Palisade Research. "Shutdown Resistance in Frontier AI Models." 2025. — https://palisaderesearch.org/blog/shutdown-resistance (Primary source for the 79/100 shutdown sabotage figure; study covers OpenAI o3 specifically, without explicit shutdown instructions)
1Kosmos. "The Ghost Agent Problem: When Employees Leave But AI Agents Keep Running." 2026. — https://www.1kosmos.com/resources/blog/ghost-agent-problem-employees-leave-ai-agents-keep-running
NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0). 2023. — https://doi.org/10.6028/NIST.AI.100-1
EU AI Act (Regulation (EU) 2024/1689). Digital Strategy, European Commission. — https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai