
Connect Agent Governance

This page is the operator's overview of how Waxell governs external agents — agents you didn't author with the Waxell SDK but that you want to bring under policy control. Today that means:

  • Claude Code (Anthropic's CLI coding agent)
  • Claude Cowork (Anthropic's hosted multi-agent product)
  • Codex (OpenAI's CLI coding agent)
  • MCP agents (any third-party agent that speaks Model Context Protocol)
  • Custom agents (your own scripts registered against a Connect API key)

If you're building an agent with the Waxell SDK, see the Observe governance docs — the same policy categories apply, but the enforcement story is simpler because the runtime is yours.


Why Connect governance

A typical mid-size company in 2026 has Claude Code installed on every developer laptop, Cowork running unattended overnight on a shared dev box, Codex in CI, and a half-dozen MCP-based agents glued together by ops. None of these run inside your code. None of them go through a single chokepoint you control. And every one of them spends money, touches files, and calls third-party APIs on your behalf.

The problems:

  • Spend opacity. Token costs land in three different vendor invoices.
  • Shadow tool use. Engineers add MCP servers (GitHub, Notion, Postgres, anything) to their local agent without anyone reviewing them.
  • No audit consistency. Each agent has its own log format. Compliance can't answer "show me every code-execution event for engineer X last week."
  • No kill switch. When an agent loops or burns through a budget, you find out from the credit-card alert.

Connect governance is the lever-set for these problems: one policy model, one provenance trail, one place to configure it, applied consistently across every external agent you have running.


How it works (architecture in 30 seconds)

┌──────────────────────┐        ┌──────────────────┐
│ External agent on    │        │ Cloud chokepoint │
│ user's machine       │ ─────► │ (LLM proxy +     │ ── policy eval ──► allow / block / warn
│ (CC / Codex / MCP)   │        │ Connect domains) │
└──────────┬───────────┘        └──────────────────┘
           │
           ├── Local guard / hooks / network filter ── policy eval (preventive)
           │
           └── Audit stream (Cowork only) ─── policy eval (detective, ~5s lag)

Three enforcement seams:

  1. Cloud chokepoint. Every Connect-routed call (POST /domains/{entity}/{action} or LLM proxy request) is evaluated server-side before the action runs. Preventive.
  2. On-machine. Claude Code hooks, Codex wrappers, the network filter, and the Cowork MCP shim catch actions before they happen on the laptop. Preventive for any policy that is enforced at this seam.
  3. Audit stream. For Cowork (and other agents that we can only watch, not intercept) we stream audit.jsonl events to the cloud and evaluate them post hoc. Detective.

Every block/warn/alert carries a decision provenance record (metadata.provenance on the result) with policy_id, agent_id, agent_type, enforcement_model, phase, and surface — see Decision provenance below.


Enforcement models

Each policy carries an enforcement_model field with one of three values. Operators choose the model when they create the policy; for some categories, the model is implicit (audit is always detective; allowlists are always preventive).

Model | Meaning | When the decision happens | Can it stop the action?
------|---------|---------------------------|------------------------
preventive | Block-before-action | Synchronously, before the action runs | Yes (raises a violation that the agent's chokepoint must honor)
detective | Alert-after-action | After the event lands in the audit / event stream | No. The action already happened; auto-kill / revoke can be triggered reactively.
shaping | Guide via prompt / tool list | Up front, by what the agent can see | Indirectly: the agent never tries the action
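
The table can also be read as a small decision rule. The sketch below is illustrative only: the enum and the can_stop_action helper are hypothetical names, and only the three model values come from this page.

```python
from enum import Enum


class EnforcementModel(str, Enum):
    """The three enforcement_model values described on this page."""
    PREVENTIVE = "preventive"
    DETECTIVE = "detective"
    SHAPING = "shaping"


def can_stop_action(model: EnforcementModel) -> str:
    """Summarize the 'Can it stop the action?' column (hypothetical helper)."""
    if model is EnforcementModel.PREVENTIVE:
        return "yes"         # decided synchronously, before the action runs
    if model is EnforcementModel.DETECTIVE:
        return "no"          # the action already happened; only reactive kill/revoke
    return "indirectly"      # shaping: the agent never tries the action


for model in EnforcementModel:
    print(f"{model.value}: {can_stop_action(model)}")
```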

Per agent type

Not every model is available on every agent. The matrix below — backed by the surface field on decision provenance (G6) and the COWORK_ENFORCEMENT_MATRIX.md — describes what's possible:

Agent typepreventivedetectiveshapingNotes
claude-codeHooks block before action; OTel + audit catch what hooks miss; system prompt + tool list shape behavior.
claude-coworkpartialNo in-VM hooks. Preventive only at cloud chokepoint and via the MCP shim's waxell_check_policy. Local file edits / terminal commands cannot be blocked — see Cowork governance topology.
codexpartialpartialCodex has no public hook API. Preventive at cloud chokepoint + network filter; detective via session telemetry. Shaping via system-prompt config.
mcp (generic)✅ at cloudThe whole agent talks to us through Connect domains — every call is the chokepoint.
customdependsWhatever your wrapper code chooses to enforce.

Decision provenance

Every governance decision carries a provenance record. Example payload (returned by the cloud handshake, surfaced in trace detail UI):

{
  "metadata": {
    "provenance": {
      "policy_id": "abc123",
      "policy_name": "Claude Code Daily Budget",
      "policy_category": "cost",
      "enforcement_model": "preventive",
      "phase": "before_workflow",
      "surface": "cloud",
      "agent_id": "agent_xyz",
      "agent_type": "claude-code",
      "agent_groups": ["engineering", "interns"]
    }
  }
}

The surface field (one of in-process | cloud | network | audit-stream) tells you where the decision was made — useful when debugging "why did this not block on my laptop." See infra/waxell-infra/src/waxell_infra/policies/dynamic/manager.py _attach_provenance for the implementation.
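
When debugging, it can help to reduce a provenance record to one line. The following is a minimal sketch that assumes the result is available as a parsed dict shaped like the example payload; describe_decision is a hypothetical helper, not part of Waxell.

```python
import json

# Example payload copied from this section.
RESULT = json.loads("""
{
  "metadata": {
    "provenance": {
      "policy_id": "abc123",
      "policy_name": "Claude Code Daily Budget",
      "policy_category": "cost",
      "enforcement_model": "preventive",
      "phase": "before_workflow",
      "surface": "cloud",
      "agent_id": "agent_xyz",
      "agent_type": "claude-code",
      "agent_groups": ["engineering", "interns"]
    }
  }
}
""")

# The four surface values listed on this page.
KNOWN_SURFACES = {"in-process", "cloud", "network", "audit-stream"}


def describe_decision(result: dict) -> str:
    """Return a one-line summary of where and how a decision was made."""
    prov = result.get("metadata", {}).get("provenance")
    if prov is None:
        return "no provenance attached"
    surface = prov.get("surface")
    if surface not in KNOWN_SURFACES:
        return f"unknown surface: {surface!r}"
    return (
        f"{prov['policy_name']} ({prov['enforcement_model']}, "
        f"phase={prov['phase']}) decided at surface={surface} "
        f"for {prov['agent_type']} agent {prov['agent_id']}"
    )


print(describe_decision(RESULT))
```

A summary that reports surface=cloud for a policy you expected to fire locally suggests the decision never reached the on-machine seam.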


Scope dimensions

A single policy can narrow which agents it applies to through several scope dimensions. They compose with AND semantics: a policy with both scope_agents and scope_agent_groups set applies only to agents that match both.

Dimension | Field | Example
----------|-------|--------
Agent name | scope_agent_names | ["my-coding-bot", "my-research-bot"]
Agent ID | scope_agent_ids | ["agent_xyz"]
Agent type | scope_agent_types | ["claude-code", "codex"]
Agent groups | scope_agent_groups | ["engineering"] (groups defined in ConnectAgentGroup)
Enforcement model | scope.enforcement_models | ["preventive"] (skip the same policy from re-firing as a detective check)
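
The AND semantics can be sketched as follows. This is not the policy interpreter's code: the Agent class and policy_applies are hypothetical, and the sketch assumes that within a single dimension matching any listed value is enough, and that an unset dimension matches every agent.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    agent_id: str
    agent_type: str
    groups: list = field(default_factory=list)


def policy_applies(policy: dict, agent: Agent) -> bool:
    """AND across dimensions; an unset or empty dimension matches everything.

    Assumption: within one dimension, matching any listed value is enough.
    """
    checks = [
        ("scope_agent_names", {agent.name}),
        ("scope_agent_ids", {agent.agent_id}),
        ("scope_agent_types", {agent.agent_type}),
        ("scope_agent_groups", set(agent.groups)),
    ]
    for field_name, agent_values in checks:
        wanted = policy.get(field_name)
        if wanted and not (set(wanted) & agent_values):
            return False  # one failing dimension excludes the agent
    return True


intern_cap = {
    "scope_agent_types": ["claude-code"],
    "scope_agent_groups": ["interns"],
}
intern = Agent("my-coding-bot", "agent_xyz", "claude-code", ["engineering", "interns"])
staff = Agent("my-research-bot", "agent_abc", "claude-code", ["engineering"])

print(policy_applies(intern_cap, intern))  # type and group both match
print(policy_applies(intern_cap, staff))   # type matches, group does not
```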

Cookbook

"Apply this only to Claude Code agents in the 'interns' group"

{
  "name": "Intern Claude Code Cap",
  "category": "cost",
  "scope_agent_types": ["claude-code"],
  "scope_agent_groups": ["interns"],
  "config": {
    "daily_cost_limit": 5.00,
    "action_on_exceed": "block"
  }
}
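
As a rough illustration of how a cap like this could be evaluated, the sketch below checks projected spend against daily_cost_limit. The cost_decision function is hypothetical, and whether the real evaluator uses projected or already-incurred spend is an assumption.

```python
def cost_decision(spent_today: float, next_call_cost: float, cfg: dict) -> str:
    """Return 'allow' or the policy's action_on_exceed for the next call.

    Assumption: the cap is checked against projected spend, so the call that
    would cross the limit is the one that gets stopped.
    """
    projected = spent_today + next_call_cost
    if projected > cfg["daily_cost_limit"]:
        return cfg["action_on_exceed"]
    return "allow"


intern_cap = {"daily_cost_limit": 5.00, "action_on_exceed": "block"}

print(cost_decision(4.50, 0.25, intern_cap))  # still under the 5.00 cap
print(cost_decision(4.90, 0.25, intern_cap))  # would cross the cap
```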

"Apply to all external agents — claude-code, codex, and mcp"

{
  "name": "External Egress Allowlist",
  "category": "network",
  "scope_agent_types": ["claude-code", "codex", "mcp"],
  "config": {
    "allowed_domains": ["api.openai.com", "api.anthropic.com", "*.internal.com"],
    "block_external": true,
    "action_on_violation": "block"
  }
}
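
The wildcard entry in allowed_domains raises the question of how hosts are matched. The sketch below assumes shell-style glob matching, compared case-insensitively; the real network filter may match differently, and egress_decision is a hypothetical name.

```python
from fnmatch import fnmatchcase

config = {
    "allowed_domains": ["api.openai.com", "api.anthropic.com", "*.internal.com"],
    "block_external": True,
    "action_on_violation": "block",
}


def egress_decision(host: str, cfg: dict) -> str:
    """Return 'allow' or the configured action for an outbound host.

    Assumption: patterns are shell-style globs compared case-insensitively.
    """
    host = host.lower().rstrip(".")
    for pattern in cfg["allowed_domains"]:
        if fnmatchcase(host, pattern.lower()):
            return "allow"
    return cfg["action_on_violation"] if cfg["block_external"] else "allow"


for host in ["api.openai.com", "build.internal.com", "internal.com", "example.org"]:
    print(host, "->", egress_decision(host, config))
```

Under glob matching, "*.internal.com" covers subdomains at any depth but not the bare internal.com, so list the apex separately if it should be reachable.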

"Detective audit on everything, no scope filter"

{
  "name": "Org-wide Audit",
  "category": "audit",
  "enforcement_model": "detective",
  "config": {
    "log_inputs": true,
    "log_outputs": true,
    "retention_days": 365
  }
}

Lifecycle: from key to enforced policy

When a Connect agent registers via the resolve-key endpoint, Waxell does a handshake that returns the agent's currently-applicable governance state. This is D3 in the implementation plan and lands at controlplane/.../views/connect_views.py:_build_governance_handshake (~lines 1268-1338).

┌─────────┐  POST /api/v1/connect/agents/resolve-key/   ┌──────────────────┐
│  Agent  │ ───────────────────────────────────────────►│   Controlplane   │
└─────────┘                                             └──────┬───────────┘
                                                               │ resolve agent
                                                               │ collect attached templates
                                                               │ compute guard etag
                                                               │
┌─────────┐  200 { agent_id, governance: {              ┌──────▼───────────┐
│  Agent  │◄──   attached_template_names: [...],        │   Controlplane   │
└─────────┘      guard_etag: "...",                     └──────────────────┘
                 mcp_tool_hints: [...]
               } }

The agent (or its installer / hooks layer) then:

  1. Caches the attached template names so the local guard knows what categories to enforce locally.
  2. Stores the guard_etag so it can poll /api/v1/connect/guard-config/ for changes.
  3. Registers the mcp_tool_hints (e.g. waxell_check_policy, waxell_budget_status, waxell_record_decision) as MCP tools it can call mid-session. This is the "ask the cloud if I can do this" path used by Cowork.

From that point on:

  • Every Connect domain call goes through the policy interpreter on the cloud.
  • Every Claude Code hook event hits the local guard, which evaluates against the cached config.
  • Every Cowork session writes to audit.jsonl; the watcher streams it to the cloud where detective evaluators fire.

If you re-register or rotate the API key, the handshake re-runs and you get a fresh guard_etag. Local guards re-fetch config when the etag changes.
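
The cache-and-poll behavior described above can be sketched as below. GuardCache is a hypothetical class, the transport is injected as a plain callable so the example runs offline, and the {"etag": ..., "config": ...} response shape for /api/v1/connect/guard-config/ is an assumption rather than the documented contract.

```python
from typing import Callable, Optional


class GuardCache:
    """Holds the governance state returned by the handshake (illustrative)."""

    def __init__(self, handshake: dict):
        gov = handshake["governance"]
        self.agent_id = handshake["agent_id"]
        self.template_names = list(gov["attached_template_names"])
        self.guard_etag = gov["guard_etag"]
        self.mcp_tool_hints = list(gov["mcp_tool_hints"])
        self.config: Optional[dict] = None

    def refresh(self, fetch_guard_config: Callable[[], dict]) -> bool:
        """Poll once; re-cache the config only if the etag changed.

        fetch_guard_config stands in for a GET to /api/v1/connect/guard-config/.
        Its response shape ({"etag": ..., "config": ...}) is an assumption.
        """
        latest = fetch_guard_config()
        if latest["etag"] == self.guard_etag and self.config is not None:
            return False
        self.guard_etag = latest["etag"]
        self.config = latest["config"]
        return True


handshake = {
    "agent_id": "agent_xyz",
    "governance": {
        "attached_template_names": ["Claude Code Daily Budget"],
        "guard_etag": "etag-1",
        "mcp_tool_hints": ["waxell_check_policy", "waxell_budget_status"],
    },
}
cache = GuardCache(handshake)

# Three simulated polls: initial load, no change, then a changed etag.
responses = iter([
    {"etag": "etag-1", "config": {"version": 1}},
    {"etag": "etag-1", "config": {"version": 1}},
    {"etag": "etag-2", "config": {"version": 2}},
])
results = [cache.refresh(lambda: next(responses)) for _ in range(3)]
print(results)
```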


Policy categories

The 26 standard categories and 3 allowlist categories (A6) are seeded by infra/waxell-infra/src/waxell_infra/policies/seed.py. For applicability per agent type, see POLICY_APPLICABILITY.md. For which templates ship pre-configured per type, see POLICY_COVERAGE_MATRIX.md.

The categories most relevant to Connect agents:

Category | What it does | Notes for Connect
---------|--------------|------------------
cost | Token / dollar caps per session and per day | Per-agent and per-group windows
rate-limit | RPS / RPM / concurrent-session caps | Cloud chokepoint enforcement
kill | Auto-disable on error rate | Reactive: kills the agent's API key
audit | Capture inputs/outputs/steps with retention | Always detective
safety | max_steps, max_tool_calls, deny-list of tools | Pairs with tool-allowlist for positive control
network | Domain allow/block, protocol restrictions | Enforced at network filter (desktop app) + LLM proxy
scope | Records-modified / files-changed / transaction caps | Counts come from hook payloads + audit stream
code-execution | Sandbox required, blocked commands, allowed languages | Hook-layer for CC; audit for Cowork
content | PII / credential / prompt-injection scanning | Runs on inputs and outputs that cross the cloud
privacy | Data minimization, residency, retention by type | Detective: applies to what we capture
identity | AI disclosure footer + impersonation judge | Shaping (prepend/append) + post-hoc judge
compliance | Meta-validator (HIPAA, SOC 2, etc.) | Asserts other categories are configured

Codex-specific templates

OpenAI's Codex CLI is the third-party agent we explicitly support for governance from day one. The seed ships 8 Codex templates (all scope_agents=["codex"]):

  1. Codex Session Budget (cost, preventive) — seed.py:1638-1666
  2. Codex Daily Budget (cost, preventive) — seed.py:1667-1695
  3. Codex Rate Limit (rate-limit, preventive) — seed.py:1696-1715
  4. Codex Kill Switch (kill, preventive) — seed.py:1716-1737
  5. Codex Tool Boundary (safety, preventive) — seed.py:1738-1759
  6. Codex Network Egress (network, preventive) — seed.py:1760-1783
  7. Codex File Scope (scope, preventive) — seed.py:1784-1805
  8. Codex Full Audit (audit, detective) — seed.py:1806-1828

There's no Codex Model Restriction template — Codex picks its own model and we don't intercept that path today. The set deliberately mirrors Claude Code's cost / rate-limit / kill / safety / network / scope coverage with Codex-tuned defaults.


Cowork governance topology

Cowork is different. This subsection exists because operators repeatedly assume Cowork enforcement looks like Claude Code's. It does not.

Claude Cowork runs in a sandboxed VM that we don't control. There is no ~/.claude/settings.json to install hooks into, and no PreToolUse-style interception is possible for actions Cowork takes inside the VM (file edits, terminal commands, third-party MCP calls, raw model conversation).

What we do have:

  • Detective evaluation on the audit stream. Every Cowork session writes audit.jsonl to ~/Library/Application Support/Claude/local-agent-mode-sessions/ (macOS) / %APPDATA%\Claude\local-agent-mode-sessions\ (Windows). The watcher (observe/.../integrations/cowork/audit_watcher.py) tails these files and streams events to the cloud. Detective policies fire ~5 seconds after the action.
  • Preventive enforcement at the cloud chokepoint. Anything Cowork does through Connect (creating tasks, posting to channels, uploading files, querying tables) is governable preventively — those calls hit our domain endpoints.
  • Reactive kill via ConnectAgentLinkedKey revocation. When a detective policy fires, we can revoke the agent's keys, killing the next API call. The current session continues until it tries to refresh credentials.
  • Shaping via the MCP shim. The Cowork MCP shim (observe/.../integrations/cowork/mcp_server.py) exposes three tools: waxell_check_policy(action_kind, action_summary), waxell_budget_status(), waxell_record_decision(reasoning, decision, confidence). A well-configured Cowork session calls waxell_check_policy before taking risky actions. This is the proactive shaping path — Cowork voluntarily checks in with us. See E7 in EXTERNAL_AGENT_GOVERNANCE.md.
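
The voluntary check-in pattern from the last bullet might look like the sketch below. Only the waxell_check_policy(action_kind, action_summary) signature comes from this page; the {"decision": ...} return shape, the guarded_action wrapper, and the fake policy are assumptions made so the example is runnable.

```python
from typing import Callable


def guarded_action(
    check_policy: Callable[[str, str], dict],
    action_kind: str,
    action_summary: str,
    run: Callable[[], str],
) -> str:
    """Ask the shim before acting; skip the action unless the answer permits it.

    check_policy stands in for the waxell_check_policy MCP tool. The
    {"decision": ...} return shape is an assumption, not the shim's contract.
    """
    verdict = check_policy(action_kind, action_summary)
    decision = verdict.get("decision", "block")  # fail closed on odd replies
    if decision == "allow":
        return run()
    if decision == "warn":
        return "warned: " + run()
    return f"skipped ({decision}): {action_summary}"


def fake_check_policy(action_kind: str, action_summary: str) -> dict:
    """Stand-in policy: block anything described as touching production."""
    if "production" in action_summary:
        return {"decision": "block"}
    return {"decision": "allow"}


print(guarded_action(fake_check_policy, "terminal", "run unit tests", lambda: "tests ran"))
print(guarded_action(fake_check_policy, "terminal", "drop production table", lambda: "dropped"))
```

Because the check is voluntary, this path shapes behavior rather than enforcing it: a session that never calls the tool is only caught by the detective audit stream.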

What this means in the UI: every Cowork policy detail page should show:

Cowork enforcement: detective. Violations are alerted within ~5s; auto-kill is optional.

Don't write copy that promises preventive enforcement on local Cowork actions. Find the cloud chokepoint, the audit-stream check, or the MCP-shim shaping path instead.

For the per-category Cowork-applicability matrix, see COWORK_ENFORCEMENT_MATRIX.md (E2 in the engineering plan).


Allowlists (A6) — the three new categories

Three categories were added on 2026-05-03 specifically for the external-agent governance story:

tool-allowlist

Positive-list governance for which tools an agent may invoke. Differs from safety.blocked_tools (deny-only) by adding an explicit allowlist — absence from the allowlist blocks. Critical for Codex / generic MCP whose tool surface is unbounded.

{
  "category": "tool-allowlist",
  "config": {
    "allowed_tools": ["Read", "Write", "Edit", "Bash"],
    "blocked_tools": [],
    "action_on_violation": "warn"
  }
}

Recommended rollout: ship as warn first, populate allowed_tools from observed traffic, then promote to block.
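
The warn-then-block rollout can be sketched as below. Both functions are hypothetical; the sketch assumes blocked_tools takes precedence over allowed_tools, and the min_calls threshold for promoting an observed tool is an arbitrary illustration.

```python
from collections import Counter


def evaluate_tool(tool: str, cfg: dict) -> str:
    """Return 'allow' or the configured action for a tool call.

    Assumption: blocked_tools wins over allowed_tools when both list a tool.
    """
    if tool in cfg["blocked_tools"] or tool not in cfg["allowed_tools"]:
        return cfg["action_on_violation"]
    return "allow"


def suggest_allowlist(observed_calls: list, min_calls: int = 2) -> list:
    """Propose allowed_tools from traffic seen while the policy was in 'warn'."""
    counts = Counter(observed_calls)
    return sorted(tool for tool, n in counts.items() if n >= min_calls)


warn_cfg = {
    "allowed_tools": ["Read", "Write", "Edit", "Bash"],
    "blocked_tools": [],
    "action_on_violation": "warn",
}
observed = ["Read", "Read", "Edit", "Bash", "Edit", "WebFetch"]

# Promote: keep only tools seen often enough, then switch the action to block.
block_cfg = {
    **warn_cfg,
    "allowed_tools": suggest_allowlist(observed),
    "action_on_violation": "block",
}

print(evaluate_tool("Glob", warn_cfg))       # not on the list, policy still in warn
print(block_cfg["allowed_tools"])
print(evaluate_tool("WebFetch", block_cfg))  # seen once, so not promoted
```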

mcp-server-allowlist

Constrain which MCP servers an agent may register or connect to. Today's MCP ecosystem lets agents register arbitrary third-party servers; this lever pins them to a curated set.

{
  "category": "mcp-server-allowlist",
  "config": {
    "allowed_servers": ["github", "filesystem", "waxell-observe-claude-code"],
    "blocked_servers": [],
    "action_on_violation": "block"
  }
}

Especially important for the mcp agent type, where connecting to MCP servers is the agent's whole purpose.

prompt-allowlist

Constrain which named system-prompt templates an agent may load. Cross-references the prompt-management system. Use case: enterprise compliance where prompts are vetted assets and engineers shouldn't be free to swap them.

{
  "category": "prompt-allowlist",
  "config": {
    "allowed_prompts": ["legal-reviewed-v3", "approved-internal-prompt"],
    "blocked_prompts": [],
    "action_on_violation": "block"
  }
}

Honesty note: Today these three categories evaluate at the cloud chokepoint only. Local-tool calls and locally-registered MCP servers bypass them until the desktop policy poller (UNIFIED_DESKTOP_APP plan, Phase B) ships and consumes the new attached_template_names from the D3 handshake. Until then, treat allowlists as detective-only on claude-code / codex and preventive on mcp / claude-cowork (which talk to us through cloud chokepoints anyway).


Operator workflow

A typical "I want to add governance to a new external agent" flow:

  1. Register the agent. From the Connect UI: /connect/agents/new → choose agent_type → save. Or programmatically: POST /api/v1/connect/agents/ with agent_type: "claude-code". You get back a slug and api_key.
  2. Install on the user machine. Run the Waxell installer — it writes ~/.waxell/config and (for CC) ~/.claude/settings.json hook entries.
  3. Attach templates. From /connect/agents/<slug> → Governance tab → "Attach template" → pick from the templates that show under the agent's type.
  4. Verify. Trigger a small action; check the trace's Governance tab. Provenance should show your policy as the matching scope. If you see surface: "cloud" but expected "in-process", the local guard hasn't picked up the etag yet — reload the agent.
  5. Tighten. Start with warn. After a week of clean signal, promote to block.
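
For the programmatic variant of step 1, a request could be assembled roughly as follows. The endpoint path and the agent_type field come from this page; the host, the bearer-token auth scheme, and the build_register_request helper are assumptions. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "https://waxell.example.com"  # placeholder host, not a real endpoint


def build_register_request(agent_type: str, admin_token: str) -> urllib.request.Request:
    """Build (but do not send) the agent registration request."""
    body = json.dumps({"agent_type": agent_type}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/connect/agents/",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {admin_token}",  # auth scheme is assumed
        },
    )


req = build_register_request("claude-code", "example-token")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Per step 1, a successful response carries a slug and an api_key; store the key securely, since the installer in step 2 needs it.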

For engineers

If you're implementing or extending Connect governance, the engineering doc is the authoritative reference:

See also