Operations Policy

The operations policy category applies operational controls to workflow execution. It currently has a single rule, timeout_seconds: the handler monitors run duration and records a warning when an agent exceeds its configured time limit.

Use it for SLA compliance monitoring, performance alerting, or tracking slow-running agents.

Rules

Rule	Type	Default	Description
`timeout_seconds`	integer (min 1)	`300`	Maximum allowed run duration in seconds. Checked post-execution for observed agents
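
The constraints in the table (integer, minimum 1, default 300) can be expressed as a small validation helper. This is a minimal sketch only: `resolve_timeout` and `DEFAULT_TIMEOUT_SECONDS` are hypothetical names, not part of the Waxell SDK, and the sketch assumes the policy rules arrive as a plain dict.

```python
DEFAULT_TIMEOUT_SECONDS = 300  # documented default for timeout_seconds


def resolve_timeout(rules: dict) -> int:
    """Return the effective timeout for an operations policy's rules.

    Hypothetical helper: it mirrors the documented constraints
    (integer, minimum 1, default 300), not the real handler code.
    """
    value = rules.get("timeout_seconds", DEFAULT_TIMEOUT_SECONDS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("timeout_seconds must be an integer")
    if value < 1:
        raise ValueError("timeout_seconds must be at least 1")
    return value
```

Calling `resolve_timeout({})` falls back to the default, while a value such as `0` or `"60"` is rejected before it can reach a policy.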

How It Works

Post-Hoc Enforcement

The operations handler does NOT preemptively kill agents. The agent always runs to completion. Timeout violations are detected after the fact and recorded as governance incidents (WARN). This is the appropriate behavior for observe-path agents, where an external framework controls execution.

Enforcement Phases

Phase	Behavior
`before_workflow`	Stores timeout in context. Returns ALLOW with "Timeout set to Xs"
`mid_execution`	Checks `context.duration` if available. Returns WARN if exceeded, ALLOW otherwise
`after_workflow`	Final duration check. Returns WARN if exceeded, ALLOW otherwise

Context Data

Attribute	Phase	Purpose
`context.duration`	mid_execution, after_workflow	Elapsed time of the workflow run (float, seconds)

Actions Returned

ALLOW -- duration within timeout, or no timeout configured, or duration not available
WARN -- duration exceeds timeout_seconds

The handler never returns BLOCK. Timeout violations are always WARN. This is by design -- for observe-path agents, the execution has already happened.
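
The decision rules above can be summarized in a few lines. The sketch below is an illustration under stated assumptions, not the handler's actual source: `evaluate_after_workflow` is a hypothetical name, and the reason strings for the missing-timeout and missing-duration cases are placeholders (the two timeout messages follow the examples shown elsewhere in this page).

```python
from typing import Optional, Tuple


def evaluate_after_workflow(
    duration: Optional[float], timeout: Optional[int]
) -> Tuple[str, str]:
    """Return (action, reason) for the final duration check.

    Hypothetical sketch of the documented behavior: only ALLOW or WARN
    is ever returned, never BLOCK.
    """
    if timeout is None:
        return "ALLOW", "No timeout configured"  # placeholder reason
    if duration is None:
        return "ALLOW", "Duration not available"  # placeholder reason
    if duration > timeout:
        return "WARN", f"Run exceeded timeout ({duration:.1f}s/{timeout}s)"
    return "ALLOW", f"Completed within timeout ({duration:.1f}s/{timeout}s)"
```

Note that a run whose duration equals the timeout exactly is treated as ALLOW here, since the rule is that WARN applies only when the duration exceeds timeout_seconds.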

Example Policies

Strict SLA (60 seconds)

Alert on any run exceeding 1 minute:

{
  "timeout_seconds": 60
}

Standard Monitoring (5 minutes)

Default timeout with monitoring:

{
  "timeout_seconds": 300
}

Long-Running Batch (1 hour)

For batch processing agents that legitimately run longer:

{
  "timeout_seconds": 3600
}

SDK Integration

Using the Context Manager

import waxell_observe as waxell

waxell.init()


async def run_agent(query: str):
    async with waxell.WaxellContext(
        agent_name="data-processor",
        enforce_policy=True,
    ) as ctx:
        # Operations policy stores timeout at before_workflow
        # Agent runs normally -- no blocking

        # process_data is your own agent coroutine
        result = await process_data(query)
        ctx.set_result(result)

        # after_workflow checks: if total duration > timeout_seconds -> WARN
        # WARN is recorded but does NOT raise PolicyViolationError

Using the Decorator

import waxell_observe as waxell


@waxell.observe(
    agent_name="data-processor",
    enforce_policy=True,
)
async def process_data(query: str):
    # Operations checks happen after this function returns.
    # long_running_analysis stands in for your own agent logic.
    return await long_running_analysis(query)

Enforcement Flow

Agent starts (WaxellContext.__aenter__)
    |
    +-- before_workflow: stores timeout (e.g., 60s) in context
    |   -> ALLOW "Timeout set to 60s"
    |
    +-- Agent executes (duration tracked automatically)
    |   |
    |   +-- mid_execution (if triggered):
    |   |   -> duration < timeout? ALLOW "Within timeout (30.0s/60s)"
    |   |   -> duration > timeout? WARN "Mid-run: approaching timeout (75.0s/60s)"
    |   |
    |   +-- Agent continues regardless of mid_execution result
    |
    +-- Agent completes
    |
    +-- after_workflow: final duration check
        -> duration < timeout? ALLOW "Completed within timeout (45.0s/60s)"
        -> duration > timeout? WARN "Run exceeded timeout (120.0s/60s)"
        -> WARN recorded as governance incident

Creating via Dashboard

Navigate to Governance > Policies
Click New Policy
Select category Operations
Set timeout_seconds to your desired limit
Set scope to target specific agents (e.g., data-processor)
Enable the policy

Creating via API

curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://acme.waxell.dev/waxell/v1/policies/ \
  -d '{
    "name": "SLA Timeout Monitor",
    "category": "operations",
    "rules": {
      "timeout_seconds": 60
    },
    "scope": {
      "agents": ["data-processor"]
    },
    "enabled": true
  }'
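
For teams that script policy creation from Python rather than the shell, the same request can be built with the standard library. This is a sketch that assumes the endpoint, headers, and payload shown in the curl example; `build_policy_request` is a hypothetical helper name, and the function only constructs the request without sending it.

```python
import json
import urllib.request

# Same endpoint as the curl example above.
POLICY_URL = "https://acme.waxell.dev/waxell/v1/policies/"


def build_policy_request(token: str, timeout_seconds: int = 60) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates the policy."""
    payload = {
        "name": "SLA Timeout Monitor",
        "category": "operations",
        "rules": {"timeout_seconds": timeout_seconds},
        "scope": {"agents": ["data-processor"]},
        "enabled": True,
    }
    return urllib.request.Request(
        POLICY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen`, with the token read from wherever you store it (the curl example uses a `TOKEN` environment variable).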

Observability

Governance Tab

Operations evaluations appear with:

Field	Example
Policy name	SLA Timeout Monitor
Action	`allow` or `warn`
Category	`operations`
Reason	"Completed within timeout (45.0s/60s)" or "Run exceeded timeout (120.0s/60s)"
Metadata	`{"duration": 120.0, "timeout": 60}`
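
The metadata field is enough to compute how far a run overshot its limit, which can be useful when ranking incidents. The sketch below assumes an evaluation has been exported as a dict; the key names (`policy_name`, `action`, and so on) are illustrative stand-ins for the fields in the table, and `timeout_overrun` is a hypothetical helper.

```python
def timeout_overrun(metadata: dict) -> float:
    """Seconds by which a run exceeded its timeout (0.0 if it did not)."""
    return max(0.0, float(metadata["duration"]) - float(metadata["timeout"]))


# Illustrative record; the key names are assumptions, the values come
# from the example column in the table above.
evaluation = {
    "policy_name": "SLA Timeout Monitor",
    "action": "warn",
    "category": "operations",
    "reason": "Run exceeded timeout (120.0s/60s)",
    "metadata": {"duration": 120.0, "timeout": 60},
}

print(timeout_overrun(evaluation["metadata"]))  # prints 60.0
```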

Governance Incidents

Timeout violations create governance incidents visible in:

The trace's Governance tab
The Governance Incidents list
Compliance Console (if Insights is enabled)

Combining with Other Policies

Operations + Kill Switch: Feed operations timeout warnings into kill switch error-rate monitoring. Repeated timeout violations may indicate an agent that should be killed.

Operations + Control: Combine timeout monitoring with cost threshold monitoring. Long-running agents often also consume more LLM tokens (higher cost).

Operations + Compliance: Include operations as a required category in a SOC 2 compliance profile to ensure all agents have timeout monitoring configured.

Common Gotchas

Returns WARN, never BLOCK. The agent always completes. Timeout violations are informational -- they create governance incidents but do not prevent execution.
context.duration may be None. If mid_execution fires before the duration attribute is populated, the handler returns ALLOW with a generic reason. Duration is reliably set by the time after_workflow runs.
Duration is calculated when WaxellContext closes. For observe-path agents, the SDK measures elapsed time between __aenter__ and __aexit__. This includes all LLM calls, tool calls, and any processing time.
Default timeout is 300 seconds (5 minutes). If you configure an operations policy without specifying timeout_seconds, it defaults to 300s.
Simulated duration in dry-run requires manual setting. Demo agents set context._duration_override to simulate elapsed time. In production, actual wall-clock time is used automatically.
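
To make the duration gotchas concrete, here is a minimal sketch of an async context manager that measures elapsed time between `__aenter__` and `__aexit__`. It is a stand-in written for illustration, not the Waxell SDK's implementation: `TimedRun` and `demo` are hypothetical names, and the use of `time.monotonic` is an assumption about how such timing is typically done.

```python
import asyncio
import time
from typing import Optional


class TimedRun:
    """Illustrative stand-in for a context that times its body."""

    def __init__(self) -> None:
        self.duration: Optional[float] = None  # None until the block exits
        self._start: float = 0.0

    async def __aenter__(self) -> "TimedRun":
        self._start = time.monotonic()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Everything awaited inside the block counts toward the duration.
        self.duration = time.monotonic() - self._start
        return False  # never suppress exceptions from the body


async def demo() -> TimedRun:
    async with TimedRun() as run:
        assert run.duration is None  # not populated while the body runs
        await asyncio.sleep(0.05)  # stands in for LLM and tool calls
    return run
```

In this sketch a mid-run check would see `duration` as `None`, which is the situation in which the handler is described as returning ALLOW; the value exists only once the block has exited.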

Next Steps

Policy & Governance -- How policy enforcement works
LLM Policy -- Model allowlists and token limits
Quality Policy -- Output quality validation
Policy Categories & Templates -- All 26 categories
