waxell.yaml Reference
The waxell.yaml file is the declarative specification for a Waxell agent. It defines what an agent is, what it can do, and what governance constraints it operates under. The runtime reads this file at deploy time (wax push) and uses it to instantiate the agent, register tools, wire workflows, and enforce guards.
Minimal example
version: 1
agents:
  - name: my_agent
    version: 1.0.0
    description: A simple agent.
    system_prompt: |
      You are a helpful assistant.
    tools: [my_tool]
    workflows: [my_workflow]
tools:
  - name: my_tool
    description: Does something useful.
    inputs:
      query:
        type: string
    outputs:
      result:
        type: string
workflows:
  - name: my_workflow
    type: graph
    nodes: [step_one]
    edges: []
Full example
version: 1
defaults:
  framework: autogen
  owner: team@example.com
  tags: [production]
agents:
  - name: contract_negotiator
    version: 1.0.3
    description: Negotiates contract terms with vendors.
    display_name: Contract Negotiator
    tags: [contracts, procurement]
    model: openai:gpt-4o-mini
    system_prompt: |
      You are a contract negotiation assistant.
    system_prompt_ref:
      name: contract_negotiator_system
      label: production
    tools: [calculator, clause_lookup]
    workflows: [escalation, negotiation]
    domains: [crm, contracts]
    capabilities: [waxell.hooks]
    timeout_seconds: 90
    max_correction_retries: 2
    llm_config:
      default_model: openai:gpt-4o-mini
      task_routes:
        email:
          models: [openai:gpt-4o-mini, openai:gpt-4o]
          temperature: 0.7
          max_tokens: 2000
      inherit_defaults: true
    guards:
      - type: cost
        limit: $2.00
        scope: tree
        action: abort
      - type: spawn_depth
        limit: 3
        action: abort
      - type: turn_count
        limit: 30
        scope: run
        action: warn
    memory:
      conversation:
        scope: session_id
        type: list
        tier: episodic
        ttl: 30d
        max_items: 100
      scratchpad:
        tier: working
        scope: [run_id]
        auto_capture: true
        ttl: 1h
    signals:
      - name: proposal_received
        source_type: webhook
        schema:
          vendor_id: string
          proposal_id: string
        triggers_workflow: negotiation
        idempotency_key: $.proposal_id
    policies:
      - block_pii_leakage
      - name: require_human_review
        conditions:
          - output_contains_dollar_amount_above(50000)
    mcps:
      - server: github
        tool_allowlist: [search_repos, get_file]
      - server: filesystem
        transport: stdio
        command: npx
        args: [-y, "@anthropic/mcp-filesystem"]
    prompts:
      - name: negotiation_guide
        version_label: production
    metacog:
      triage: true
      plan: true
      reflect: true
      inject_learnings: 5
    hallucination_policy:
      action: retry_then_fail
      max_retries: 1
    telemetry:
      alerts:
        - p95_latency_ms > 5000
        - error_rate > 0.05
      sample_rate: 1.0
tools:
  - name: calculator
    version: 1.0.1
    description: Basic arithmetic.
    tags: [math, utility]
    inputs:
      a:
        type: number
      b:
        type: number
      op:
        type: string
        enum: [add, sub, mul, div]
    outputs:
      result:
        type: number
workflows:
  - name: escalation
    version: 1.0.1
    type: graph
    description: Human-in-the-loop escalation.
    nodes: [triage, draft_response, human_review, send]
    edges:
      - {from: triage, to: draft_response}
      - {from: draft_response, to: human_review}
      - {from: human_review, to: send}
Top-level fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| version | integer | Yes | - | Spec format version. Currently 1. |
| defaults | object | No | {} | Project-wide defaults inherited by all agents/tools in this file. |
| agents | list[Agent] | Yes | - | List of agent definitions. |
| tools | list[Tool] | No | [] | Tool definitions referenced by agents. |
| workflows | list[Workflow] | No | [] | Workflow definitions referenced by agents. |
defaults
| Field | Type | Default | Description |
|---|---|---|---|
| framework | string | - | Default agent framework (e.g., autogen). |
| owner | string | - | Default owner email. |
| tags | list[string] | [] | Default tags applied to all agents. |
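As a rough sketch of how these defaults interact with an agent's own fields, the fragment below sets project-wide tags and adds agent-level tags. The agent name billing_agent and its tag value are hypothetical. The merge follows the agent-level tags description ("Merged with defaults.tags"), but the ordering and de-duplication of the merged list are assumptions.

```yaml
version: 1
defaults:
  framework: autogen
  owner: team@example.com
  tags: [production]
agents:
  - name: billing_agent        # hypothetical agent name
    tags: [billing]            # merged with defaults.tags
    # Effective tags would include both production and billing
    # (exact ordering and de-duplication are assumptions).
```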
agents
Each entry in the agents list defines one agent. An agent is the top-level unit of deployment.
Core fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | - | Unique agent identifier. Used in URLs, CLI, and traces. Must be a valid Python identifier (letters, digits, and underscores; cannot start with a digit). |
| version | string | No | - | Semantic version string (e.g., 1.0.3). |
| description | string | No | "" | Human-readable description. Shown in the controlplane UI. |
| display_name | string | No | - | UI-friendly name (can contain spaces, punctuation). |
| tags | list[string] | No | [] | Classification tags. Merged with defaults.tags. |
| model | string | No | - | Default LLM model in provider:model format (e.g., openai:gpt-4o-mini). Overridden by llm_config task routes. |
| timeout_seconds | integer | No | - | Maximum wall-time for a single agent run. |
| max_correction_retries | integer | No | 2 | Cap on the router's retry-with-feedback loop. Set to 0 for expensive single-call agents. |
Prompt configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| system_prompt | string | No | - | Inline system prompt. Normally set either this or system_prompt_ref, not both; if both are set, system_prompt_ref takes precedence at runtime. |
| system_prompt_ref | object | No | - | Reference to a prompt in the prompt registry. |
| system_prompt_ref.name | string | Yes | - | Prompt name in the registry. |
| system_prompt_ref.label | string | No | "production" | Version label. Allows promoting prompts without redeploying. |
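A minimal sketch of an agent that relies only on the prompt registry, assuming a registry prompt named support_agent_system and a staging label exist (both hypothetical). Omitting label would fall back to the documented default of "production".

```yaml
agents:
  - name: support_agent                 # hypothetical agent name
    system_prompt_ref:
      name: support_agent_system        # hypothetical registry prompt
      label: staging                    # hypothetical label; default is "production"
```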
Bindings
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| tools | list[string] | No | [] | Tool names this agent can use. Must match entries in the top-level tools section or be registered via wax push-tool. |
| workflows | list[string] | No | [] | Workflow names this agent can execute. Must match entries in the top-level workflows section or be registered via wax push-workflow. |
| domains | list[string] | No | [] | External domain names (e.g., crm, email). Resolved against the tenant's registered domains at runtime. |
| capabilities | list[string] | No | [] | Named capabilities this agent exposes. Each maps to a tool, workflow, or domain action. |
Relationships
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| signals | list[Signal] | No | [] | Inbound event definitions. See signals section. |
| memory | map[string, Memory] | No | {} | Named memory slots. See memory section. |
| policies | list | No | [] | Governance policy references. Can be bare strings (policy names) or objects with name and conditions. |
| mcps | list[MCPBinding] | No | [] | MCP server bindings. See mcps section. |
| prompts | list[PromptBinding] | No | [] | Prompt registry bindings. See prompts section. |
| guards | list[Guard] | No | [] | Declarative governance guards. See guards section. |
| llm_config | object | No | - | LLM routing configuration. See llm_config section. |
| metacog | object | No | - | Metacognition configuration. See metacog section. |
| hallucination_policy | string or object | No | - | Hallucinated-tool-call recovery policy. See hallucination_policy section. |
| telemetry | object | No | - | Telemetry and alerting configuration. See telemetry section. |
tools
Each entry in the top-level tools list defines a tool the agent can call.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | - | Unique tool identifier. Referenced by agents in their tools list. |
| version | string | No | - | Semantic version string. |
| description | string | No | "" | What the tool does. Shown to the LLM as tool documentation. |
| tags | list[string] | No | [] | Classification tags. |
| async_only | boolean | No | false | If true, tool must be called asynchronously. |
| inputs | map | No | {} | Input parameter definitions. See tool inputs. |
| outputs | map | No | {} | Output field definitions. |
| schema | object | No | - | Unified JSON Schema for inputs/outputs. Alternative to separate inputs/outputs. |
Tool inputs
Each key in inputs is a parameter name. The value is an object:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| type | string | No | "string" | JSON Schema type: string, number, integer, boolean, array, object. |
| description | string | No | - | Parameter description shown to the LLM. |
| enum | list | No | - | Allowed values. |
| required | boolean | No | false | Whether the parameter is required. |
| default | any | No | - | Default value if not provided. |
| items | object | No | - | For array type: schema of array elements. |
| format | string | No | - | JSON Schema format hint (e.g., email, uri). |
| pattern | string | No | - | Regex pattern for validation. |
| minLength | integer | No | - | Minimum string length. |
| maxLength | integer | No | - | Maximum string length. |
| minimum | number | No | - | Minimum numeric value. |
| maximum | number | No | - | Maximum numeric value. |
Example:
tools:
  - name: search
    description: Search a knowledge base.
    inputs:
      query:
        type: string
        description: Search query
        required: true
      limit:
        type: integer
        description: Max results
        default: 10
        minimum: 1
        maximum: 100
    outputs:
      results:
        type: array
      total_count:
        type: integer
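The example above does not exercise the array and string-validation fields. The fragment below is a sketch that does, using a hypothetical send_invite tool. The field names come from the table; the tool, parameter names, and regex are illustrative only.

```yaml
tools:
  - name: send_invite                  # hypothetical tool
    description: Invite users to a workspace.
    inputs:
      emails:
        type: array
        description: Recipient addresses
        required: true
        items:                         # schema of each array element
          type: string
          format: email
      workspace_slug:
        type: string
        pattern: "^[a-z0-9-]+$"        # illustrative regex
        minLength: 3
        maxLength: 40
    outputs:
      invited_count:
        type: integer
```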
workflows
Each entry defines a workflow the agent can execute.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | - | Unique workflow identifier. |
| version | string | No | - | Semantic version string. |
| type | string | No | - | Workflow type. Common values: graph, linear, router. |
| description | string | No | "" | What the workflow does. |
| nodes | list[string] | No | [] | Node names in the workflow graph. |
| edges | list[Edge] | No | [] | Edges connecting nodes. |
| steps | list[Step] | No | [] | Alternative to nodes/edges for linear workflows. |
| routes | list[Route] | No | [] | Routing rules for decision-based workflows. |
| metadata | object | No | {} | Arbitrary key-value metadata. |
| inputs_schema | object | No | {} | JSON Schema for workflow inputs. |
Edge format
edges:
  - {from: node_a, to: node_b}
  - {from: node_b, to: node_c, condition: "result.status == 'ok'"}
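Putting nodes and conditional edges together, a branching graph workflow might look like the sketch below. The workflow and node names are hypothetical, and the second condition simply negates the expression shown above; whether the runtime accepts != in conditions is an assumption.

```yaml
workflows:
  - name: review_flow                  # hypothetical workflow
    type: graph
    nodes: [validate, publish, escalate]
    edges:
      - {from: validate, to: publish, condition: "result.status == 'ok'"}
      # Assumed negated form of the condition syntax shown above
      - {from: validate, to: escalate, condition: "result.status != 'ok'"}
```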
Step format (linear workflows)
steps:
  - id: step_0
    type: tool # tool | workflow | decision
    name: my_tool
    inputs:
      query: "$.signal.query"
    produces: sym_0
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique step identifier. |
| type | string | Yes | Step type: tool, workflow, or decision. |
| name | string | Yes | Name of the tool/workflow/decision to invoke. |
| inputs | object | No | Input mappings. Can reference prior step outputs via $. syntax. |
| produces | string | No | Symbol name for this step's output. Referenced by later steps. |
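For context, a complete linear workflow built from steps might look like the following sketch. The workflow name, the summarize tool, and the symbol names are hypothetical, and the $.sym_0.results path used to read the first step's output is an assumed form of the $. reference syntax, not a confirmed one.

```yaml
workflows:
  - name: search_and_summarize         # hypothetical workflow
    type: linear
    steps:
      - id: step_0
        type: tool
        name: search                   # tool from the tool inputs example
        inputs:
          query: "$.signal.query"
        produces: sym_0
      - id: step_1
        type: tool
        name: summarize                # hypothetical tool
        inputs:
          documents: "$.sym_0.results" # assumed path to step_0's output
        produces: sym_1
```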
signals
Signals define inbound events that trigger agent execution.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | - | Signal identifier. Used in webhook URLs: /api/v1/signals/<name>/. |
| source_type | string | No | "webhook" | Source type: webhook, queue, schedule, api, internal. |
| description | string | No | - | Human-readable description. |
| schema | object | No | {} | Expected payload shape. Keys are field names, values are types. |
| triggers_workflow | string | No | - | Workflow to run when this signal fires. |
| idempotency_key | string | No | - | JSONPath expression for deduplication (e.g., $.proposal_id). |
| ttl_seconds | integer | No | - | Time-to-live for the signal in the queue. |
| priority | integer | No | 0 | Processing priority (higher = sooner). |
| tags | list[string] | No | [] | Classification tags. |
Example:
signals:
  - name: document_uploaded
    source_type: webhook
    description: Triggered when a user uploads a document
    schema:
      document_id: string
      filename: string
      size_bytes: number
    triggers_workflow: process_document
    idempotency_key: $.document_id
    ttl_seconds: 3600
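Signals are not limited to webhooks. The sketch below shows a queue-sourced signal that also sets priority and tags. The signal name, schema fields, and workflow name are hypothetical, and any queue connection settings are outside what this reference documents.

```yaml
signals:
  - name: invoice_overdue              # hypothetical signal
    source_type: queue
    description: Emitted when an invoice passes its due date
    schema:
      invoice_id: string
      days_overdue: number
    triggers_workflow: collections     # hypothetical workflow
    idempotency_key: $.invoice_id
    priority: 10                       # higher = processed sooner
    tags: [billing]
```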
guards
Guards are declarative governance constraints enforced at runtime. They limit cost, depth, fanout, and other execution dimensions.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| type | string | Yes | - | Guard type. See table below. |
| limit | string or number | Yes | - | Threshold. Format depends on type. |
| scope | string | No | "run" | Scope: run (single execution) or tree (including spawned children). |
| action | string | No | "abort" | What happens when the limit is hit: abort, warn, require_approval. |
| tool | string | Conditional | - | Required when type is tool_call_count. Specifies which tool to count. |
Guard types
| Type | Limit format | Description |
|---|---|---|
| cost | "$2.00" or 2.0 | Total LLM cost in USD. |
| token_count | integer | Total tokens consumed. |
| turn_count | integer | Number of LLM turns. |
| spawn_depth | integer | Max depth of spawned child agents. |
| spawn_fanout | integer | Max number of concurrent child agents. |
| spawn_concurrent | integer | Max simultaneously running child agents. |
| tool_call_count | integer | Max calls to a specific tool (requires tool field). |
| wall_time | "30s", "5m", "1h" | Max wall-clock time for the execution. |
Example:
guards:
  - type: cost
    limit: $2.00
    scope: tree
    action: abort
  - type: spawn_depth
    limit: 3
    action: abort
  - type: turn_count
    limit: 30
    scope: run
    action: warn
  - type: tool_call_count
    tool: propose_node
    limit: 15
    scope: run
    action: warn
  - type: wall_time
    limit: 5m
    action: abort
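The example above covers abort and warn. As a sketch of the remaining action and a few other guard types, the fragment below uses require_approval on a token budget, caps concurrently running children, and writes the cost limit in its numeric form. The specific limits are illustrative.

```yaml
guards:
  - type: token_count
    limit: 200000
    scope: tree                        # counts spawned children too
    action: require_approval
  - type: spawn_concurrent
    limit: 4
    action: abort
  - type: cost
    limit: 2.0                         # numeric form of "$2.00"
    action: warn
```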
memory
Named memory slots scoped to different dimensions and persistence tiers.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| tier | string | No | "episodic" | Memory tier: working, session, episodic, semantic. |
| scope | string or list[string] | No | "tenant_id" | Isolation boundary. Single dimension or list of dimensions. tenant_id is always enforced automatically. |
| type | string | No | "dict" | Data structure: dict, list, value. |
| description | string | No | "" | Human-readable description. |
| ttl | string | No | - | Time-to-live: "1h", "30d", "7d", etc. |
| max_items | integer | No | - | For list type: max items before oldest are evicted. |
| custom_scope_key | string | No | - | When scope is "custom", the input field to use as the key. |
| auto_capture | boolean | No | true (working) | Working tier: auto-store every tool result. |
| reference_syntax | string | No | "$ref" | Working tier: prefix for LLM $ref:name.N references. |
| max_size_mb | integer | No | 20 | Working tier: soft cap on scratchpad size per run. |
| searchable | boolean | No | true (semantic) | Semantic tier: enable vector embedding and retrieval. |
| embedding_model | string | No | - | Semantic tier: override default embedding model. |
| schema | string | No | - | Fully-qualified Python class for typed validation (e.g., myapp.schemas.NoteSchema). |
| schema_version | integer | No | 1 | Schema version for migration support. |
Memory tiers
| Tier | Backend | Lifecycle | Use case |
|---|---|---|---|
| working | Redis | Per-run, cleared on completion | Scratchpad for in-flight tool results. |
| session | Postgres | 24h default TTL | Conversation/plan state within a user session. |
| episodic | Postgres | TTL-based | Cross-run state: conversation history, cached results. |
| semantic | pgvector | Indefinite | Long-term facts with semantic search. |
Scope dimensions
| Dimension | Source | Description |
|---|---|---|
| tenant_id | Always | Outermost boundary (automatic). |
| agent | Agent name | Per-agent within tenant. |
| agent_version | Agent version | Per-agent version. |
| user_id | Sub-user identity | Per end-user. |
| user_group | Sub-user identity | Per user group. |
| user_email | Sub-user identity | Per user email. |
| session_id | Conversation session | Per conversation session. |
| workflow | Workflow name | Per workflow. |
| run_id | Execution run | Per workflow execution (ephemeral). |
| channel_id | Slack/chat | Per chat channel. |
| thread_ts | Slack | Per thread. |
Example:
memory:
  # Working memory: auto-captures tool results
  scratchpad:
    tier: working
    scope: [run_id]
    auto_capture: true
    ttl: 1h
  # Episodic: per-user conversation history
  conversation:
    tier: episodic
    scope: user_id
    type: list
    ttl: 30d
    max_items: 100
  # Semantic: long-term facts with vector search
  client_knowledge:
    tier: semantic
    scope: [user_id, portfolio_id]
    searchable: true
    description: Facts about this client relationship
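The table also describes custom scoping and typed validation, which the example above does not show. A sketch, assuming the workflow input carries a portfolio_id field and that myapp.schemas.NoteSchema (the class path used in the table) exists in the deployed code:

```yaml
memory:
  portfolio_notes:                     # hypothetical slot name
    tier: episodic
    scope: custom
    custom_scope_key: portfolio_id     # input field used as the key (assumed to exist)
    type: list
    ttl: 7d
    max_items: 50
    schema: myapp.schemas.NoteSchema
    schema_version: 1
```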
llm_config
Per-agent LLM routing configuration. Controls which models handle which task types.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| default_model | string | No | - | Default model when no task-specific route matches. Format: provider:model. |
| task_routes | map[string, TaskRoute] | No | {} | Per-task routing rules. |
| inherit_defaults | boolean | No | true | Whether unmatched tasks fall through to tenant/global defaults. |
| custom_task_types | list[string] | No | [] | Additional task type names beyond the built-in set. |
TaskRoute
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| models | list[string] | Yes | - | Ordered model list: primary + fallbacks. |
| temperature | number | No | - | Temperature override for this task. |
| max_tokens | integer | No | - | Token limit override for this task. |
Example:
llm_config:
  default_model: openai:gpt-4o-mini
  task_routes:
    email:
      models: [openai:gpt-4o-mini, openai:gpt-4o]
      temperature: 0.7
      max_tokens: 2000
    classify:
      models: [openai:gpt-4o-mini]
      temperature: 0.1
    research:
      models: [anthropic:claude-sonnet-4-5-20250929]
      temperature: 0.3
      max_tokens: 4000
  inherit_defaults: true
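The custom_task_types field is not shown above. The sketch below declares a hypothetical contract_review task type and routes it; it assumes a custom type is routed through task_routes in the same way as the built-in ones.

```yaml
llm_config:
  default_model: openai:gpt-4o-mini
  custom_task_types: [contract_review]   # hypothetical task type
  task_routes:
    contract_review:
      models: [openai:gpt-4o, openai:gpt-4o-mini]   # primary, then fallback
      temperature: 0.2
  inherit_defaults: false                # unmatched tasks do not fall through to tenant/global defaults
```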
mcps
MCP (Model Context Protocol) server bindings. Each binding declares that the agent uses a registered MCP server.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| server | string | Yes | - | MCP server slug. Must be registered in the tenant's MCPServer registry. |
| tool_allowlist | list[string] | No | [] | Only expose these tools from the server. Empty = all tools. |
| tool_blocklist | list[string] | No | [] | Hide these tools from the agent. |
| tool_descriptions | map[string, string] | No | {} | Agent-specific tool descriptions (override the server's defaults). |
| enabled | boolean | No | true | Whether this binding is active. |
| transport | string | No | "http" | Transport type: http or stdio. |
| command | string | Conditional | - | Required for stdio transport. Command to launch the MCP server process. |
| args | list[string] | No | [] | Arguments for the stdio command. |
| env | map[string, string] | No | {} | Environment variables for the stdio process. |
| sandbox | object | No | defaults | Sandbox configuration for stdio processes. |
Sandbox (stdio only)
| Field | Type | Default | Description |
|---|---|---|---|
| fs_allow | list[string] | [] | Read-only filesystem paths the process can access. |
| net | boolean | false | Allow network egress. |
| cpu_limit | number | 0.5 | CPU-seconds/sec soft cap. |
| mem_limit_mb | integer | 256 | RSS memory cap in MB. |
| max_duration_seconds | integer | 300 | Hard wall-time for the session. |
Example:
mcps:
  # HTTP transport: server registered in tenant MCPServer registry
  - server: github
    tool_allowlist: [search_repos, get_file, list_issues]
  # stdio transport: launches a local process
  - server: filesystem
    transport: stdio
    command: npx
    args: [-y, "@anthropic/mcp-filesystem"]
    env:
      HOME: /tmp/sandbox
    sandbox:
      fs_allow: [/data/documents]
      net: false
      mem_limit_mb: 128
      max_duration_seconds: 60
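The remaining binding fields can be sketched as follows. The server slugs and tool names are hypothetical; only the field names come from the table.

```yaml
mcps:
  - server: jira                       # hypothetical server slug
    tool_blocklist: [delete_issue]     # hide a destructive tool
    tool_descriptions:
      search_issues: Search tickets for the current customer only.
  - server: legacy_wiki                # hypothetical server slug
    enabled: false                     # binding kept in the file but inactive
```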
prompts
Prompt registry bindings. Each binding links the agent to a versioned prompt.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | - | Prompt name in the registry. |
| version_label | string | No | "production" | Version label pin. Allows promoting prompts without redeploying. |
| order | integer | No | 0 | Injection order when multiple prompts are bound. |
| enabled | boolean | No | true | Whether this binding is active. |
Example:
prompts:
  - name: sales_playbook
    version_label: production
  - name: compliance_rules
    version_label: production
    order: 1
metacog
Per-agent metacognition configuration. Enables conditional planning (triage), multi-step planning, and reflexion (learning from outcomes).
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| triage | boolean | No | true | Run a triage step before each dispatch to classify complexity. |
| plan | boolean | No | true | Generate a multi-step plan on hard turns. |
| reflect | boolean | No | true | Run reflexion at workflow exit to generate learnings. |
| inject_learnings | integer | No | 5 | Top-N learnings from prior runs injected into prompts. Set to 0 to disable. |
| reflect_on_trivial | boolean | No | false | Reflect on trivial turns too (not just hard turns or failures). |
| triage_prompt | string | No | built-in | Override the triage prompt template. |
| plan_prompt | string | No | built-in | Override the plan prompt template. |
| reflect_prompt | string | No | built-in | Override the reflect prompt template. |
| triage_max_tokens | integer | No | 200 | Token cap for triage calls. |
| plan_max_tokens | integer | No | 800 | Token cap for plan calls. |
| reflect_max_tokens | integer | No | 300 | Token cap for reflect calls. |
Example:
metacog:
  triage: true
  plan: true
  reflect: true
  inject_learnings: 5
  reflect_on_trivial: false
  triage_max_tokens: 200
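For a cheap, single-call agent, metacognition could be switched off entirely. This sketch uses only fields from the table; whether disabling all three stages suits a given agent is a judgment call.

```yaml
metacog:
  triage: false
  plan: false
  reflect: false
  inject_learnings: 0                  # 0 disables learning injection
```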
hallucination_policy
Controls how the runtime recovers when an LLM hallucinates a tool call (calls a tool that doesn't exist).
String shortcut: Use a bare action name for the common case.
hallucination_policy: retry_then_fail
Object form: Full control.
| Field | Type | Default | Description |
|---|---|---|---|
| action | string | "retry_then_fail" | Recovery action. See table below. |
| max_retries | integer | 1 | Max retries before the action escalates. |
Actions
| Action | Description |
|---|---|
| retry_then_fail | Retry with feedback, then fail if the model hallucinates again. |
| retry_then_ask_user | Retry with feedback, then escalate to the user. |
| fail_immediately | Fail immediately on the first hallucination. |
| warn_only | Log a warning but continue execution. |
| ask_user | Alias for retry_then_ask_user. |
Example:
hallucination_policy:
  action: retry_then_fail
  max_retries: 2
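As a further sketch, an interactive agent might prefer to hand control back to the user, while a batch agent might only log. Both fragments use actions from the table; they are alternatives shown as separate YAML documents, not settings to combine in one agent.

```yaml
# Interactive agent: retry twice with feedback, then escalate to the user
hallucination_policy:
  action: retry_then_ask_user
  max_retries: 2
---
# Batch agent: string shortcut, log a warning and continue
hallucination_policy: warn_only
```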
telemetry
Agent-level telemetry and alerting configuration.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| alerts | list[string] | No | [] | Alert rules as condition expressions. |
| sample_rate | number | No | 1.0 | Trace sampling rate (0.0 to 1.0). |
Example:
telemetry:
  alerts:
    - p95_latency_ms > 5000
    - error_rate > 0.05
    - cost_per_run > 0.50
  sample_rate: 1.0
policies
Agent-level governance policy references. Can be bare strings (referencing policies by name) or objects with conditions.
String form:
policies:
  - block_pii_leakage
  - require_human_review
Object form:
policies:
  - name: require_human_review
    conditions:
      - output_contains_dollar_amount_above(50000)
      - vendor_tier == 'tier_1'
Policies are resolved against the tenant's policy registry at runtime. Policy categories include: rate-limit, budget, safety, kill, audit, operations, quality, control, llm, scheduling.
See also
- Runtime Overview -- how the runtime executes agents
- Execution Context -- the ctx object available in tools and workflows
- Workflow Envelope -- checkpoint/resume behavior
- Backends -- pluggable runtime backends