waxell.yaml Reference

The waxell.yaml file is the declarative specification for a Waxell agent. It defines what an agent is, what it can do, and what governance constraints it operates under. The runtime reads this file at deploy time (wax push) and uses it to instantiate the agent, register tools, wire workflows, and enforce guards.

If you don't already have the wax CLI, install it with pip install waxell (or pipx install waxell for an isolated install). See the CLI reference for Windows PATH issues and troubleshooting.

Minimal example

version: 1

agents:
  - name: my_agent
    version: 1.0.0
    description: A simple agent.
    system_prompt: |
      You are a helpful assistant.
    tools: [my_tool]
    workflows: [my_workflow]

tools:
  - name: my_tool
    description: Does something useful.
    inputs:
      query:
        type: string
    outputs:
      result:
        type: string

workflows:
  - name: my_workflow
    type: graph
    nodes: [step_one]
    edges: []

Full example

version: 1

defaults:
  framework: autogen
  owner: team@example.com
  tags: [production]

agents:
  - name: contract_negotiator
    version: 1.0.3
    description: Negotiates contract terms with vendors.
    display_name: Contract Negotiator
    tags: [contracts, procurement]
    model: openai:gpt-4o-mini
    system_prompt: |
      You are a contract negotiation assistant.
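    # When both system_prompt and system_prompt_ref are set, system_prompt_ref takes precedence at runtime.
    system_prompt_ref: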
      name: contract_negotiator_system
      label: production
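    tools: [calculator, clause_lookup]    # clause_lookup is not defined in this file; it must be registered via wax push-tool
    workflows: [escalation, negotiation]  # negotiation is not defined in this file; it must be registered via wax push-workflow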
    domains: [crm, contracts]
    capabilities: [waxell.hooks]
    timeout_seconds: 90
    max_correction_retries: 2

    llm_config:
      default_model: openai:gpt-4o-mini
      task_routes:
        email:
          models: [openai:gpt-4o-mini, openai:gpt-4o]
          temperature: 0.7
          max_tokens: 2000
      inherit_defaults: true

    guards:
      - type: cost
        limit: $2.00
        scope: tree
        action: abort
      - type: spawn_depth
        limit: 3
        action: abort
      - type: turn_count
        limit: 30
        scope: run
        action: warn

    memory:
      conversation:
        scope: session_id
        type: list
        tier: episodic
        ttl: 30d
        max_items: 100
      scratchpad:
        tier: working
        scope: [run_id]
        auto_capture: true
        ttl: 1h

    signals:
      - name: proposal_received
        source_type: webhook
        schema:
          vendor_id: string
          proposal_id: string
        triggers_workflow: negotiation
        idempotency_key: $.proposal_id

    policies:
      - block_pii_leakage
      - name: require_human_review
        conditions:
          - output_contains_dollar_amount_above(50000)

    mcps:
      - server: github
        tool_allowlist: [search_repos, get_file]
      - server: filesystem
        transport: stdio
        command: npx
        args: [-y, "@anthropic/mcp-filesystem"]

    prompts:
      - name: negotiation_guide
        version_label: production

    metacog:
      triage: true
      plan: true
      reflect: true
      inject_learnings: 5

    hallucination_policy:
      action: retry_then_fail
      max_retries: 1

    telemetry:
      alerts:
        - p95_latency_ms > 5000
        - error_rate > 0.05
      sample_rate: 1.0

tools:
  - name: calculator
    version: 1.0.1
    description: Basic arithmetic.
    tags: [math, utility]
    inputs:
      a:
        type: number
      b:
        type: number
      op:
        type: string
        enum: [add, sub, mul, div]
    outputs:
      result:
        type: number

workflows:
  - name: escalation
    version: 1.0.1
    type: graph
    description: Human-in-the-loop escalation.
    nodes: [triage, draft_response, human_review, send]
    edges:
      - {from: triage, to: draft_response}
      - {from: draft_response, to: human_review}
      - {from: human_review, to: send}

Top-level fields

Field	Type	Required	Default	Description
`version`	`integer`	Yes	-	Spec format version. Currently `1`.
`defaults`	`object`	No	`{}`	Project-wide defaults inherited by all agents/tools in this file.
`agents`	`list[Agent]`	Yes	-	List of agent definitions.
`tools`	`list[Tool]`	No	`[]`	Tool definitions referenced by agents.
`workflows`	`list[Workflow]`	No	`[]`	Workflow definitions referenced by agents.

defaults

Field	Type	Default	Description
`framework`	`string`	-	Default agent framework (e.g., `autogen`).
`owner`	`string`	-	Default owner email.
`tags`	`list[string]`	`[]`	Default tags applied to all agents.

agents

Each entry in the agents list defines one agent. An agent is the top-level unit of deployment.

Core fields

Field	Type	Required	Default	Description
`name`	`string`	Yes	-	Unique agent identifier. Used in URLs, CLI, and traces. Must be a valid Python identifier (letters, digits, and underscores; cannot start with a digit).
`version`	`string`	No	-	Semantic version string (e.g., `1.0.3`).
`description`	`string`	No	`""`	Human-readable description. Shown in the controlplane UI.
`display_name`	`string`	No	-	UI-friendly name (can contain spaces, punctuation).
`tags`	`list[string]`	No	`[]`	Classification tags. Merged with `defaults.tags`.
`model`	`string`	No	-	Default LLM model in `provider:model` format (e.g., `openai:gpt-4o-mini`). Overridden by `llm_config` task routes.
`timeout_seconds`	`integer`	No	-	Maximum wall-clock time, in seconds, for a single agent run.
`max_correction_retries`	`integer`	No	`2`	Cap on the router's retry-with-feedback loop. Set to `0` for expensive single-call agents.
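
As a rough illustration of how the core fields combine with `defaults`, here is a minimal sketch. The agent name `billing_agent` and its values are hypothetical; the only behavior assumed is what the tables above state (agent `tags` are merged with `defaults.tags`, and `max_correction_retries: 0` caps the retry loop at zero).

```yaml
version: 1

defaults:
  owner: team@example.com
  tags: [production]

agents:
  - name: billing_agent          # hypothetical agent name (valid Python identifier)
    version: 0.1.0
    description: Answers billing questions.
    display_name: Billing Agent  # may contain spaces and punctuation
    tags: [billing]              # merged with defaults.tags
    model: openai:gpt-4o-mini    # provider:model format
    timeout_seconds: 60
    max_correction_retries: 0    # no retry-with-feedback attempts
```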

Prompt configuration

Field	Type	Required	Default	Description
`system_prompt`	`string`	No	-	Inline system prompt. Set either this or `system_prompt_ref`, not both; if both are set, `system_prompt_ref` takes precedence at runtime.
`system_prompt_ref`	`object`	No	-	Reference to a prompt in the prompt registry.
`system_prompt_ref.name`	`string`	Yes	-	Prompt name in the registry.
`system_prompt_ref.label`	`string`	No	`"production"`	Version label. Allows promoting prompts without redeploying.
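
A minimal sketch of an agent that relies only on the prompt registry, so there is no ambiguity about which prompt the runtime uses. The prompt name `my_agent_system` and the `staging` label are hypothetical.

```yaml
agents:
  - name: my_agent
    # No inline system_prompt is set; the prompt comes from the registry.
    system_prompt_ref:
      name: my_agent_system   # hypothetical prompt name in the registry
      label: staging          # hypothetical label; defaults to "production" when omitted
```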

Bindings

Field	Type	Required	Default	Description
`tools`	`list[string]`	No	`[]`	Tool names this agent can use. Must match entries in the top-level `tools` section or be registered via `wax push-tool`.
`workflows`	`list[string]`	No	`[]`	Workflow names this agent can execute. Must match entries in the top-level `workflows` section or be registered via `wax push-workflow`.
`domains`	`list[string]`	No	`[]`	External domain names (e.g., `crm`, `email`). Resolved against the tenant's registered domains at runtime.
`capabilities`	`list[string]`	No	`[]`	Named capabilities this agent exposes. Each maps to a tool, workflow, or domain action.

Relationships

Field	Type	Required	Default	Description
`signals`	`list[Signal]`	No	`[]`	Inbound event definitions. See signals section.
`memory`	`map[string, Memory]`	No	`{}`	Named memory slots. See memory section.
`policies`	`list`	No	`[]`	Governance policy references. Can be bare strings (policy names) or objects with `name` and `conditions`.
`mcps`	`list[MCPBinding]`	No	`[]`	MCP server bindings. See mcps section.
`prompts`	`list[PromptBinding]`	No	`[]`	Prompt registry bindings. See prompts section.
`guards`	`list[Guard]`	No	`[]`	Declarative governance guards. See guards section.
`llm_config`	`object`	No	-	LLM routing configuration. See llm_config section.
`metacog`	`object`	No	-	Metacognition configuration. See metacog section.
`hallucination_policy`	`string` or `object`	No	-	Hallucinated-tool-call recovery policy. See hallucination_policy section.
`telemetry`	`object`	No	-	Telemetry and alerting configuration. See telemetry section.

tools

Each entry in the top-level tools list defines a tool the agent can call.

Field	Type	Required	Default	Description
`name`	`string`	Yes	-	Unique tool identifier. Referenced by agents in their `tools` list.
`version`	`string`	No	-	Semantic version string.
`description`	`string`	No	`""`	What the tool does. Shown to the LLM as tool documentation.
`tags`	`list[string]`	No	`[]`	Classification tags.
`async_only`	`boolean`	No	`false`	If true, tool must be called asynchronously.
`inputs`	`map`	No	`{}`	Input parameter definitions. See tool inputs.
`outputs`	`map`	No	`{}`	Output field definitions.
`schema`	`object`	No	-	Unified JSON Schema for inputs/outputs. Alternative to separate `inputs`/`outputs`.

Tool inputs

Each key in inputs is a parameter name. The value is an object:

Field	Type	Required	Default	Description
`type`	`string`	No	`"string"`	JSON Schema type: `string`, `number`, `integer`, `boolean`, `array`, `object`.
`description`	`string`	No	-	Parameter description shown to the LLM.
`enum`	`list`	No	-	Allowed values.
`required`	`boolean`	No	`false`	Whether the parameter is required.
`default`	`any`	No	-	Default value if not provided.
`items`	`object`	No	-	For `array` type: schema of array elements.
`format`	`string`	No	-	JSON Schema format hint (e.g., `email`, `uri`).
`pattern`	`string`	No	-	Regex pattern for validation.
`minLength`	`integer`	No	-	Minimum string length.
`maxLength`	`integer`	No	-	Maximum string length.
`minimum`	`number`	No	-	Minimum numeric value.
`maximum`	`number`	No	-	Maximum numeric value.

Example:

tools:
  - name: search
    description: Search a knowledge base.
    inputs:
      query:
        type: string
        description: Search query
        required: true
      limit:
        type: integer
        description: Max results
        default: 10
        minimum: 1
        maximum: 100
    outputs:
      results:
        type: array
      total_count:
        type: integer

workflows

Each entry defines a workflow the agent can execute.

Field	Type	Required	Default	Description
`name`	`string`	Yes	-	Unique workflow identifier.
`version`	`string`	No	-	Semantic version string.
`type`	`string`	No	-	Workflow type. Common values: `graph`, `linear`, `router`.
`description`	`string`	No	`""`	What the workflow does.
`nodes`	`list[string]`	No	`[]`	Node names in the workflow graph.
`edges`	`list[Edge]`	No	`[]`	Edges connecting nodes.
`steps`	`list[Step]`	No	`[]`	Alternative to nodes/edges for linear workflows.
`routes`	`list[Route]`	No	`[]`	Routing rules for decision-based workflows.
`metadata`	`object`	No	`{}`	Arbitrary key-value metadata.
`inputs_schema`	`object`	No	`{}`	JSON Schema for workflow inputs.

Edge format

edges:
  - {from: node_a, to: node_b}
  - {from: node_b, to: node_c, condition: "result.status == 'ok'"}

Step format (linear workflows)

steps:
  - id: step_0
    type: tool        # tool | workflow | decision
    name: my_tool
    inputs:
      query: "$.signal.query"
    produces: sym_0

Field	Type	Required	Description
`id`	`string`	Yes	Unique step identifier.
`type`	`string`	Yes	Step type: `tool`, `workflow`, or `decision`.
`name`	`string`	Yes	Name of the tool/workflow/decision to invoke.
`inputs`	`object`	No	Input mappings. Can reference prior step outputs via `$.` syntax.
`produces`	`string`	No	Symbol name for this step's output. Referenced by later steps.
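
The step fields above can be chained into a two-step linear workflow, sketched below. The workflow name and the `summarize` tool are hypothetical, and the exact path used to read a prior step's output (`$.sym_0.results`) is an assumption: this reference only states that later steps refer to the `produces` symbol via `$.` syntax.

```yaml
workflows:
  - name: search_and_summarize       # hypothetical workflow name
    type: linear
    description: Search, then summarize the hits.
    steps:
      - id: step_0
        type: tool
        name: search
        inputs:
          query: "$.signal.query"
        produces: sym_0
      - id: step_1
        type: tool
        name: summarize              # hypothetical tool
        inputs:
          # Assumption: a prior step's output is addressed by its symbol name.
          documents: "$.sym_0.results"
        produces: sym_1
```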

signals

Signals define inbound events that trigger agent execution.

Field	Type	Required	Default	Description
`name`	`string`	Yes	-	Signal identifier. Used in webhook URLs: `/api/v1/signals/<name>/`.
`source_type`	`string`	No	`"webhook"`	Source type: `webhook`, `queue`, `schedule`, `api`, `internal`.
`description`	`string`	No	-	Human-readable description.
`schema`	`object`	No	`{}`	Expected payload shape. Keys are field names, values are types.
`triggers_workflow`	`string`	No	-	Workflow to run when this signal fires.
`idempotency_key`	`string`	No	-	JSONPath expression for deduplication (e.g., `$.proposal_id`).
`ttl_seconds`	`integer`	No	-	Time-to-live for the signal in the queue.
`priority`	`integer`	No	`0`	Processing priority (higher = sooner).
`tags`	`list[string]`	No	`[]`	Classification tags.

Example:

signals:
  - name: document_uploaded
    source_type: webhook
    description: Triggered when a user uploads a document
    schema:
      document_id: string
      filename: string
      size_bytes: number
    triggers_workflow: process_document
    idempotency_key: $.document_id
    ttl_seconds: 3600
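
For a non-webhook source, a sketch using only the fields documented above might look like the following. The signal name, payload fields, and tag are hypothetical; any extra configuration a particular `source_type` may need is not covered here.

```yaml
signals:
  - name: refund_requested          # hypothetical signal name
    source_type: api
    description: Raised when a customer requests a refund.
    schema:
      order_id: string
      amount: number
    triggers_workflow: escalation
    idempotency_key: $.order_id     # deduplicate on the order ID
    priority: 10                    # higher values are processed sooner
    tags: [billing]
```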

guards

Guards are declarative governance constraints enforced at runtime. They limit cost, depth, fanout, and other execution dimensions.

Field	Type	Required	Default	Description
`type`	`string`	Yes	-	Guard type. See table below.
`limit`	`string` or `number`	Yes	-	Threshold. Format depends on `type`.
`scope`	`string`	No	`"run"`	Scope: `run` (single execution) or `tree` (including spawned children).
`action`	`string`	No	`"abort"`	What happens when the limit is hit: `abort`, `warn`, `require_approval`.
`tool`	`string`	Conditional	-	Required when `type` is `tool_call_count`. Specifies which tool to count.

Guard types

Type	Limit format	Description
`cost`	`"$2.00"` or `2.0`	Total LLM cost in USD.
`token_count`	`integer`	Total tokens consumed.
`turn_count`	`integer`	Number of LLM turns.
`spawn_depth`	`integer`	Max depth of spawned child agents.
`spawn_fanout`	`integer`	Max number of child agents spawned by a single parent.
`spawn_concurrent`	`integer`	Max simultaneously running child agents.
`tool_call_count`	`integer`	Max calls to a specific tool (requires `tool` field).
`wall_time`	`"30s"`, `"5m"`, `"1h"`	Max wall-clock time for the execution.

Example:

guards:
  - type: cost
    limit: $2.00
    scope: tree
    action: abort
  - type: spawn_depth
    limit: 3
    action: abort
  - type: turn_count
    limit: 30
    scope: run
    action: warn
  - type: tool_call_count
    tool: propose_node
    limit: 15
    scope: run
    action: warn
  - type: wall_time
    limit: 5m
    action: abort
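
The remaining options can be sketched the same way. This fragment uses the numeric `cost` format, the `require_approval` action, and the default scope; the limits are illustrative, and how approval is requested and granted is not described in this reference.

```yaml
guards:
  - type: cost
    limit: 10.0                 # numeric form of "$10.00"
    scope: tree                 # counts spawned children too
    action: require_approval
  - type: token_count
    limit: 200000
    action: warn                # scope omitted, so it defaults to run
```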

memory

Named memory slots scoped to different dimensions and persistence tiers.

Field	Type	Required	Default	Description
`tier`	`string`	No	`"episodic"`	Memory tier: `working`, `session`, `episodic`, `semantic`.
`scope`	`string` or `list[string]`	No	`"tenant_id"`	Isolation boundary. Single dimension or list of dimensions. `tenant_id` is always enforced automatically.
`type`	`string`	No	`"dict"`	Data structure: `dict`, `list`, `value`.
`description`	`string`	No	`""`	Human-readable description.
`ttl`	`string`	No	-	Time-to-live: `"1h"`, `"30d"`, `"7d"`, etc.
`max_items`	`integer`	No	-	For `list` type: max items before oldest are evicted.
`custom_scope_key`	`string`	No	-	When scope is `"custom"`, the input field to use as the key.
`auto_capture`	`boolean`	No	`true` (working)	Working tier: auto-store every tool result.
`reference_syntax`	`string`	No	`"$ref"`	Working tier: prefix for LLM `$ref:name.N` references.
`max_size_mb`	`integer`	No	`20`	Working tier: soft cap on scratchpad size per run.
`searchable`	`boolean`	No	`true` (semantic)	Semantic tier: enable vector embedding and retrieval.
`embedding_model`	`string`	No	-	Semantic tier: override default embedding model.
`schema`	`string`	No	-	Fully-qualified Python class for typed validation (e.g., `myapp.schemas.NoteSchema`).
`schema_version`	`integer`	No	`1`	Schema version for migration support.

Memory tiers

Tier	Backend	Lifecycle	Use case
`working`	Redis	Per-run, cleared on completion	Scratchpad for in-flight tool results.
`session`	Postgres	24h default TTL	Conversation/plan state within a user session.
`episodic`	Postgres	TTL-based	Cross-run state: conversation history, cached results.
`semantic`	pgvector	Indefinite	Long-term facts with semantic search.

Scope dimensions

Dimension	Source	Description
`tenant_id`	Always	Outermost boundary (automatic).
`agent`	Agent name	Per-agent within tenant.
`agent_version`	Agent version	Per-agent version.
`user_id`	Sub-user identity	Per end-user.
`user_group`	Sub-user identity	Per user group.
`user_email`	Sub-user identity	Per user email.
`session_id`	Conversation session	Per conversation session.
`workflow`	Workflow name	Per workflow.
`run_id`	Execution run	Per workflow execution (ephemeral).
`channel_id`	Slack/chat	Per chat channel.
`thread_ts`	Slack	Per thread.

Example:

memory:
  # Working memory — auto-captures tool results
  scratchpad:
    tier: working
    scope: [run_id]
    auto_capture: true
    ttl: 1h

  # Episodic — per-user conversation history
  conversation:
    tier: episodic
    scope: user_id
    type: list
    ttl: 30d
    max_items: 100

  # Semantic — long-term facts with vector search
  client_knowledge:
    tier: semantic
    scope: [user_id, portfolio_id]
    searchable: true
    description: Facts about this client relationship

llm_config

Per-agent LLM routing configuration. Controls which models handle which task types.

Field	Type	Required	Default	Description
`default_model`	`string`	No	-	Default model when no task-specific route matches. Format: `provider:model`.
`task_routes`	`map[string, TaskRoute]`	No	`{}`	Per-task routing rules.
`inherit_defaults`	`boolean`	No	`true`	Whether unmatched tasks fall through to tenant/global defaults.
`custom_task_types`	`list[string]`	No	`[]`	Additional task type names beyond the built-in set.

TaskRoute

Field	Type	Required	Default	Description
`models`	`list[string]`	Yes	-	Ordered model list: primary + fallbacks.
`temperature`	`number`	No	-	Temperature override for this task.
`max_tokens`	`integer`	No	-	Token limit override for this task.

Example:

llm_config:
  default_model: openai:gpt-4o-mini
  task_routes:
    email:
      models: [openai:gpt-4o-mini, openai:gpt-4o]
      temperature: 0.7
      max_tokens: 2000
    classify:
      models: [openai:gpt-4o-mini]
      temperature: 0.1
    research:
      models: [anthropic:claude-sonnet-4-5-20250929]
      temperature: 0.3
      max_tokens: 4000
  inherit_defaults: true
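
A sketch of routing a custom task type with an ordered fallback list. The task type `contract_review` is hypothetical, and using a name from `custom_task_types` as a `task_routes` key is an assumption based on the field descriptions above.

```yaml
llm_config:
  default_model: openai:gpt-4o-mini
  custom_task_types: [contract_review]   # hypothetical task type
  task_routes:
    contract_review:
      models: [openai:gpt-4o, openai:gpt-4o-mini]   # primary first, then fallback
      temperature: 0.2
  inherit_defaults: false   # unmatched tasks do not fall through to tenant/global defaults
```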

mcps

MCP (Model Context Protocol) server bindings. Each binding declares that the agent uses a registered MCP server.

Field	Type	Required	Default	Description
`server`	`string`	Yes	-	MCP server slug. Must be registered in the tenant's MCPServer registry.
`tool_allowlist`	`list[string]`	No	`[]`	Only expose these tools from the server. Empty = all tools.
`tool_blocklist`	`list[string]`	No	`[]`	Hide these tools from the agent.
`tool_descriptions`	`map[string, string]`	No	`{}`	Agent-specific tool descriptions (override the server's defaults).
`enabled`	`boolean`	No	`true`	Whether this binding is active.
`transport`	`string`	No	`"http"`	Transport type: `http` or `stdio`.
`command`	`string`	Conditional	-	Required for `stdio` transport. Command to launch the MCP server process.
`args`	`list[string]`	No	`[]`	Arguments for the stdio command.
`env`	`map[string, string]`	No	`{}`	Environment variables for the stdio process.
`sandbox`	`object`	No	defaults	Sandbox configuration for stdio processes.

Sandbox (stdio only)

Field	Type	Default	Description
`fs_allow`	`list[string]`	`[]`	Read-only filesystem paths the process can access.
`net`	`boolean`	`false`	Allow network egress.
`cpu_limit`	`number`	`0.5`	Soft CPU cap in CPU-seconds per second (`0.5` is half of one core).
`mem_limit_mb`	`integer`	`256`	RSS memory cap in MB.
`max_duration_seconds`	`integer`	`300`	Hard wall-time for the session.

Example:

mcps:
  # HTTP transport — server registered in tenant MCPServer registry
  - server: github
    tool_allowlist: [search_repos, get_file, list_issues]

  # stdio transport — launches a local process
  - server: filesystem
    transport: stdio
    command: npx
    args: [-y, "@anthropic/mcp-filesystem"]
    env:
      HOME: /tmp/sandbox
    sandbox:
      fs_allow: [/data/documents]
      net: false
      mem_limit_mb: 128
      max_duration_seconds: 60
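
The fields not shown above (`tool_blocklist`, `tool_descriptions`, `enabled`) could be used as in this sketch. The blocked tool name and the description text are hypothetical.

```yaml
mcps:
  - server: github
    tool_blocklist: [delete_repo]   # hypothetical tool name to hide from the agent
    tool_descriptions:
      search_repos: "Search only the contract-template repositories."
  - server: filesystem
    enabled: false                  # binding stays in the file but is inactive
    transport: stdio
    command: npx
    args: [-y, "@anthropic/mcp-filesystem"]
```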

prompts

Prompt registry bindings. Each binding links the agent to a versioned prompt.

Field	Type	Required	Default	Description
`name`	`string`	Yes	-	Prompt name in the registry.
`version_label`	`string`	No	`"production"`	Version label pin. Allows promoting prompts without redeploying.
`order`	`integer`	No	`0`	Injection order when multiple prompts are bound.
`enabled`	`boolean`	No	`true`	Whether this binding is active.

Example:

prompts:
  - name: sales_playbook
    version_label: production
  - name: compliance_rules
    version_label: production
    order: 1

metacog

Per-agent metacognition configuration. Enables conditional planning (triage), multi-step planning, and reflexion (learning from outcomes).

Field	Type	Required	Default	Description
`triage`	`boolean`	No	`true`	Run a triage step before each dispatch to classify complexity.
`plan`	`boolean`	No	`true`	Generate a multi-step plan on hard turns.
`reflect`	`boolean`	No	`true`	Run reflexion at workflow exit to generate learnings.
`inject_learnings`	`integer`	No	`5`	Top-N learnings from prior runs injected into prompts. Set to `0` to disable.
`reflect_on_trivial`	`boolean`	No	`false`	Reflect on trivial turns too (not just hard turns or failures).
`triage_prompt`	`string`	No	built-in	Override the triage prompt template.
`plan_prompt`	`string`	No	built-in	Override the plan prompt template.
`reflect_prompt`	`string`	No	built-in	Override the reflect prompt template.
`triage_max_tokens`	`integer`	No	`200`	Token cap for triage calls.
`plan_max_tokens`	`integer`	No	`800`	Token cap for plan calls.
`reflect_max_tokens`	`integer`	No	`300`	Token cap for reflect calls.

Example:

metacog:
  triage: true
  plan: true
  reflect: true
  inject_learnings: 5
  reflect_on_trivial: false
  triage_max_tokens: 200
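
Because every metacog step defaults to on, an agent that should skip these extra calls has to turn them off explicitly. A minimal sketch:

```yaml
metacog:
  triage: false
  plan: false
  reflect: false
  inject_learnings: 0   # 0 disables injection of prior learnings
```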

hallucination_policy

Controls how the runtime recovers when an LLM hallucinates a tool call (calls a tool that doesn't exist).

String shortcut: Use a bare action name for the common case.

hallucination_policy: retry_then_fail

Object form: Full control.

Field	Type	Default	Description
`action`	`string`	`"retry_then_fail"`	Recovery action. See table below.
`max_retries`	`integer`	`1`	Max retries before the action escalates.

Actions

Action	Description
`retry_then_fail`	Retry with feedback, then fail if the model hallucinates again.
`retry_then_ask_user`	Retry with feedback, then escalate to the user.
`fail_immediately`	Fail immediately on the first hallucination.
`warn_only`	Log a warning but continue execution.
`ask_user`	Alias for `retry_then_ask_user`.

Example:

hallucination_policy:
  action: retry_then_fail
  max_retries: 2

telemetry

Agent-level telemetry and alerting configuration.

Field	Type	Required	Default	Description
`alerts`	`list[string]`	No	`[]`	Alert rules as condition expressions.
`sample_rate`	`number`	No	`1.0`	Trace sampling rate (0.0 to 1.0).

Example:

telemetry:
  alerts:
    - p95_latency_ms > 5000
    - error_rate > 0.05
    - cost_per_run > 0.50
  sample_rate: 1.0

policies

Agent-level governance policy references. Can be bare strings (referencing policies by name) or objects with conditions.

String form:

policies:
  - block_pii_leakage
  - require_human_review

Object form:

policies:
  - name: require_human_review
    conditions:
      - output_contains_dollar_amount_above(50000)
      - vendor_tier == 'tier_1'

Policies are resolved against the tenant's policy registry at runtime. Policy categories include: rate-limit, budget, safety, kill, audit, operations, quality, control, llm, scheduling.
