Waxell Glossary: AI Agent Observability & Governance

A reference for engineering teams building and running AI agents in production. Every term below reflects how the field actually uses these concepts and, where applicable, how Waxell implements them.

Foundation
AI Agent

An AI agent is a software system that uses a large language model (LLM) as its reasoning core, enabling it to perceive inputs, make decisions, take actions through tools, and pursue goals across multiple steps — without requiring explicit human instruction at each step.

The defining characteristic of an AI agent, as opposed to a simple chatbot or API wrapper, is autonomy: the agent decides what to do next based on its current state and objectives, rather than executing a predetermined script. This autonomy is what makes agents powerful — and what makes production governance necessary.

In Waxell: Waxell is built specifically for production AI agents — systems that take autonomous actions with real-world consequences — rather than for simple chat interfaces or single-turn completions.

Related: Agentic AI, Agentic Loop, Production Agent, Agentic Governance

Agentic AI / Agentic System

Agentic AI refers to the class of AI systems designed to act autonomously over extended periods — executing multi-step workflows, invoking tools, maintaining state across interactions, and making decisions without constant human input.

The shift from AI as a tool (you ask, it answers) to AI as an agent (you define a goal, it executes) represents a fundamental change in how AI interacts with software systems, data, and the real world. Agentic systems can write and run code, browse the web, send emails, call APIs, and take actions with irreversible real-world effects.

In Waxell: Waxell is built for the agentic era — providing the observability and governance infrastructure that makes agentic AI systems safe to operate in production environments.

Related: AI Agent, Agentic Loop, Agentic Governance, Production Agent

Agentic Loop (Reasoning Loop)

The agentic loop (also called the reasoning loop) is the core execution cycle of an AI agent — a repeating sequence in which the agent receives a context, reasons about the next action to take, executes that action (often via a tool call), observes the result, and updates its context before reasoning again.

The loop continues until the agent reaches a stopping condition: completing its task, hitting a defined limit, or encountering an error. Understanding the loop is foundational to understanding how agents fail — runaway loops, poor stopping conditions, and compounding errors are all failure modes rooted in how the loop is designed and governed.
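
The cycle can be sketched in a few lines of Python; `reason` and `execute_tool` here are hypothetical stand-ins for the model call and the tool dispatcher, and `max_steps` is the defined limit that guards against runaway loops:

```python
def run_agent_loop(reason, execute_tool, context, max_steps=10):
    """Run a reason -> act -> observe cycle until a stop condition is hit."""
    for step in range(max_steps):
        action = reason(context)            # decide the next action
        if action["type"] == "finish":      # natural stopping condition
            return action["result"]
        observation = execute_tool(action)  # act via a tool
        context.append(observation)         # update context, then loop
    raise RuntimeError("max_steps reached without a stopping condition")
```

The explicit `max_steps` ceiling is the simplest defense against the runaway-loop failure mode described above.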

In Waxell: Every iteration of an agentic loop generates structured observability data in Waxell — capturing the reasoning step, tool call, result, and token cost of each cycle in a queryable execution record.

Related: AI Agent, Trace / Tracing, Execution, Rate Limiting

Agentic Governance

Agentic governance is the set of practices, policies, and infrastructure used to ensure that AI agents operating autonomously remain within intended boundaries — doing what they are supposed to do, not more, and producing actions that are auditable and reversible when necessary.

Governance for AI agents is fundamentally different from governance for traditional software. Traditional software does exactly what it is coded to do; AI agents do what they infer is correct from context. This gap between intended behavior and actual behavior — and the need to close it in production — is what agentic governance addresses.

In Waxell: Waxell is an agentic governance platform: its core function is to make autonomous AI agents auditable, controllable, and safe to operate at scale — through runtime policy enforcement, execution tracing, and cost controls.

Related: Runtime Governance, Governance Plane, Policy Enforcement, Audit Trail

Multi-Agent System

A multi-agent system is an architecture in which multiple AI agents — each with specialized roles, tools, or knowledge — collaborate to complete tasks that would be difficult or impossible for a single agent to handle alone.

Multi-agent systems are increasingly common for complex workflows: one agent plans, another executes code, another searches the web, another validates outputs. The coordination benefits are significant, but so are the observability and governance challenges — errors compound, costs multiply, and tracing behavior across agents requires infrastructure that doesn't come with most AI frameworks.

In Waxell: Waxell supports hierarchical session grouping and cross-agent trace correlation, making it possible to observe and govern multi-agent workflows as unified systems rather than disconnected executions.

Related: Orchestrator / Orchestration, Session Grouping, Trace / Tracing, Agentic Governance

Orchestrator / Orchestration

An orchestrator in a multi-agent system is a coordinating agent (or system) responsible for decomposing a high-level goal into subtasks, assigning those subtasks to specialized sub-agents, managing the flow of information between them, and synthesizing their outputs into a final result.

Orchestration patterns vary — some use a central controller that directs all other agents; others use peer-to-peer handoffs; others use event-driven message passing. In all cases, the orchestrator layer is where governance is most important, because it determines what each sub-agent does and in what order.

In Waxell: Waxell instruments at the orchestration layer, capturing the full coordination graph of a multi-agent workflow — which agent called which, with what inputs, and with what result.

Related: Multi-Agent System, Session Grouping, Trace / Tracing, Agent Runtime

Tool (AI Agent Tool)

In the context of AI agents, a tool is any external function or capability that an agent can invoke to take an action or retrieve information beyond what the model alone can produce — including web search, code execution, database queries, API calls, file operations, and more.

Tools are what give AI agents their real-world impact. A model without tools can only generate text; a model with tools can search the internet, modify data, send messages, and execute code. This expanded capability is also expanded risk — every tool invocation is an opportunity for an agent to take an action with real-world consequences.
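
A minimal sketch of tool definition and dispatch; the schema shape here is illustrative, since each model provider defines its own tool-calling format:

```python
def make_tool(name, description, func):
    """Describe a callable so the model can choose to invoke it."""
    return {"name": name, "description": description, "run": func}

# A registry the agent's dispatcher can look tools up in.
TOOLS = {
    t["name"]: t for t in [
        make_tool("web_search", "Search the web for a query.",
                  lambda q: f"results for {q}"),
    ]
}

def invoke(tool_name, argument):
    """Dispatch a tool call chosen by the agent."""
    return TOOLS[tool_name]["run"](argument)
```

Because every real-world action flows through a dispatch point like `invoke`, that point is the natural place to log and govern tool use.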

In Waxell: Waxell tracks and governs every tool invocation — logging what was called, with what inputs, and what result was returned — and evaluating each call against active policies before execution completes.

Related: Tool Invocation, Policy Enforcement, Model Context Protocol (MCP), Audit Trail

Memory (Agent Memory)

Agent memory is the mechanism by which an AI agent retains and retrieves information across steps within a single execution (short-term memory) or across separate executions over time (long-term memory). Memory allows agents to build on previous reasoning, reference past context, and learn from earlier interactions.

Memory is a significant governance surface: what an agent remembers affects what it does. Persistent memory stores may accumulate sensitive data, stale information, or content introduced through earlier prompt injection attacks. Governing what flows into and out of memory stores is an important part of comprehensive agentic governance.

In Waxell: Waxell's signal domain controls extend to memory operations, providing visibility into what data agents read from and write to persistent memory stores.

Related: Context Window, Session Grouping, PII Scanning, Signal Domain

Context Window

The context window is the maximum amount of text (measured in tokens) that a language model can process in a single call — encompassing the system prompt, conversation history, tool outputs, retrieved documents, and any other content provided as input to the model.

Context window management is a critical optimization problem in production AI agents. Exceeding the context limit causes errors or truncation, while filling it with unnecessary content wastes tokens. Effective context management — knowing what to include, what to summarize, and what to discard — directly affects both agent performance and operating cost.
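
A keep-the-latest trimming strategy can be sketched as follows; the whitespace word counter is a crude stand-in for a provider tokenizer:

```python
def trim_to_context_window(system_prompt, history, max_tokens,
                           count_tokens=lambda s: len(s.split())):
    """Keep the system prompt plus the most recent history that fits."""
    budget = max_tokens - count_tokens(system_prompt)
    kept = []
    for message in reversed(history):   # newest messages first
        cost = count_tokens(message)
        if cost > budget:
            break                       # discard everything older
        kept.append(message)
        budget -= cost
    return [system_prompt] + list(reversed(kept))
```

Real deployments often summarize old history rather than dropping it outright, but the budget arithmetic is the same.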

In Waxell: Waxell's LLM call tracking captures prompt token counts for every model call, giving teams the data they need to understand context window utilization and optimize prompt structure.

Related: Token Budget, LLM Call Tracking, Memory (Agent Memory)

Production Agent

A production agent is an AI agent deployed in a live environment where it interacts with real users, real data, and real external systems — as opposed to a prototype or demo agent running in isolation with controlled inputs.

The gap between a prototype agent and a production agent is larger than it appears. Prototype agents run in controlled conditions with expected inputs; production agents encounter unexpected inputs, adversarial users, API failures, edge cases, and real cost. The infrastructure required to run agents safely in production — observability, governance, cost controls, audit trails — is exactly what Waxell provides.

In Waxell: Waxell is built for production agents: the Observe SDK, governance plane, and policy engine are designed for the conditions of real deployments, not demo environments.

Related: Agent Runtime, Agentic Governance, Agent Observability, Kill Switch

Observability
Agent Observability

Agent observability is the discipline of instrumenting AI agent systems to make their internal behavior externally visible — capturing structured data about every LLM call, tool invocation, decision point, and output so that operators can understand, debug, and improve agent performance in production.

Traditional software observability (metrics, logs, traces) was designed for deterministic systems where the same input always produces the same output. AI agents are non-deterministic: the same input can produce different outputs, and failure modes are often emergent rather than predictable. Agent observability requires a new approach — one designed for the stochastic, multi-step, tool-using nature of modern AI systems.

In Waxell: Waxell Observe is a production-grade agent observability platform — providing structured execution data, LLM call tracking, cost analytics, and session-level trace views for any Python AI agent.

Related: Telemetry, Trace / Tracing, LLM Call Tracking, Auto-Instrumentation

Telemetry

Telemetry is the automated collection, transmission, and storage of operational data from a running system — in the context of AI agents, this includes LLM call metrics, tool invocation data, token counts, latency measurements, error rates, and cost data collected without manual intervention.

Telemetry is the raw material of observability. Without it, operators are blind to what their agents are doing in production. Good telemetry is structured (queryable, not just logged), complete (capturing all relevant events, not just errors), and low-overhead (adding minimal latency or cost to agent execution).

In Waxell: Waxell automatically collects structured telemetry from instrumented agents — routing it to the Waxell dashboard and optionally exporting it as OpenTelemetry (OTel) spans to existing observability infrastructure.

Related: Agent Observability, OpenTelemetry (OTel) Span, LLM Call Tracking, Auto-Instrumentation

Logging

Logging in AI agent systems is the practice of recording discrete events during agent execution — including model calls, tool invocations, decision points, errors, and policy evaluations — as structured, timestamped records that can be queried, filtered, and retained for debugging and compliance purposes.

Log quality is a significant differentiator in production AI systems. Unstructured logs (raw text from print statements) are nearly impossible to query at scale. Structured logs — with consistent fields for agent ID, session ID, event type, inputs, outputs, and metadata — enable the programmatic analysis required to debug complex, multi-step agent failures.
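
The difference can be made concrete with a small sketch; the field names are illustrative, not a fixed Waxell schema:

```python
import json
import time
import uuid

def make_log_record(agent_id, session_id, event_type, payload):
    """Build a structured, timestamped log record with consistent fields."""
    return {
        "record_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "agent_id": agent_id,
        "session_id": session_id,
        "event_type": event_type,
        "payload": payload,
    }

# Serialize as JSON lines so records stay queryable at scale.
record = make_log_record("research-agent", "sess-42", "tool_call",
                         {"tool": "web_search", "status": "ok"})
line = json.dumps(record)
```

A `print` statement would capture the same facts, but only a record with consistent keys can answer a query like "all `tool_call` errors for this agent in the last hour."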

In Waxell: Waxell generates structured execution logs for every instrumented agent run — organized by session, attributable to specific agent versions, and exportable to external log management platforms.

Related: Telemetry, Audit Trail, Agent Observability, Trace / Tracing

Monitoring

Monitoring in the context of AI agents is the continuous observation of agent behavior and system health in production — tracking metrics like error rates, latency distributions, token consumption, cost per execution, and policy violation frequency over time.

Monitoring differs from logging in orientation: logs capture individual events; monitoring tracks aggregate patterns and trends. An individual failed tool call appears in a log; a rising error rate trend appears in monitoring. Both are necessary — logs for debugging specific incidents, monitoring for detecting systemic issues before they escalate.

In Waxell: Waxell provides real-time monitoring views across all instrumented agents — surfacing cost trends, error rates, and usage patterns in the dashboard, with alerting support for threshold violations.

Related: Logging, Alerting, Telemetry, Agent Observability

Alerting

Alerting in AI agent systems is the automated notification of operators when a monitored metric crosses a defined threshold — such as cost per session exceeding a limit, error rate spiking above a baseline, token consumption approaching a budget, or a policy violation being triggered.

Effective alerting is the bridge between monitoring and response. Without it, operators must watch dashboards continuously to catch problems. Well-configured alerts surface anomalies automatically — allowing teams to detect runaway agents, cost spikes, or policy breaches in time to take corrective action.
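
The threshold check at the core of alerting can be sketched as follows; the metric names are illustrative:

```python
def check_thresholds(metrics, thresholds):
    """Return an alert for every metric that crosses its threshold."""
    return [
        {"metric": name, "value": metrics[name], "limit": limit}
        for name, limit in thresholds.items()
        if metrics.get(name, 0) > limit
    ]
```

In practice this check runs on a schedule against aggregated monitoring data, and each returned alert is routed to a notification channel.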

In Waxell: Waxell's monitoring capabilities include configurable alerting on cost thresholds, error rates, and policy violation frequencies — notifying teams through the dashboard and integrated channels when agent behavior requires attention.

Related: Monitoring, Kill Switch, Rate Limiting, Token Budget

Sampling

Sampling in AI agent observability is the practice of capturing a representative subset of agent executions for detailed analysis — rather than recording full telemetry for every single execution — to manage storage costs and processing overhead at high volumes.

Sampling involves a trade-off: lower sampling rates reduce infrastructure costs but may cause rare failure modes to go undetected. Intelligent sampling strategies — such as always capturing error cases, capturing a percentage of normal executions, and always capturing high-value sessions — allow teams to balance cost and coverage effectively.
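
The always-capture-significant-cases strategy can be sketched as a single decision function; the execution fields are illustrative:

```python
import random

def should_capture(execution, base_rate=0.1, rng=random.random):
    """Decide whether to record full telemetry for an execution."""
    if execution.get("error") or execution.get("policy_violation"):
        return True              # always keep operationally significant runs
    return rng() < base_rate     # probabilistic sampling for the rest
```

The injectable `rng` parameter exists only to make the decision testable; production code would use the default.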

In Waxell: Waxell supports configurable sampling strategies, including always-capture for error conditions and policy violations, ensuring that the most operationally significant executions are always fully recorded regardless of sampling rate.

Related: Telemetry, Logging, Agent Observability, Audit Trail

Trace / Tracing

A trace is the complete record of a single agent execution — a structured log of every step the agent took, in order, including all model calls, tool invocations, inputs, outputs, latency, and token counts.

Traces are the primary debugging tool for AI agents. When an agent produces a wrong output, a trace lets you walk through the exact reasoning chain and identify where it went wrong. For multi-agent systems, a trace spans the full parent-child execution tree rather than a single agent's activity.

In Waxell: Every agent execution instrumented with Waxell Observe generates a full distributed trace, viewable in the Waxell dashboard with step-by-step replay capability.

Related: Execution, Session Grouping, LLM Call Tracking, OpenTelemetry (OTel) Span

LLM Call Tracking

LLM call tracking is the practice of capturing structured data about every request sent to a language model — including the model name, prompt, completion, token count (input and output), latency, cost, and any error responses.

LLM calls are the primary cost driver and performance bottleneck in most AI agent systems. Tracking them individually allows teams to understand exactly where tokens are being spent, which prompts are generating expensive outputs, and which model configurations produce the best cost-to-quality ratio.

In Waxell: Waxell automatically intercepts and logs LLM calls from OpenAI, Anthropic, LiteLLM, Groq, HuggingFace, and LangChain without any manual instrumentation.

Related: Token Budget, Cost Control, Auto-Instrumentation

Auto-Instrumentation

Auto-instrumentation is the ability to add observability to an existing codebase without manually inserting logging or tracking code at every LLM call or tool invocation — typically achieved through patching SDK clients at import time.

Manual instrumentation is fragile and time-consuming: it requires developers to remember to log every relevant event, keep instrumentation code up to date as the agent evolves, and handle edge cases like streaming or errors. Auto-instrumentation removes this burden entirely.
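
The patching mechanism can be sketched with a hypothetical `FakeClient` standing in for a real SDK client:

```python
import functools
import time

CALL_LOG = []

def instrument(method):
    """Wrap a client method so every call is logged with latency."""
    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = method(*args, **kwargs)
        CALL_LOG.append({
            "method": method.__name__,
            "latency_s": time.perf_counter() - start,
        })
        return result
    return wrapper

class FakeClient:
    """Hypothetical stand-in for an LLM SDK client."""
    def complete(self, prompt):
        return f"echo:{prompt}"

# Patch once at import time; all later calls are tracked automatically.
FakeClient.complete = instrument(FakeClient.complete)
```

Because the patch is applied to the class, every instance created anywhere in the application is instrumented, with no decorators or wrappers at the call sites.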

In Waxell: Waxell Observe auto-instruments OpenAI, Anthropic, LiteLLM, Groq, and HuggingFace SDK clients. Import the library and every subsequent LLM call is tracked — no decorators, no wrappers, no code changes required.

Related: LLM Call Tracking, Observe SDK, Agent Observability

Session Grouping

Session grouping is the practice of associating multiple related agent executions under a single logical session — so that a complex, multi-step, or multi-agent workflow can be traced and analyzed as a unified unit rather than as disconnected calls.

Without session grouping, each LLM call appears in isolation. A five-step research agent that calls the model ten times generates ten separate records with no visible relationship. Session grouping reconstructs the narrative: this set of calls was this agent, working on this task, for this user.
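
One common implementation propagates a session identifier implicitly using Python's `contextvars`; the field names here are illustrative:

```python
import contextvars
import uuid

# A context variable carries the active session across calls implicitly.
current_session = contextvars.ContextVar("current_session", default=None)

def start_session(parent_id=None):
    """Open a logical session; parent_id links child sessions to a parent."""
    session = {"id": str(uuid.uuid4()), "parent_id": parent_id}
    current_session.set(session)
    return session

def record_call(event):
    """Attach the active session to an event so calls stay correlated."""
    session = current_session.get()
    return {"session_id": session["id"] if session else None, "event": event}
```

The parent-child link is what allows a sub-agent's calls to roll up under the top-level task in a hierarchical view.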

In Waxell: Waxell supports hierarchical session grouping with parent-child relationships, making it possible to trace full multi-agent workflows from the top-level task down to individual tool calls.

Related: Trace / Tracing, Execution, Multi-Agent System

OpenTelemetry (OTel) Span

An OpenTelemetry (OTel) span is a single unit of work in a distributed trace — in AI agent systems, this typically corresponds to one LLM call, one tool invocation, or one step in an agent's reasoning loop — captured in the OpenTelemetry standard format.

OpenTelemetry is an open-source observability standard originally built for distributed software systems. Its adoption in AI tooling means that observability data from AI agents can be exported to any OTel-compatible backend — including Datadog, Grafana, Jaeger, and others — without vendor lock-in.

In Waxell: Waxell exports full OTel spans, enabling teams to route agent observability data to existing infrastructure alongside their application observability stack.

Related: Telemetry, Trace / Tracing, Agent Observability

Governance & Control
Runtime Governance

Runtime governance is the enforcement of rules and policies on AI agent behavior at the moment of execution — not before it runs (at configuration time) and not after it runs (at audit time), but during the live execution itself.

The distinction matters because many governance approaches are reactive: they log what happened, then investigate. Runtime governance is preventive: it intercepts agent actions before they complete and blocks, modifies, or escalates anything that falls outside permitted behavior. This is the only approach that can prevent a misbehaving agent from causing harm in real time.

In Waxell: Waxell's policy enforcement runs inline with agent execution — every tool call, model request, and content output can be evaluated against active policies before it proceeds.

Related: Governance Plane, Policy Enforcement, Kill Switch, Agentic Governance

Governance Plane

The governance plane is the infrastructure layer that sits between an AI agent and the outside world — intercepting agent actions, evaluating them against active policies, enforcing rules, and producing audit records of every decision made.

In traditional software, governance lives in code: engineers write validation logic, error handlers, and access controls directly into applications. For AI agents, this approach breaks down because agent behavior is dynamic and not fully predictable at development time. A governance plane externalizes these controls, allowing teams to update rules without redeploying agent code.

In Waxell: The Waxell governance plane is pluggable — it connects to existing agents via the Observe SDK (Phase A) and the full runtime layer (Phase B), without requiring changes to the agents themselves.

Related: Runtime Governance, Policy Enforcement, Pluggable Runtime

Policy Enforcement

Policy enforcement in the context of AI agents is the automated application of rules that define what agents are permitted to do — including which tools they can invoke, what content they can output, how much they can spend, and how they should behave when encountering edge cases.

Policies are distinct from prompts. A prompt asks an agent to behave a certain way; a policy enforces that it does. Policy enforcement is the mechanism that turns intentions into guarantees.
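
The evaluate-before-execute pattern can be sketched as follows; the policy shape (a predicate plus a decision) is an illustration, not Waxell's actual policy format:

```python
def enforce(policies, action):
    """Evaluate an action against active policies before it executes.

    First matching policy wins; the default is to allow.
    """
    for matches, decision in policies:
        if matches(action):
            return decision      # e.g. "block", "escalate"
    return "allow"

# Example: block any tool call that tries to delete data.
policies = [
    (lambda a: a["tool"] == "db_delete", "block"),
]
```

The key property is that `enforce` runs on the actual action the agent chose, not on the instructions it was given.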

In Waxell: Waxell supports eleven policy categories: Audit, Kill, Rate-Limit, Content, LLM, Safety, Control, Operations, Scheduling, Cost, and Quality — each enforceable at the runtime layer without modifying agent code.

Related: Runtime Governance, Governance Plane, Guardrails, Kill Switch

Guardrails

Guardrails are constraints applied to AI agent behavior to prevent outputs or actions that fall outside acceptable boundaries — such as generating harmful content, accessing prohibited data, or taking irreversible actions without human approval.

The term is often used loosely to describe any form of behavioral constraint applied to AI systems. In the context of production agents, effective guardrails must operate at runtime (not just at prompt design time), be auditable, and be adaptable without requiring agent code changes.

In Waxell: Waxell implements guardrails as enforceable runtime policies — not suggestions embedded in prompts, but active controls that intercept and evaluate agent actions before they complete.

Related: Policy Enforcement, Safety Policy, PII Scanning, Prompt Injection Defense

Kill Switch

A kill switch in an AI agent system is a mechanism that allows an operator to immediately halt a running agent — stopping all further actions, tool calls, and model requests — without waiting for the agent to reach a natural stopping point.

Kill switches are an essential safety control for production agents. If an agent enters a runaway loop, generates unexpected costs, or begins taking harmful actions, operators need the ability to stop it immediately. A kill switch that requires code deployment or manual intervention is not fast enough.
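
A minimal sketch of the mechanism, assuming the agent checks a shared flag before each step:

```python
import threading

class KillSwitch:
    """A shared flag checked before every agent step.

    Flipping it (e.g. from a dashboard or API) halts the agent at the
    next checkpoint, without redeploying code.
    """
    def __init__(self):
        self._halted = threading.Event()

    def trip(self):
        self._halted.set()       # operator halts the agent

    def check(self):
        if self._halted.is_set():
            raise RuntimeError("agent halted by kill switch")
```

In a runtime-governance design, the `check` call lives in the governance layer rather than in the agent's own code, so no agent can forget to honor it.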

In Waxell: Waxell's Kill policy category enables operators to halt any agent immediately from the dashboard or via API — without touching the agent's codebase or redeploying anything.

Related: Runtime Governance, Policy Enforcement, Rate Limiting

Rate Limiting

Rate limiting in AI agent systems is the practice of capping how frequently an agent can invoke a tool, make a model call, or perform a specific action — within a defined time window — to prevent runaway loops, resource exhaustion, or API abuse.

A common failure mode in production agents is the infinite loop: an agent repeatedly calls a tool or model because its stopping condition is never satisfied. Without rate limiting, this can exhaust API quotas, trigger provider-side bans, or generate unexpected costs within minutes.
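
A sliding-window limiter can be sketched as follows; the injectable clock exists only to make the behavior testable:

```python
import collections
import time

class RateLimiter:
    """Sliding-window limiter: at most max_calls per window_s seconds."""
    def __init__(self, max_calls, window_s, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock
        self.calls = collections.deque()

    def allow(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False         # over the limit: reject the call
        self.calls.append(now)
        return True
```

Enforced before the call is made, a limiter like this turns an infinite loop into a bounded, observable stream of rejections.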

In Waxell: Waxell's Rate-Limit policy category allows teams to configure invocation frequency limits per agent, per tool, per user, or per session — enforced at the runtime layer before the call is made.

Related: Kill Switch, Circuit Breaker, Token Budget

Circuit Breaker

A circuit breaker in an AI agent system is an automatic control that temporarily stops an agent from making further calls to a failing or degraded external service — preventing cascading failures and protecting downstream systems.

Borrowed from distributed systems engineering, the circuit breaker pattern is important for production AI agents that depend on external APIs (databases, web search, custom tools). If a downstream service begins returning errors or timing out, a circuit breaker trips and suspends the agent's ability to call that service until it recovers.
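
A simplified sketch of the pattern (trip after consecutive failures, retry after a cool-down); the injectable clock is for testability:

```python
import time

class CircuitBreaker:
    """Trip after max_failures consecutive errors; retry after reset_s."""
    def __init__(self, max_failures=3, reset_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_s = reset_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.reset_s:
            self.opened_at = None    # cool-down over: try the service again
            self.failures = 0
            return True
        return False                 # open: calls to the service are suspended

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

    def record_success(self):
        self.failures = 0
```

Production implementations usually add a half-open state that admits a single probe call; this sketch collapses that into an immediate reset for brevity.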

In Waxell: Waxell's Operations policy category includes circuit breaker support, alongside timeouts and retry configuration — giving teams control over failure behavior without modifying agent code.

Related: Rate Limiting, Policy Enforcement, Production Agent

Human-in-the-Loop (HITL)

Human-in-the-loop (HITL) is a design pattern in which a human is required to review, approve, or intervene at defined points during an AI agent's execution — rather than allowing the agent to proceed autonomously through all steps.

HITL is not a limitation; it is a deliberate control mechanism. For high-stakes or irreversible actions (sending an email, executing a financial transaction, modifying a database), requiring human confirmation before the agent proceeds is a responsible design choice. The challenge is implementing HITL without breaking the agent's execution state or requiring complex application logic.
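
The checkpoint pattern can be sketched as follows; `needs_approval` and `request_approval` are hypothetical hooks, the first flagging irreversible actions and the second routing the request to a designated human and returning their decision:

```python
def run_with_approval(action, needs_approval, request_approval):
    """Pause at a checkpoint for high-stakes actions."""
    if needs_approval(action) and not request_approval(action):
        return {"status": "rejected", "action": action}
    return {"status": "executed", "action": action}
```

The hard part in practice is not this branch but preserving the agent's execution state while the approval request is pending, which is why checkpointing tends to live in infrastructure rather than application logic.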

In Waxell: Waxell's Control policy category supports configurable escalation paths — pausing agent execution at defined checkpoints and routing approval requests to designated humans before the agent resumes.

Related: Policy Enforcement, Runtime Governance, Kill Switch

Safety & Compliance
Audit Trail

An audit trail in an AI agent system is an immutable, timestamped record of every action the agent took — including every model call, tool invocation, input received, output produced, policy decision enforced, and cost incurred.

Audit trails are required for compliance in regulated industries and essential for debugging in any context. An effective audit trail is not just a log — it is a tamper-resistant, structured record that can demonstrate what the agent did and why at any point in the past.

In Waxell: Waxell's Audit policy category creates a complete, durable execution record for every agent run. Audit records are structured, queryable, and exportable for compliance reporting.

Related: Execution, Telemetry, Agentic Governance

PII Scanning

PII scanning in AI agent systems is the automated detection and handling of personally identifiable information (PII) — such as names, email addresses, phone numbers, or financial data — in the content that flows into or out of an AI agent.

AI agents frequently process user-generated content that contains PII. Without scanning, that data can be inadvertently sent to third-party LLM providers, stored in logs, or returned in outputs — creating privacy risks and compliance violations. PII scanning intercepts this data at the point of transit and either redacts, masks, or blocks it based on configured policy.
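
A naive sketch of redaction at the point of transit; real scanners use far more robust detection than these two illustrative regexes:

```python
import re

# Illustrative patterns only; production scanners cover many more categories.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact_pii(text):
    """Replace detected PII with typed placeholders before transit."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text
```

The typed placeholders preserve enough structure for the model to reason about the content ("there was an email address here") without the sensitive value ever leaving the boundary.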

In Waxell: Waxell's Content policy category includes PII scanning on both inputs and outputs, enforced at the runtime layer before data reaches the model or is returned to the user.

Related: Content Policy, Guardrails, Audit Trail

Content Policy

A content policy in an AI agent system is a configurable rule set that governs what types of content the agent is allowed to process as input and produce as output — including restrictions on PII, hate speech, prompt injection attempts, off-topic content, or any other defined category.

Content policies are applied at the runtime layer, not at prompt design time. This distinction is important: a well-written system prompt may reduce the likelihood of bad outputs, but it cannot prevent them. A content policy intercepts and evaluates actual inputs and outputs, enforcing boundaries that prompts alone cannot provide.

In Waxell: Waxell's Content policy category allows teams to configure scanning rules for both directions of data flow — input from users and output from the agent — with block, redact, and alert actions available.

Related: PII Scanning, Guardrails, Prompt Injection Defense

Safety Policy

A safety policy in an AI agent system defines the behavioral boundaries that an agent must operate within — specifying what actions are permitted, what actions are prohibited, and how the agent should respond when it encounters a scenario outside its intended scope.

Safety policies codify the intent behind an agent's design into enforceable runtime rules. Rather than relying on the model to generalize from a system prompt, safety policies create explicit constraints that apply regardless of what the user inputs or what the model generates.

In Waxell: Waxell's Safety policy category allows teams to define behavioral boundaries at the runtime layer — separate from prompt engineering, enforced consistently across all agent invocations.

Related: Guardrails, Policy Enforcement, Content Policy

Prompt Injection / Prompt Injection Defense

Prompt injection is a class of attack in which a malicious actor embeds instructions in content that the AI agent reads — such as a webpage, document, email, or tool output — with the goal of hijacking the agent's behavior and causing it to take unauthorized actions.

As AI agents are given access to external data sources (web browsers, email clients, document stores), prompt injection becomes a critical production risk. An agent instructed to summarize a document can be tricked by hidden instructions embedded in that document that redirect the agent's behavior entirely.

In Waxell: Waxell's Content policy category includes prompt injection detection, scanning tool outputs and retrieved content for injection patterns before the agent processes them.

Related: Content Policy, Guardrails, Safety Policy

Cost & Performance
Token Budget

A token budget is an explicit limit on the number of tokens — the fundamental unit of LLM computation — that an AI agent is permitted to consume within a defined scope, such as a single session, a user account, or a time period.

Tokens translate directly to cost. An agent without a token budget can consume thousands of dollars in API spend if it enters a loop, processes unexpectedly large inputs, or is exploited by adversarial users. A token budget converts an unbounded cost exposure into a predictable, manageable expense.
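
A minimal sketch of budget enforcement for a single scope, such as one session:

```python
class TokenBudget:
    """Track token consumption against a ceiling for one scope."""
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def charge(self, tokens):
        """Record usage; refuse once the ceiling would be exceeded."""
        if self.used + tokens > self.limit:
            raise RuntimeError(
                f"token budget exceeded: {self.used + tokens} > {self.limit}"
            )
        self.used += tokens
```

Raising is only one possible behavior at the ceiling; alerting or degrading to a cheaper model are common alternatives.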

In Waxell: Waxell's Cost and LLM policy categories allow teams to set token ceilings at any granularity — per agent, per user, per session, or globally — with configurable behavior when the budget is reached (halt, alert, or degrade gracefully).

Related: Cost Control, LLM Call Tracking, Rate Limiting

Cost Control / Agent Cost Optimization

Cost control in AI agent systems is the practice of monitoring, limiting, and optimizing the spend incurred by AI agents making calls to LLM providers and external APIs — through a combination of real-time tracking, token budgets, model selection policies, and usage limits.

Agent cost is not predictable by default. Unlike a traditional API that charges per request, LLM costs scale with token count — and token count is determined by the content flowing through the agent at runtime, not by the developer at design time. Effective cost control requires visibility into per-call costs and the ability to enforce limits before spend occurs.

In Waxell: Waxell's Cost policy category enforces spending limits per agent, per user, and per session, while the Observe SDK tracks per-call costs in real time across all supported providers.

Related: Token Budget, LLM Call Tracking, Agent Budget

Agent Budget

An agent budget is a configured set of resource limits — including token spend, API call count, wall-clock time, and monetary cost — that defines the maximum resources a specific agent is permitted to consume during a single execution or within a defined period.

Budgets differ from rate limits in scope: a rate limit controls frequency (calls per minute), while a budget controls total consumption (tokens or dollars per session). Both are necessary for responsible production operation.
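The distinction can be sketched in a few lines: a rate limit forgets old calls as its window slides, while a budget only ever counts down. Illustrative code, not Waxell's API:

```python
from collections import deque

class RateLimit:
    """Controls frequency: at most max_calls within a sliding window_s seconds."""
    def __init__(self, max_calls: int, window_s: float):
        self.max_calls, self.window_s = max_calls, window_s
        self.calls = deque()

    def allow(self, now: float) -> bool:
        # Evict timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False

class Budget:
    """Controls total consumption: tokens per session, regardless of pacing."""
    def __init__(self, total_tokens: int):
        self.remaining = total_tokens

    def spend(self, tokens: int) -> bool:
        if tokens > self.remaining:
            return False
        self.remaining -= tokens
        return True
```

An agent can stay comfortably under its rate limit while still exhausting its budget — which is exactly why both controls are needed.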

In Waxell: Waxell treats budgets as explicit, named configurations — not emergent defaults. Each agent in the Waxell registry can be assigned a budget that is enforced at runtime and visible in the dashboard.

Related: Token Budget, Cost Control, Rate Limiting

Architecture
Agent Runtime

The agent runtime is the execution environment that hosts an AI agent — providing the infrastructure for the agent's reasoning loop, tool invocations, state management, and communication with external systems.

The runtime is where governance must live. By inserting a governance layer at the runtime level — between the agent and the outside world — teams can enforce policies, intercept tool calls, and record execution data without modifying the agent's application code. This is the architectural basis for runtime governance.
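The interception pattern can be sketched as a plain function wrapper: the governance layer evaluates a policy, records the decision, and only then forwards the call. Every name here is hypothetical, not Waxell's runtime API:

```python
# Toy sketch of runtime-level interception: a proxy sits between the agent
# and its tools, so policy and audit apply without touching agent code.

def governed(tool_fn, policy, audit_log):
    """Wrap a tool so every call is policy-checked and recorded."""
    def wrapper(*args, **kwargs):
        decision = policy(tool_fn.__name__, args, kwargs)
        audit_log.append((tool_fn.__name__, decision))
        if decision != "allow":
            raise PermissionError(f"{tool_fn.__name__} blocked by policy")
        return tool_fn(*args, **kwargs)
    return wrapper

def send_email(to: str) -> str:
    return f"sent to {to}"

def deny_external(tool, args, kwargs):
    # Toy policy: only allow email to the company domain.
    return "allow" if args and args[0].endswith("@example.com") else "block"

log: list = []
safe_send = governed(send_email, deny_external, log)
```

The agent calls `safe_send` exactly as it would call `send_email` — which is the sense in which governance lives at the runtime boundary rather than in application code.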

In Waxell: The Waxell runtime (launching Phase B, April 2026) is a pluggable layer that deploys between agent code and the external systems it interacts with — adding governance without requiring rewrites of existing agents.

Related: Governance Plane, Pluggable Runtime, Agentic Architecture

Pluggable Runtime

A pluggable runtime is an agent execution layer that can be added to an existing agent deployment without modifying the agent's core logic — plugging in between the agent and the world rather than requiring the agent to be rebuilt around it.

The alternative to a pluggable runtime is a framework-native solution, which only works for agents built on a specific framework (like LangChain or LlamaIndex). A pluggable runtime works with any agent — existing or new, simple or complex — because it operates at the boundary between the agent and external systems, not inside the agent itself.

In Waxell: Waxell is pluggable by design. The Observe SDK integrates in under 60 seconds with any Python agent, and the runtime layer deploys between agents and their tools without requiring changes to agent code.

Related: Agent Runtime, Governance Plane, Agentic Architecture

Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open protocol developed by Anthropic that defines a standard interface for connecting AI agents to external tools, data sources, and services — allowing agents to call external capabilities through a consistent, interoperable format.

MCP is to AI agents what HTTP is to the web: a standard that lets agents communicate with external tools regardless of who built them. An MCP-compatible agent can call any MCP server; an MCP server can be called by any MCP-compatible agent. This standardization is dramatically accelerating the growth of the ecosystem of AI-accessible tools.
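Concretely, MCP messages are JSON-RPC 2.0 requests; per the public MCP specification, a tool call uses the `tools/call` method with a tool name and arguments. The tool name and query below are invented for illustration:

```python
import json

# An MCP tool call on the wire: a JSON-RPC 2.0 request using the
# "tools/call" method defined by the MCP specification.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "web_search",  # a tool exposed by some MCP server (made up here)
        "arguments": {"query": "agent observability"},
    },
}

wire = json.dumps(request)
```

Because every tool call shares this shape, a governance layer can intercept and audit MCP traffic generically, without per-tool integration work.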

In Waxell: Waxell is MCP-native, with first-class support for instrumenting, governing, and auditing MCP tool calls. This makes Waxell the only observability and governance platform built specifically for the MCP ecosystem.

Related: Tool Invocation, Pluggable Runtime, Policy Enforcement

Tool Invocation

Tool invocation is the action of an AI agent calling an external capability — such as a web search, database query, code interpreter, email sender, or API endpoint — to retrieve information or take an action beyond what the model alone can perform.

Tool invocations are where AI agents produce real-world effects. A model call generates text; a tool invocation can send an email, modify a database, or execute a transaction. This is why tool invocations are the primary target of governance policies and the primary subject of audit records.

In Waxell: Waxell tracks and governs every tool invocation — logging inputs, outputs, latency, and success/failure status, and evaluating each invocation against active policies before it executes.
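A minimal version of that per-invocation record — inputs, output, latency, and success/failure status — can be sketched as a wrapper. This is illustrative only; Waxell's actual record schema is richer:

```python
import time

def traced_invoke(tool_fn, *args, **kwargs):
    """Invoke a tool and return (result, record), where record captures the
    fields a governance layer typically logs per invocation."""
    record = {"tool": tool_fn.__name__, "args": args, "kwargs": kwargs}
    start = time.perf_counter()
    try:
        result = tool_fn(*args, **kwargs)
        record.update(status="success", output=result)
        return result, record
    except Exception as exc:
        record.update(status="error", error=str(exc))
        raise
    finally:
        # Runs on both paths, so latency is recorded even on failure.
        record["latency_s"] = time.perf_counter() - start

def lookup(order_id: str) -> str:
    return f"order {order_id}: shipped"
```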

Related: Model Context Protocol (MCP), Policy Enforcement, Audit Trail

Agent Registry

An agent registry is a centralized catalog that defines what agents exist in a system — their identities, capabilities, permitted tool sets, assigned policies, and current status — serving as the source of truth for what is allowed to run and under what conditions.

Without a registry, production AI deployments are invisible at the organizational level: individual teams deploy agents, but there is no central record of what agents exist, what they do, or what governance applies to them. A registry brings the same discipline to AI agents that infrastructure inventories bring to servers and services.
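At its simplest, a registry is a map from agent identity to permitted capabilities, consulted before anything runs. The sketch below is a toy model, not the Waxell Registry's schema:

```python
from dataclasses import dataclass, field

# Hypothetical registry entry; field names are invented for illustration.
@dataclass
class AgentEntry:
    agent_id: str
    permitted_tools: set[str]
    policies: list[str] = field(default_factory=list)
    status: str = "inactive"

class Registry:
    def __init__(self):
        self._agents: dict[str, AgentEntry] = {}

    def register(self, entry: AgentEntry) -> None:
        self._agents[entry.agent_id] = entry

    def may_use(self, agent_id: str, tool: str) -> bool:
        """Only registered agents may run, and only with their permitted tools."""
        entry = self._agents.get(agent_id)
        return entry is not None and tool in entry.permitted_tools
```

The default-deny behavior — unknown agent or unlisted tool means no — is what makes the registry the source of truth rather than a passive inventory.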

In Waxell: The Waxell Registry is a core component of the governance plane, providing a structured definition layer where every agent's identity, behavior, and policy assignments are recorded before deployment.

Related: Agentic Governance, Governance Plane, Policy Enforcement

Signal Domain

A signal domain is the defined set of data inputs an AI agent is permitted to read, and the defined set of outputs it is permitted to write — creating an explicit interface boundary that governs what information flows in and out of the agent.

Uncontrolled data access is a governance risk: an agent with broad read permissions can inadvertently process PII, access confidential business data, or be manipulated through injected content. Defining a signal domain restricts the agent to only the data it needs, reducing both the attack surface and the compliance footprint.
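In code, a signal domain reduces to two allow-lists: one filtering what the agent may read, one gating where it may write. The field and channel names below are invented for illustration:

```python
# Toy sketch of an interface boundary; not Waxell's actual Signal/Domain API.
class SignalDomain:
    def __init__(self, readable: set, writable: set):
        self.readable = readable
        self.writable = writable

    def filter_inputs(self, record: dict) -> dict:
        """Drop any field the agent is not permitted to read."""
        return {k: v for k, v in record.items() if k in self.readable}

    def check_output(self, channel: str) -> bool:
        """Allow writes only to explicitly permitted channels."""
        return channel in self.writable

domain = SignalDomain(readable={"ticket_text", "product"}, writable={"crm"})
```

Fields outside the allow-list never reach the model at all, which is how the boundary shrinks both the attack surface and the compliance footprint.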

In Waxell: Signal and Domain are dedicated capability categories in the Waxell platform, providing interface boundary controls that govern the data layer separately from the model and tool layers.

Related: PII Scanning, Content Policy, Governance Plane

Waxell SDK
Waxell Observe SDK

The Waxell Observe SDK is a Python library (pip install waxell-observe) that adds production-grade observability to existing AI agents — automatically tracking LLM calls, tool invocations, token counts, latency, and costs across all major providers with minimal code changes.

The Observe SDK is Waxell's Phase A product, designed to solve the immediate problem of agents executing blind in production, with no record of what they call, spend, or return. It works by patching supported provider SDKs at import time (auto-instrumentation), capturing structured data about every LLM call, and routing that data to the Waxell dashboard and telemetry pipeline. Supported providers: OpenAI, Anthropic, LiteLLM, Groq, HuggingFace, and LangChain.
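The auto-instrumentation idea — patching a provider client so every existing call site is captured without code changes — can be illustrated with a toy client. This shows the general pattern, not the Observe SDK's actual internals:

```python
# Toy stand-in for a provider SDK; not a real client library.
CALLS: list = []

class FakeProviderClient:
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def instrument(client_cls):
    """Patch the class in place so all existing call sites are covered."""
    original = client_cls.complete
    def patched(self, prompt):
        result = original(self, prompt)
        CALLS.append({"prompt_chars": len(prompt), "response_chars": len(result)})
        return result
    client_cls.complete = patched
    return client_cls

instrument(FakeProviderClient)
client = FakeProviderClient()  # callers are unchanged, yet every call is recorded
```

Patching at the class (or module) level is what makes the integration near zero-touch: application code keeps constructing and calling the client exactly as before.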

In Waxell: The Observe SDK is the entry point to the Waxell platform — deployable in under 60 seconds, requiring zero changes to existing agent logic.

Related: Auto-Instrumentation, LLM Call Tracking, WaxellContext, @observe Decorator

@observe Decorator

The @observe decorator is a Python function decorator provided by the Waxell Observe SDK that wraps any function in a traced execution context — automatically capturing inputs, outputs, latency, and errors as a named span in the agent's execution trace.

Decorators provide a lightweight, explicit way to add observability to custom logic that falls outside the auto-instrumented provider calls. By adding @observe to a function, teams can trace any step in their agent's workflow — not just LLM calls — with a single line of code.
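A stripped-down stand-in shows the pattern @observe implements: wrap a function, time it, and record a named span whether it succeeds or raises. The real decorator captures far more, and its signature here is assumed:

```python
import functools
import time

TRACE: list = []  # stand-in for the SDK's trace sink

def observe(name=None):
    """Minimal stand-in for a tracing decorator (illustrative, not Waxell's)."""
    def decorate(fn):
        span_name = name or fn.__name__
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
                TRACE.append({"span": span_name, "ok": True,
                              "latency_s": time.perf_counter() - start})
                return result
            except Exception:
                TRACE.append({"span": span_name, "ok": False,
                              "latency_s": time.perf_counter() - start})
                raise
        return wrapper
    return decorate

@observe(name="rank_results")
def rank(items):
    return sorted(items)
```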

In Waxell: @observe is part of the Waxell Observe SDK and integrates with WaxellContext to attach metadata, tags, and scores to traced steps.

Related: Waxell Observe SDK, WaxellContext, Trace / Tracing

WaxellContext

WaxellContext is the context manager in the Waxell Observe SDK that allows developers to attach structured metadata — such as user IDs, session identifiers, tags, custom scores, and environment labels — to any execution span, enriching trace data beyond raw LLM call metrics.

Raw observability data (token counts, latency, costs) answers what happened. Context metadata answers who did it, where, and why. By attaching a user ID and session ID to every trace, teams can filter and segment observability data by customer, workflow type, or business context.
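The propagation mechanism can be sketched with Python's contextvars: metadata set on entry is visible to anything traced inside the block and restored on exit. This is a sketch of the pattern, not WaxellContext's real interface:

```python
import contextvars
from contextlib import contextmanager

# Context-local metadata store; each `agent_context` block layers fields on top.
_metadata = contextvars.ContextVar("metadata", default={})

@contextmanager
def agent_context(**fields):
    """Attach metadata (user ID, session ID, tags) to everything inside the block."""
    token = _metadata.set({**_metadata.get(), **fields})
    try:
        yield
    finally:
        _metadata.reset(token)  # restore the previous metadata on exit

def current_metadata() -> dict:
    return dict(_metadata.get())
```

Any span recorded inside the block can read `current_metadata()` and stamp the user and session onto its trace record — which is what makes segmentation by customer or workflow possible downstream.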

In Waxell: WaxellContext is the primary interface for enriching Waxell traces with business-level metadata, enabling segmentation and filtering in the Waxell dashboard.

Related: Waxell Observe SDK, @observe Decorator, Session Grouping

Execution

An execution in Waxell is the durable, structured record of a single agent run — capturing every step taken, every decision made, every tool called, every token consumed, and every policy evaluated, from start to finish.

Executions are the atomic unit of the Waxell audit system. Each execution is timestamped, attributed to a specific agent and session, and stored as an immutable record. When something goes wrong in production, executions are what you query to understand exactly what happened.

In Waxell: Executions are surfaced in the Waxell dashboard with full replay capability — allowing teams to step through an agent's reasoning, inspect individual LLM calls and tool invocations, and identify the exact point of failure.

Related: Trace / Tracing, Audit Trail, Session Grouping

Waxell

Waxell provides a governance and orchestration layer for building and operating autonomous agent systems in production.

© 2026 Waxell. All rights reserved.

Patent Pending.
