Phase 1: Add Observability

You are here if: you have existing AI agents (LangChain, CrewAI, custom Python, or anything else) and you want to add observability without rewriting them.

What you will have after this phase: execution traces, LLM call visibility, cost tracking, and basic policy enforcement -- all without changing your agent logic.


What You Get

Execution Run Tracking

Every agent invocation becomes a tracked run in the Waxell control plane. Each run records:

  • Agent name and workflow
  • Start time and duration
  • Input parameters and output results
  • Final status (success, error, or policy-blocked)

LLM Call Visibility

See exactly what your agents are doing under the hood:

  • Which models are being called and how often
  • Token counts (prompt and completion) per call
  • Prompt and response previews for debugging
  • Cost estimates based on per-model pricing tables

Cost Tracking

Automatic cost estimation for 20+ LLM models. The control plane aggregates costs by agent, workflow, tenant, and time period. Set budget policies to get alerts or block execution when spending exceeds thresholds.
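The pricing-table idea can be sketched in a few lines. Note that the rates and the `estimate_cost` helper below are illustrative placeholders, not Waxell's actual pricing data or API:

```python
# Illustrative sketch of per-model pricing-table cost estimation.
# The rates below are made-up placeholders, not Waxell's actual tables.
PRICE_PER_1K_TOKENS = {
    # model: (prompt rate, completion rate) in USD per 1K tokens
    "gpt-4o": (0.0025, 0.0100),
    "gpt-4o-mini": (0.00015, 0.0006),
}

def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate the USD cost of a single LLM call from its token counts."""
    prompt_rate, completion_rate = PRICE_PER_1K_TOKENS[model]
    return (tokens_in / 1000) * prompt_rate + (tokens_out / 1000) * completion_rate

# A call with 150 prompt tokens and 80 completion tokens against gpt-4o:
cost = estimate_cost("gpt-4o", tokens_in=150, tokens_out=80)
```

Aggregation by agent, workflow, or tenant is then just summing these per-call estimates over the matching runs.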

Policy Enforcement

Pre-execution policy checks run before your agent starts. Policies can:

  • Allow the run to proceed
  • Block the run with a PolicyViolationError
  • Warn (allow but log a warning)
  • Throttle (rate-limit executions)
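A blocked run surfaces to your calling code as an exception. The sketch below shows the caller-side pattern; the `PolicyViolationError` class and the failing agent are local stand-ins so the example is self-contained (in a real integration the exception would come from waxell_observe, and its import path may differ):

```python
import asyncio

class PolicyViolationError(Exception):
    """Stand-in; in a real integration this would come from waxell_observe."""

async def run_my_agent(query: str) -> str:
    # Stand-in for a decorated agent whose pre-execution policy check fails.
    raise PolicyViolationError("daily token budget exceeded")

async def main() -> str:
    try:
        return await run_my_agent("summarize this ticket")
    except PolicyViolationError as exc:
        # Blocked before the agent started: no LLM calls made, no cost incurred.
        return f"blocked: {exc}"

result = asyncio.run(main())
```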

How to Add It

Choose the integration pattern that fits your codebase. All three patterns provide the same observability features.

Pattern 1: Decorator (Simplest)

Best for standalone agent functions.

from waxell_observe import waxell_agent

@waxell_agent(agent_name="my-agent")
async def run_my_agent(query: str, waxell_ctx=None) -> str:
    result = await my_llm_call(query)

    # Optional: record individual LLM calls for cost tracking
    if waxell_ctx:
        waxell_ctx.record_llm_call(
            model="gpt-4o",
            tokens_in=150,
            tokens_out=80,
        )

    return result

The decorator handles run lifecycle automatically: it starts a run on entry, captures inputs/outputs, checks policies, and completes the run on exit.
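That start/capture/complete sequence can be pictured with a minimal stand-in decorator. This is not waxell_agent's implementation, just a sketch of the lifecycle it manages, with an in-memory list standing in for the control plane:

```python
import asyncio
import functools

RUNS: list = []  # stand-in sink; in Waxell this telemetry goes to the control plane

def tracking_decorator(agent_name: str):
    """Minimal sketch of the lifecycle a decorator like waxell_agent manages:
    start a run, capture inputs and outputs, record the final status."""
    def wrap(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            run = {"agent": agent_name, "inputs": {"args": args, "kwargs": kwargs}}
            try:
                run["output"] = await fn(*args, **kwargs)
                run["status"] = "success"
                return run["output"]
            except Exception as exc:
                run["status"] = "error"
                run["error"] = str(exc)
                raise
            finally:
                RUNS.append(run)  # the run completes on exit, success or error
        return wrapper
    return wrap

@tracking_decorator(agent_name="demo-agent")
async def echo(query: str) -> str:
    return query.upper()

out = asyncio.run(echo("hello"))
```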

Pattern 2: Context Manager (Fine-grained)

Best when you need explicit control over what gets recorded.

from waxell_observe import WaxellContext

async with WaxellContext(agent_name="my-agent") as ctx:
    # Your existing agent code -- unchanged
    result = await my_agent.run(query)

    # Record LLM calls
    ctx.record_llm_call(model="gpt-4o", tokens_in=500, tokens_out=200)

    # Record execution steps
    ctx.record_step("classify", output={"category": "billing"})
    ctx.record_step("generate_response", output={"length": 250})

    # Set the final result
    ctx.set_result({"answer": result})

The context manager gives you access to record_llm_call(), record_step(), set_result(), and check_policy() for mid-execution policy checks.
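Mid-execution checks matter most inside loops, where you want to stop before the next expensive call rather than after. The sketch below uses a hypothetical stand-in context with a token-budget policy to show the pattern; check_policy()'s actual signature and return type in waxell_observe may differ:

```python
class StandInContext:
    """Self-contained stand-in for a WaxellContext-like object, used only to
    illustrate the mid-execution check_policy() pattern. The real API may differ."""

    def __init__(self, token_budget: int):
        self.token_budget = token_budget
        self.tokens_used = 0

    def record_llm_call(self, model: str, tokens_in: int, tokens_out: int) -> None:
        self.tokens_used += tokens_in + tokens_out

    def check_policy(self) -> bool:
        # False once the run has exceeded its token budget.
        return self.tokens_used <= self.token_budget

ctx = StandInContext(token_budget=1000)
completed_steps = 0
for step in range(10):
    if not ctx.check_policy():
        break  # stop the loop instead of burning more tokens
    ctx.record_llm_call(model="gpt-4o", tokens_in=150, tokens_out=50)
    completed_steps += 1
```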

Pattern 3: LangChain Callback (Zero-change)

Best for LangChain agents. LLM calls are captured automatically via callback hooks.

from waxell_observe.integrations.langchain import WaxellLangChainHandler

handler = WaxellLangChainHandler(agent_name="my-langchain-agent")

# Your existing chain -- unchanged
result = chain.invoke(
    {"question": "How do I reset my password?"},
    config={"callbacks": [handler]},
)

# Flush buffered telemetry when done
handler.flush_sync(result={"output": result.content})

The handler automatically intercepts LLM calls, chain executions, and tool usage. No manual record_llm_call() needed.


What to Do After Setup

1. Review Your Dashboard

Log in to your Waxell control plane and explore the runs view. You should see:

  • Recent agent executions with status and duration
  • LLM calls with model, token, and cost breakdowns
  • Execution steps showing the flow of each run

2. Set Up Cost Alerts

In the control plane, configure budget policies for your agents:

  • Daily token limits: Block or warn when an agent exceeds a token budget
  • Per-run cost limits: Prevent expensive individual runs
  • Aggregate spend alerts: Get notified when total spend crosses a threshold

3. Configure Policies

Start with conservative policies and tune as you learn your agents' behavior:

  • Rate limiting: Prevent runaway loops from overwhelming your LLM provider
  • Budget enforcement: Set spending limits per agent and per tenant
  • Execution policies: Control which agents can run and when
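To build intuition for what a rate-limiting policy does, here is a classic token-bucket limiter in miniature. This is a generic sketch of the technique, not Waxell's throttle implementation:

```python
import time

class TokenBucket:
    """Illustrative token-bucket rate limiter of the kind a throttle policy
    might apply; not Waxell's actual implementation."""

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec          # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Spend one token if available; otherwise reject the execution."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# With no refill, only the first `capacity` executions are allowed:
bucket = TokenBucket(rate_per_sec=0.0, capacity=3)
results = [bucket.allow() for _ in range(5)]
```

A runaway agent loop hitting such a limiter is throttled after its burst allowance, rather than hammering your LLM provider indefinitely.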

4. Explore Agent Behavior

With observability in place, you can now answer questions you could not before:

  • Which agents are the most expensive to run?
  • What is the average token usage per agent invocation?
  • Are there agents that fail frequently? Why?
  • How much does each LLM call cost relative to the value it produces?

What You Do Not Get (Yet)

Phase 1 focuses on observability and basic governance. The following capabilities require further migration:

  • Durable workflows: Checkpoint/resume requires native Waxell workflows (Phase 4)
  • Approval workflows: Pause/resume with human-in-the-loop requires native Waxell (Phase 4)
  • Signal-driven execution: Webhook triggering requires Phase 2
  • Centralized orchestration: Coordinating multiple agents from one place requires Phase 2

Next Steps