
Waxell Observe

You already have agents -- add observability in 2 lines of code.

Waxell Observe is a lightweight Python package that brings LLM call tracking, cost management, and policy enforcement to any AI agent. It works with any Python agent framework -- LangChain, LlamaIndex, CrewAI, custom code, or anything else. No vendor lock-in, no runtime changes, no migration required.

Fastest Path: Auto-Instrumentation

Two lines to automatically trace all LLM calls across 200+ providers:

import waxell_observe as waxell
waxell.init(api_key="wax_sk_...", api_url="https://acme.waxell.dev")

# Import LLM SDKs AFTER init() -- they're now auto-instrumented
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
# Automatically traced with model, tokens, cost, latency
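
Why the import order in the comment above can matter: the page does not say how Waxell instruments SDKs, so the following is only a sketch under one assumed mechanism, namely that init() replaces an SDK entry point with a traced wrapper. `fake_sdk`, `complete`, and this `init()` are hypothetical stand-ins, not Waxell or OpenAI code.

```python
import sys
import types

# Hypothetical stand-in for an LLM SDK module; not a real package.
fake_sdk = types.ModuleType("fake_sdk")


def _complete(prompt):
    return f"echo: {prompt}"


fake_sdk.complete = _complete
sys.modules["fake_sdk"] = fake_sdk

calls = []  # what a tracer would record


def init():
    """Assumed instrumentation step: wrap the SDK entry point in place."""
    original = fake_sdk.complete

    def traced(prompt):
        calls.append(prompt)
        return original(prompt)

    fake_sdk.complete = traced


# Name bound BEFORE init(): keeps pointing at the unwrapped function.
from fake_sdk import complete as early_complete

init()

# Name bound AFTER init(): picks up the wrapped function.
from fake_sdk import complete as late_complete

early_complete("a")  # bypasses the wrapper, nothing recorded
late_complete("b")   # goes through the wrapper, recorded
print(calls)
```

Under this assumption, only the call made through the name imported after init() is recorded, which is consistent with the advice to import SDKs after calling init().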

Decorators are the primary way to instrument your agents. Wrap functions with @observe and behavior decorators to get structured, rich traces with minimal code:

import waxell_observe as waxell

waxell.init()

from openai import AsyncOpenAI

client = AsyncOpenAI()


# vector_store and analysis_service are placeholders for your own clients.
@waxell.retrieval(source="pinecone")
async def search_docs(query: str) -> list[dict]:
    return await vector_store.search(query, top_k=10)


@waxell.decision(name="approach", options=["summarize", "compare", "deep_dive"])
async def choose_approach(query: str) -> dict:
    return {"chosen": "deep_dive", "reasoning": "Query asks for detailed analysis"}


@waxell.tool(tool_type="api")
async def run_analysis(docs: list) -> dict:
    return await analysis_service.analyze(docs)


@waxell.observe(agent_name="research-pipeline")
async def run_pipeline(query: str):
    docs = await search_docs(query)
    approach = await choose_approach(query)
    analysis = await run_analysis(docs)

    # Inline enrichment
    waxell.score("quality", 0.92)
    waxell.tag("domain", "research")

    return {"result": analysis, "approach": approach["chosen"]}

Every decorated function inside @observe is automatically recorded as a structured span. No manual ctx.record_*() calls needed.

Decorator Reference

| Decorator | Purpose | What it captures |
| --- | --- | --- |
| @waxell.observe() | Agent run boundary | Inputs, outputs, policy checks, run lifecycle |
| @waxell.tool() | Tool/function calls | Name, inputs, output, duration, status |
| @waxell.retrieval() | RAG search operations | Query, documents, scores, source |
| @waxell.decision() | Routing/classification | Chosen option, reasoning, confidence |
| @waxell.reasoning_dec() | Chain-of-thought | Thought, evidence, conclusion |
| @waxell.step_dec() | Pipeline steps | Step name and output |
| @waxell.retry_dec() | Retry/fallback logic | Attempt count, strategy, errors |

Convenience Functions

Use these anywhere inside an @observe scope for inline enrichment:

| Function | Purpose |
| --- | --- |
| waxell.score(name, value) | Quality scores (numeric, boolean, categorical) |
| waxell.tag(key, value) | Searchable key-value tags |
| waxell.metadata(key, value) | Arbitrary structured metadata |
| waxell.step(name, output=) | Quick step recording |
| waxell.decide(name, chosen=) | Inline decision recording |
| waxell.retrieve(query=, documents=) | Inline retrieval recording |
| waxell.reason(step=, thought=) | Inline reasoning recording |
| waxell.retry(attempt=, reason=) | Inline retry recording |
| waxell.user_message(content) | Record inbound user message |
| waxell.agent_response(content) | Record outbound agent response |
| waxell.communication(channel=) | Record outbound messages (Slack, email, etc.) |
| waxell.flush() / waxell.flush_sync() | Flush buffered data for long-running agents |
| waxell.diagnose() | Introspect SDK state and configuration |

Advanced: Context Manager

For complex scenarios where decorators don't fit -- multi-step orchestration, batch processing, conditional context creation -- use WaxellContext directly:

from waxell_observe import WaxellContext

# Run this inside an async function; query and run_research_pipeline are your own.
async with WaxellContext(
    agent_name="research-agent",
    session_id="sess_abc123",
    user_id="user_456",
) as ctx:
    result = await run_research_pipeline(query)
    ctx.record_llm_call(model="claude-sonnet-4", tokens_in=500, tokens_out=200)
    ctx.record_step("summarize", output={"summary": result})
    ctx.set_result({"answer": result})

See the Context Manager page for the full API.

LangChain Integration

Drop-in callback handler for any LangChain chain or agent:

from waxell_observe.integrations.langchain import WaxellLangChainHandler

handler = WaxellLangChainHandler(agent_name="langchain-agent")
result = chain.invoke(input, config={"callbacks": [handler]})
handler.flush_sync(result={"output": result})

What You Get

| Feature | Description |
| --- | --- |
| LLM Call Tracking | Model, token counts, cost, prompt/response previews for every LLM call |
| LLM Call Explorer | Browse, filter, and inspect every LLM call with prompt/response viewer |
| Session Tracking | Group related runs by session for conversation-level analytics |
| User Tracking | Per-user cost attribution, usage patterns, and analytics |
| Scoring | Capture quality scores via SDK or UI annotations |
| Annotation Queues | Human review workflows for manual quality assessment |
| Prompt Management | Version-controlled prompts with labels, playground, and SDK retrieval |
| Cost Analytics | Model usage breakdown, per-user costs, custom pricing overrides |
| Policy Enforcement | Pre-execution and mid-execution checks with allow/block/warn/throttle actions |
| Behavior Tracking | Structured spans for tools, retrievals, decisions, reasoning, retries |
| Approval Workflows | Human-in-the-loop approval for policy-blocked actions |
| Conversation Tracking | Auto-captured conversation state, context utilization, message counts |
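
As a rough illustration of what per-user cost attribution with custom pricing overrides involves, the sketch below computes costs from recorded token counts. It is plain Python rather than the Waxell API, and the model name, prices, and record fields are all made-up assumptions for the example.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1M tokens; real prices vary by provider.
DEFAULT_PRICING = {"model-a": {"input": 2.00, "output": 8.00}}


def call_cost(model, tokens_in, tokens_out, overrides=None):
    """Cost of one LLM call; overrides replace default rates per model."""
    pricing = {**DEFAULT_PRICING, **(overrides or {})}
    rates = pricing[model]
    return (tokens_in * rates["input"] + tokens_out * rates["output"]) / 1_000_000


def cost_by_user(calls, overrides=None):
    """Sum call costs per user_id."""
    totals = defaultdict(float)
    for c in calls:
        totals[c["user_id"]] += call_cost(
            c["model"], c["tokens_in"], c["tokens_out"], overrides
        )
    return dict(totals)


# Made-up call records shaped like the tokens_in / tokens_out fields above.
calls = [
    {"user_id": "user_456", "model": "model-a", "tokens_in": 500, "tokens_out": 200},
    {"user_id": "user_456", "model": "model-a", "tokens_in": 1000, "tokens_out": 0},
    {"user_id": "user_789", "model": "model-a", "tokens_in": 0, "tokens_out": 1000},
]

print(cost_by_user(calls))
# A negotiated rate can be applied without touching the defaults:
print(cost_by_user(calls, overrides={"model-a": {"input": 1.00, "output": 1.00}}))
```

Keeping overrides as a separate mapping means the default price table stays untouched, so the same call records can be re-priced under different assumptions.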

Framework Compatibility

Waxell Observe works with any Python agent framework:

  • OpenAI -- auto-instrumentation or decorators
  • Anthropic -- auto-instrumentation or decorators
  • LangChain / LangGraph -- first-class callback handler
  • LiteLLM -- unified API for 100+ providers
  • LlamaIndex -- auto-instrumentation or decorators
  • CrewAI -- auto-instrumentation or decorators
  • Custom frameworks -- decorators or context manager
  • Any Python code -- if it runs Python, you can observe it

Next Steps