Waxell Observe

You already have agents -- add observability in two lines of code.

Waxell Observe is a lightweight Python package that brings LLM call tracking, cost management, and policy enforcement to any AI agent. It works with any Python agent framework -- LangChain, LlamaIndex, CrewAI, custom code, or anything else. No vendor lock-in, no runtime changes, no migration required.

Fastest Path: Auto-Instrumentation

Two lines to automatically trace all LLM calls across 200+ providers:

import waxell_observe as waxell
waxell.init(api_key="wax_sk_...", api_url="https://acme.waxell.dev")

# Import LLM SDKs AFTER init() -- they're now auto-instrumented
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
# Automatically traced with model, tokens, cost, latency
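
The import-order comment above makes sense if instrumentation works by patching SDK functions when init() runs. That is a common technique, but it is an assumption about Waxell's internals here. The sketch below uses a made-up in-memory module (fake_llm_sdk) and a hypothetical instrument() helper, neither of which is part of Waxell, to show why a name imported before patching escapes tracing:

```python
import sys
import time
import types

# Hypothetical stand-in for an LLM SDK, built in memory for illustration.
fake_sdk = types.ModuleType("fake_llm_sdk")


def _complete(prompt: str) -> str:
    return f"echo: {prompt}"


fake_sdk.complete = _complete
sys.modules["fake_llm_sdk"] = fake_sdk

traces: list[dict] = []


def instrument(module: types.ModuleType) -> None:
    """Replace module.complete with a wrapper that records each call."""
    original = module.complete

    def traced(prompt: str) -> str:
        start = time.perf_counter()
        result = original(prompt)
        traces.append({"prompt": prompt, "latency_s": time.perf_counter() - start})
        return result

    module.complete = traced


# Bound before patching: this name keeps pointing at the unpatched function.
from fake_llm_sdk import complete as early_complete

instrument(fake_sdk)

# Bound after patching: this name resolves to the tracing wrapper.
from fake_llm_sdk import complete as late_complete

early_complete("imported before init")  # not recorded
late_complete("imported after init")    # recorded
print([t["prompt"] for t in traces])
```

Only the call made through the late import appears in traces, which is the practical reason to call init() first.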

The Decorator Pattern (Recommended)

Decorators are the primary way to instrument your agents. Wrap functions with @observe and behavior decorators to get structured, rich traces with minimal code:

import waxell_observe as waxell

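waxell.init()  # call before importing LLM SDKs so they are auto-instrumented

from openai import AsyncOpenAI

client = AsyncOpenAI()  # for your own LLM calls; not used directly in this example
# vector_store and analysis_service below are placeholders for your own clients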


@waxell.retrieval(source="pinecone")
async def search_docs(query: str) -> list[dict]:
    return await vector_store.search(query, top_k=10)


@waxell.decision(name="approach", options=["summarize", "compare", "deep_dive"])
async def choose_approach(query: str) -> dict:
    return {"chosen": "deep_dive", "reasoning": "Query asks for detailed analysis"}


@waxell.tool(tool_type="api")
async def run_analysis(docs: list) -> dict:
    return await analysis_service.analyze(docs)


@waxell.observe(agent_name="research-pipeline")
async def run_pipeline(query: str):
    docs = await search_docs(query)
    approach = await choose_approach(query)
    analysis = await run_analysis(docs)

    # Inline enrichment
    waxell.score("quality", 0.92)
    waxell.tag("domain", "research")

    return {"result": analysis, "approach": approach["chosen"]}

Every decorated function called within an @observe scope is automatically recorded as a structured span, so no manual ctx.record_*() calls are needed.
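
To make the scoping idea concrete, here is a minimal stdlib-only sketch of how decorators can attach spans to an enclosing run using contextvars. It is not Waxell's implementation: the observe and tool functions below are simplified stand-ins written for this illustration, and the run and span dictionaries are invented shapes.

```python
import asyncio
import contextvars
import functools
import time

# Holds the run currently being observed, if any (hypothetical design).
_current_run = contextvars.ContextVar("current_run", default=None)
completed_runs = []


def observe(agent_name):
    """Open a run for the duration of the wrapped coroutine."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            run = {"agent": agent_name, "spans": []}
            token = _current_run.set(run)
            try:
                return await fn(*args, **kwargs)
            finally:
                _current_run.reset(token)
                completed_runs.append(run)
        return wrapper
    return decorator


def tool(tool_type):
    """Record a span on the active run; do nothing outside a run."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = await fn(*args, **kwargs)
            run = _current_run.get()
            if run is not None:
                run["spans"].append({
                    "name": fn.__name__,
                    "type": tool_type,
                    "duration_s": time.perf_counter() - start,
                })
            return result
        return wrapper
    return decorator


@tool(tool_type="api")
async def lookup(term):
    return {"term": term, "hits": 3}


@observe(agent_name="demo-agent")
async def pipeline(term):
    return await lookup(term)


result = asyncio.run(pipeline("observability"))
print(completed_runs[0]["spans"][0]["name"])
```

The inner function never receives a context object; it finds the active run through the context variable, which is what lets nested decorated calls record themselves without extra plumbing.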

Decorator Reference

Decorator	Purpose	What it captures
`@waxell.observe()`	Agent run boundary	Inputs, outputs, policy checks, run lifecycle
`@waxell.tool()`	Tool/function calls	Name, inputs, output, duration, status
`@waxell.retrieval()`	RAG search operations	Query, documents, scores, source
`@waxell.decision()`	Routing/classification	Chosen option, reasoning, confidence
`@waxell.reasoning_dec()`	Chain-of-thought	Thought, evidence, conclusion
`@waxell.step_dec()`	Pipeline steps	Step name and output
`@waxell.retry_dec()`	Retry/fallback logic	Attempt count, strategy, errors

Convenience Functions

Use these anywhere inside an @observe scope for inline enrichment:

Function	Purpose
`waxell.score(name, value)`	Quality scores (numeric, boolean, categorical)
`waxell.tag(key, value)`	Searchable key-value tags
`waxell.metadata(key, value)`	Arbitrary structured metadata
`waxell.step(name, output=)`	Quick step recording
`waxell.decide(name, chosen=)`	Inline decision recording
`waxell.retrieve(query=, documents=)`	Inline retrieval recording
`waxell.reason(step=, thought=)`	Inline reasoning recording
`waxell.retry(attempt=, reason=)`	Inline retry recording
`waxell.user_message(content)`	Record inbound user message
`waxell.agent_response(content)`	Record outbound agent response
`waxell.communication(channel=)`	Record outbound messages (Slack, email, etc.)
`waxell.flush()` / `waxell.flush_sync()`	Flush buffered data for long-running agents
`waxell.diagnose()`	Introspect SDK state and configuration
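
The score entry above lists three value kinds. How the SDK distinguishes them is not documented on this page, so the sketch below is only one hypothetical way to classify a value before recording it. The classify_score name is invented for illustration; the one Python-specific detail worth noting is that bool has to be checked before int.

```python
def classify_score(value):
    """Return 'boolean', 'numeric', or 'categorical' for a score value."""
    # bool is a subclass of int, so it has to be tested first.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numeric"
    if isinstance(value, str):
        return "categorical"
    raise TypeError(f"unsupported score type: {type(value).__name__}")


for sample in (0.92, True, "good"):
    print(sample, classify_score(sample))
```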

Advanced: Context Manager

For complex scenarios where decorators don't fit -- multi-step orchestration, batch processing, conditional context creation -- use WaxellContext directly:

from waxell_observe import WaxellContext

async def answer(query: str):
    # run_research_pipeline is a placeholder for your own agent logic
    async with WaxellContext(
        agent_name="research-agent",
        session_id="sess_abc123",
        user_id="user_456",
    ) as ctx:
        result = await run_research_pipeline(query)
        ctx.record_llm_call(model="claude-sonnet-4", tokens_in=500, tokens_out=200)
        ctx.record_step("summarize", output={"summary": result})
        ctx.set_result({"answer": result})

See the Context Manager page for the full API.
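
For intuition on why a context manager suits batch processing and conditional context creation, the stdlib-only sketch below opens one run per non-empty item and finalizes it on exit, whether the body succeeds or raises. RunRecorder and run_context are hypothetical stand-ins written for this example; they are not the WaxellContext API.

```python
import asyncio
from contextlib import asynccontextmanager

finished = []


class RunRecorder:
    """Hypothetical stand-in for a run context; not the WaxellContext API."""

    def __init__(self, agent_name):
        self.agent_name = agent_name
        self.steps = []
        self.result = None
        self.error = None

    def record_step(self, name, output):
        self.steps.append((name, output))

    def set_result(self, result):
        self.result = result


@asynccontextmanager
async def run_context(agent_name):
    recorder = RunRecorder(agent_name)
    try:
        yield recorder
    except Exception as exc:
        recorder.error = repr(exc)  # keep the failure on the run, then re-raise
        raise
    finally:
        finished.append(recorder)  # finalized exactly once, success or failure


async def process_batch(items):
    for item in items:
        if not item:  # conditional context creation: skip empty items
            continue
        async with run_context("batch-agent") as ctx:
            ctx.record_step("upper", output=item.upper())
            ctx.set_result({"answer": item.upper()})


asyncio.run(process_batch(["a", "", "b"]))
print([r.result for r in finished])
```

A decorator would tie the run boundary to a function call; here the boundary is chosen per loop iteration, which is the flexibility the context manager form offers.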

LangChain Integration

Drop-in callback handler for any LangChain chain or agent:

from waxell_observe.integrations.langchain import WaxellLangChainHandler

handler = WaxellLangChainHandler(agent_name="langchain-agent")
# chain is any LangChain runnable you have already built; input is its input payload
result = chain.invoke(input, config={"callbacks": [handler]})
handler.flush_sync(result={"output": result})

What You Get

Feature	Description
LLM Call Tracking	Model, token counts, cost, prompt/response previews for every LLM call
LLM Call Explorer	Browse, filter, and inspect every LLM call with prompt/response viewer
Session Tracking	Group related runs by session for conversation-level analytics
User Tracking	Per-user cost attribution, usage patterns, and analytics
Scoring	Capture quality scores via SDK or UI annotations
Annotation Queues	Human review workflows for manual quality assessment
Prompt Management	Version-controlled prompts with labels, playground, and SDK retrieval
Cost Analytics	Model usage breakdown, per-user costs, custom pricing overrides
Policy Enforcement	Pre-execution and mid-execution checks with allow/block/warn/throttle actions
Behavior Tracking	Structured spans for tools, retrievals, decisions, reasoning, retries
Approval Workflows	Human-in-the-loop approval for policy-blocked actions
Conversation Tracking	Auto-captured conversation state, context utilization, message counts
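
As a rough illustration of the Cost Analytics row, the sketch below shows the arithmetic behind per-user cost attribution with a pricing override. The model name, the prices, and the call_cost helper are all hypothetical; this is not how Waxell stores or computes prices, only the general shape of the calculation.

```python
from collections import defaultdict

# Hypothetical prices in USD per one million tokens; not real rates.
DEFAULT_PRICING = {"example-model": {"input": 3.00, "output": 15.00}}


def call_cost(model, tokens_in, tokens_out, overrides=None):
    """Cost of one call; an override for the model replaces the default price."""
    pricing = {**DEFAULT_PRICING, **(overrides or {})}[model]
    return (tokens_in * pricing["input"] + tokens_out * pricing["output"]) / 1_000_000


calls = [
    {"user_id": "user_456", "model": "example-model", "tokens_in": 500, "tokens_out": 200},
    {"user_id": "user_789", "model": "example-model", "tokens_in": 1000, "tokens_out": 400},
]

# Sum the cost of each call under the user who triggered it.
per_user = defaultdict(float)
for call in calls:
    per_user[call["user_id"]] += call_cost(
        call["model"], call["tokens_in"], call["tokens_out"]
    )

print(dict(per_user))
```

Passing overrides={"example-model": {"input": 1.0, "output": 2.0}} to call_cost would reprice that model for a single calculation without touching the defaults.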

Framework Compatibility

Waxell Observe works with any Python agent framework:

OpenAI -- auto-instrumentation or decorators
Anthropic -- auto-instrumentation or decorators
LangChain / LangGraph -- first-class callback handler
LiteLLM -- unified API for 100+ providers
LlamaIndex -- auto-instrumentation or decorators
CrewAI -- auto-instrumentation or decorators
Custom frameworks -- decorators or context manager
Any Python code -- if it runs Python, you can observe it

Next Steps

Quickstart -- Get up and running in 5 minutes
Decorator Pattern -- Full @observe reference with all parameters
Auto-Instrumentation -- Zero-code tracing for 200+ libraries
Behavior Tracking -- Deep dive into tools, retrievals, decisions, reasoning
Claude Skills -- Let your coding agent instrument and govern your agents for you
Examples on GitHub -- Complete runnable agents for every provider and pattern
FAQ -- Answers to common questions
