
Auto-Instrumentation

The simplest way to add observability to your AI agents -- two lines of code and all your LLM calls are automatically traced.

Quick Start

import waxell_observe
waxell_observe.init(api_key="wax_sk_...", api_url="https://waxell.dev")

# Import LLM SDKs AFTER init() -- they're now auto-instrumented
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
# Automatically traced with model, tokens, cost, latency

How It Works

When you call waxell_observe.init(), the SDK:

  1. Detects installed LLM libraries (OpenAI, Anthropic, etc.)
  2. Patches their HTTP clients to capture request/response data
  3. Emits OpenTelemetry spans for each LLM call
  4. Records token counts, costs, and latencies automatically
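The patching step can be pictured with a toy sketch. `FakeHTTPClient`, `patch_llm_client`, and the span dict below are illustrative stand-ins for how a wrapper might capture this data, not the SDK's actual internals:

```python
import time
from functools import wraps

def patch_llm_client(client_cls, spans):
    """Wrap a client's `create` method to record one span per call (illustrative)."""
    original = client_cls.create

    @wraps(original)
    def wrapper(self, **kwargs):
        start = time.perf_counter()
        response = original(self, **kwargs)
        spans.append({
            "model": kwargs.get("model"),
            "tokens_in": response["usage"]["prompt_tokens"],
            "tokens_out": response["usage"]["completion_tokens"],
            "latency_s": time.perf_counter() - start,
        })
        return response

    client_cls.create = wrapper

class FakeHTTPClient:
    """Stand-in for a provider SDK's HTTP client."""
    def create(self, **kwargs):
        return {"usage": {"prompt_tokens": 5, "completion_tokens": 7}}

spans = []
patch_llm_client(FakeHTTPClient, spans)
FakeHTTPClient().create(model="gpt-4o")
print(spans[0]["model"])  # gpt-4o
```

The real SDK emits OpenTelemetry spans rather than appending to a list, but the wrap-and-measure shape is the same.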

LLM calls made outside any @observe decorator or WaxellContext are automatically buffered by a background collector and flushed to auto-generated runs (named auto:{model}). This means you get visibility into every LLM call without any additional code beyond init().
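The grouping behavior can be sketched in a few lines; `group_into_auto_runs` is a hypothetical helper for illustration, not part of the SDK:

```python
from collections import defaultdict

def group_into_auto_runs(buffered_calls):
    """Group unattributed LLM calls into auto-generated runs, one per model
    (a sketch of the documented behavior, not the SDK's internal collector)."""
    runs = defaultdict(list)
    for call in buffered_calls:
        runs[f"auto:{call['model']}"].append(call)
    return dict(runs)

calls = [
    {"model": "gpt-4o", "tokens_out": 12},
    {"model": "gpt-4o-mini", "tokens_out": 3},
    {"model": "gpt-4o", "tokens_out": 8},
]
print(sorted(group_into_auto_runs(calls)))  # ['auto:gpt-4o', 'auto:gpt-4o-mini']
```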

The init() Function

waxell_observe.init(
    api_key: str = "",                          # Waxell API key (wax_sk_...)
    api_url: str = "",                          # Waxell API URL
    capture_content: bool = False,              # Include prompt/response in traces
    instrument: list[str] | None = None,        # AI/ML library list (auto-detect if None)
    instrument_infra: bool = True,              # Auto-instrument infra (HTTP, DB, cache)
    infra_libraries: list[str] | None = None,   # Only these infra libs (None = all)
    infra_exclude: list[str] | None = None,     # Exclude these infra libs
    resource_attributes: dict | None = None,    # Custom OTel resource attributes
    debug: bool = False,                        # Enable debug logging
    prompt_guard: bool = False,                 # Enable client-side prompt guard
    prompt_guard_server: bool = False,          # Also check server-side guard (ML-powered)
    prompt_guard_action: str = "block",         # "block", "warn", or "redact"
)

See Installation & Configuration for full parameter details.

Configuration Priority

  1. Explicit arguments to init()
  2. Environment variables (WAXELL_API_KEY, WAXELL_API_URL)
  3. CLI config file (~/.waxell/config)
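The priority order can be sketched as a small resolver; `resolve_config` is an illustrative helper, and the SDK's actual resolution logic may differ in detail:

```python
import os

def resolve_config(explicit: str, env_var: str, config_file: dict, key: str) -> str:
    """Resolve one setting by the documented priority:
    explicit init() argument > environment variable > CLI config file."""
    if explicit:
        return explicit
    if os.environ.get(env_var):
        return os.environ[env_var]
    return config_file.get(key, "")

config_file = {"api_key": "wax_sk_from_file"}
os.environ["WAXELL_API_KEY"] = "wax_sk_from_env"

# Env var beats the config file; an explicit argument beats both.
print(resolve_config("", "WAXELL_API_KEY", config_file, "api_key"))                 # wax_sk_from_env
print(resolve_config("wax_sk_explicit", "WAXELL_API_KEY", config_file, "api_key"))  # wax_sk_explicit
```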

Environment Variables

export WAXELL_API_KEY="wax_sk_..."
export WAXELL_API_URL="https://waxell.dev"
export WAXELL_CAPTURE_CONTENT="true" # Include prompts in traces
export WAXELL_DEBUG="true" # Debug logging

Supported Libraries

LLM Providers

| Library | Key | Notes |
| --- | --- | --- |
| OpenAI | openai | Chat, completions, embeddings |
| Anthropic | anthropic | Messages API |
| Google Gemini | gemini | Gemini API |
| AWS Bedrock | bedrock | Bedrock runtime |
| Mistral AI | mistral | Chat, embeddings |
| Cohere | cohere | Chat, embed, rerank |
| Groq | groq | Fast inference |
| LiteLLM | litellm | Unified multi-provider API |
| Ollama | ollama | Local model serving |
| Together AI | together | Together inference API |
| Vertex AI | vertex_ai | Google Cloud AI |
| HuggingFace | huggingface | Inference API |

Vector Databases

| Library | Key | Notes |
| --- | --- | --- |
| Pinecone | pinecone | Managed vector DB |
| ChromaDB | chroma | Embedded vector DB |
| Weaviate | weaviate | Vector search engine |
| Qdrant | qdrant | Vector similarity search |
| Milvus | milvus | Distributed vector DB |
| pgvector | pgvector | PostgreSQL vector extension |
| FAISS | faiss | Facebook AI similarity search |
| LanceDB | lancedb | Serverless vector DB |

Agent Frameworks

| Library | Key | Notes |
| --- | --- | --- |
| LangChain | langchain | Chain and agent orchestration |
| CrewAI | crewai | Multi-agent collaboration |
| OpenAI Agents SDK | openai_agents | OpenAI agent framework |
| AutoGen | autogen | Multi-agent conversations |
| LlamaIndex | llamaindex | Data framework for LLMs |
| Haystack | haystack | NLP pipeline framework |
| PydanticAI | pydanticai | Type-safe AI agents |
| DSPy | dspy | Programming with foundation models |
| Google ADK | google_adk | Google Agent Development Kit |
| Claude Agent SDK | claude_agents | Anthropic agent framework |

Safety & Guardrails

| Library | Key | Notes |
| --- | --- | --- |
| Guardrails AI | guardrails_ai | Output validation |
| NeMo Guardrails | nemo_guardrails | Programmable guardrails |
| LLM Guard | llm_guard | Input/output scanning |
Info: The tables above highlight the most commonly used libraries. The SDK supports 100+ libraries in total across additional categories including embeddings/rerankers, evaluation frameworks, voice/speech, RAG frameworks, local inference engines, and more. The full registry is defined in the instrumentor source.
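Auto-detection can be pictured as a registry of keys mapped to importable module names, checked with `importlib.util.find_spec`. The `REGISTRY` slice and `detect_installed` helper below are illustrative, not the SDK's actual registry:

```python
import importlib.util

# A tiny slice of what a library registry might look like:
# instrumentation key -> importable module name (illustrative).
REGISTRY = {
    "openai": "openai",
    "anthropic": "anthropic",
    "chroma": "chromadb",
    "pgvector": "pgvector",
}

def detect_installed(registry=REGISTRY):
    """Return the registry keys whose underlying module is importable."""
    return [key for key, module in registry.items()
            if importlib.util.find_spec(module) is not None]

# Demo with stdlib modules so the result is predictable:
print(detect_installed({"json": "json", "sqlite": "sqlite3",
                        "missing": "not_a_real_module_xyz"}))
# ['json', 'sqlite']
```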

Selective Instrumentation

To instrument only specific libraries:

waxell_observe.init(
    api_key="wax_sk_...",
    api_url="https://waxell.dev",
    instrument=["openai", "anthropic"],  # Only these two
)

Drop-in Imports

Alternative to init() -- import pre-instrumented modules:

# Instead of: from openai import OpenAI
from waxell_observe.openai import openai

client = openai.OpenAI()
response = client.chat.completions.create(...)  # Auto-traced

# Instead of: import anthropic
from waxell_observe.anthropic import anthropic

client = anthropic.Anthropic()
response = client.messages.create(...) # Auto-traced

Import Order Matters

Auto-instrumentation patches LLM SDKs when they're imported. You must call init() before importing the SDK:

# CORRECT
import waxell_observe
waxell_observe.init(api_key="...")

from openai import OpenAI  # Patched!

# WRONG - OpenAI already imported, won't be patched
from openai import OpenAI

import waxell_observe
waxell_observe.init(api_key="...") # Too late!

Adding Structure to Auto-Instrumented Calls

Auto-instrumentation captures LLM calls automatically. Add structure with decorators or context managers to group calls into runs, record behaviors, and enrich traces.

The simplest way to add structure -- decorators handle run tracking and behavior recording while init() handles LLM capture:

import waxell_observe as waxell

# Auto-instrument LLM SDKs
waxell.init(api_key="wax_sk_...", api_url="https://waxell.dev")

from openai import AsyncOpenAI

client = AsyncOpenAI()

@waxell.decision(name="classify_intent", options=["question", "action", "chitchat"])
async def classify(query: str) -> dict:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Classify: {query}"}],
    )
    # LLM call auto-captured; return value recorded as decision
    return {"chosen": "question", "reasoning": response.choices[0].message.content}

@waxell.observe(agent_name="support-bot")
async def handle_query(query: str) -> str:
    # Auto-instrumented LLM calls + decorator-recorded behaviors
    classification = await classify(query)

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    )
    answer = response.choices[0].message.content

    # Enrich with scores and tags
    waxell.score("helpfulness", 0.9)
    waxell.tag("intent", classification["chosen"])
    return answer

Context Manager + Auto-Instrumentation (Alternative)

Use the context manager when you need maximum control over recording -- for example, multiple policy checks or explicit run lifecycle management:

import waxell_observe
waxell_observe.init(api_key="wax_sk_...")

from waxell_observe import WaxellContext
from openai import OpenAI

client = OpenAI()

async def main() -> None:
    async with WaxellContext(agent_name="my-agent") as ctx:
        # LLM calls are auto-traced AND linked to this context
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello!"}]
        )

        # Explicit context methods for fine-grained control
        ctx.set_tag("user_type", "premium")
        ctx.record_step("process", output={"status": "complete"})
        await ctx.check_policy()  # Mid-execution policy check

Kill Switch

Disable all observability without changing code:

export WAXELL_OBSERVE="false"  # or "0" or "no"

When disabled:

  • init() becomes a no-op
  • Context managers pass through without recording
  • Decorators execute functions without wrapping
  • No network calls to Waxell servers
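The check itself is simple to sketch; `observability_enabled` is a hypothetical helper showing the assumed semantics (default on, disabled only by an explicit falsy value):

```python
import os

_FALSY = {"false", "0", "no"}

def observability_enabled(env=os.environ) -> bool:
    """Kill-switch check sketch: disabled only when WAXELL_OBSERVE is
    explicitly set to "false", "0", or "no" (case-insensitive)."""
    return env.get("WAXELL_OBSERVE", "true").strip().lower() not in _FALSY

print(observability_enabled({}))                         # True (default: enabled)
print(observability_enabled({"WAXELL_OBSERVE": "no"}))   # False
print(observability_enabled({"WAXELL_OBSERVE": "yes"}))  # True
```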

Shutdown

Gracefully flush pending traces before exit:

waxell_observe.shutdown()

This is called automatically on process exit, but explicit shutdown ensures all data is flushed in:

  • Serverless functions (Lambda, Cloud Functions)
  • Short-lived scripts
  • Test suites
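The need for explicit shutdown in short-lived processes can be sketched with a toy buffer; `TraceBuffer` is illustrative, not the SDK's exporter:

```python
import atexit

class TraceBuffer:
    """Sketch of a flush-on-exit trace exporter (illustrative)."""
    def __init__(self):
        self.pending = []
        self.exported = []
        atexit.register(self.flush)   # automatic flush at normal process exit

    def record(self, span):
        self.pending.append(span)

    def flush(self):
        # Explicit flush matters in serverless / short-lived processes,
        # where the runtime may freeze or kill the process before
        # atexit handlers get a chance to run.
        self.exported.extend(self.pending)
        self.pending.clear()

buf = TraceBuffer()
buf.record({"model": "gpt-4o"})
buf.flush()                           # what waxell_observe.shutdown() guarantees
print(len(buf.exported))  # 1
```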

Programmatic Control

Manual Instrument/Uninstrument

from waxell_observe.instrumentors import instrument_all, uninstrument_all

# Instrument all detected libraries
results = instrument_all()
# {"openai": True, "anthropic": True, ...}

# Restore original behavior
uninstrument_all()

What Gets Captured

For each LLM call, auto-instrumentation records:

| Field | Description |
| --- | --- |
| model | Model name (gpt-4o, claude-sonnet-4, etc.) |
| tokens_in | Input/prompt token count |
| tokens_out | Output/completion token count |
| cost | Estimated USD cost |
| latency | Request duration |
| provider | openai, anthropic, etc. |
| prompt_preview | First 500 chars of prompt (if capture_content=True) |
| response_preview | First 500 chars of response (if capture_content=True) |
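Cost estimation is typically tokens multiplied by a per-token unit price. The sketch below uses hypothetical per-million-token prices for illustration only; real prices vary by provider and change over time:

```python
# Hypothetical (input, output) prices in USD per million tokens -- NOT real pricing.
PRICES_PER_M = {"gpt-4o": (2.50, 10.00)}

def estimate_cost(model, tokens_in, tokens_out, prices=PRICES_PER_M):
    """Estimate USD cost the way an instrumentor might: tokens x unit price."""
    price_in, price_out = prices[model]
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

print(round(estimate_cost("gpt-4o", 1000, 500), 6))  # 0.0075
```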

Conversation Tracking (Automatic)

When auto-instrumentation is active, waxell automatically captures:

  • User messages — extracted from the messages array sent to the LLM
  • Agent responses — the final text response (not tool-calling intermediaries)
  • Context window metrics — message count, turn count, token utilization
  • System prompt tracking — detects system prompt changes across calls

This works across all 13+ supported providers with zero code changes. User messages appear as io:user_message spans and agent responses as io:agent_response spans in the trace timeline.
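Extracting the user turn from a chat request can be sketched as below; `extract_user_message` is a hypothetical helper, and the SDK's real extraction handles more shapes (tool calls, multimodal content, etc.):

```python
def extract_user_message(messages):
    """Pull the latest user turn from a chat `messages` array (sketch)."""
    for msg in reversed(messages):
        if msg.get("role") == "user":
            return msg.get("content")
    return None

messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "What's my order status?"},
    {"role": "assistant", "content": "Checking..."},
    {"role": "user", "content": "Order #123"},
]
print(extract_user_message(messages))  # Order #123
```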

Deduplication

If you also call waxell.user_message() or waxell.agent_response() manually for the same content that was auto-captured, the duplicate is automatically suppressed.
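Content-based suppression can be sketched as below. `ConversationRecorder` and its seen-set are an assumed mechanism for illustration, not the SDK's actual deduplication logic:

```python
class ConversationRecorder:
    """Sketch of dedup: a manual user_message() call is suppressed when
    identical content was already captured (assumed mechanism)."""
    def __init__(self):
        self.spans = []
        self._seen = set()

    def user_message(self, content: str) -> bool:
        key = ("user", content)
        if key in self._seen:
            return False              # duplicate -> suppressed
        self._seen.add(key)
        self.spans.append({"type": "io:user_message", "content": content})
        return True

rec = ConversationRecorder()
rec.user_message("Hello!")            # auto-captured
rec.user_message("Hello!")            # manual duplicate -> suppressed
print(len(rec.spans))  # 1
```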

See Conversation Tracking for full details.

Next Steps