# Auto-Instrumentation

The simplest way to add observability to your AI agents -- two lines of code and all your LLM calls are automatically traced.
## Quick Start

```python
import waxell_observe

waxell_observe.init(api_key="wax_sk_...", api_url="https://waxell.dev")

# Import LLM SDKs AFTER init() -- they're now auto-instrumented
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced with model, tokens, cost, latency
```
## How It Works

When you call `waxell_observe.init()`, the SDK:
- Detects installed LLM libraries (OpenAI, Anthropic, etc.)
- Patches their HTTP clients to capture request/response data
- Emits OpenTelemetry spans for each LLM call
- Records token counts, costs, and latencies automatically
LLM calls made outside any @observe decorator or WaxellContext are automatically buffered by a background collector and flushed to auto-generated runs (named auto:{model}). This means you get visibility into every LLM call without any additional code beyond init().
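Conceptually, the patching step resembles ordinary monkey-patching: wrap the client method, time the call, and emit a record. This is an illustrative sketch with a toy `FakeCompletions` class and `patch_create` helper, not the real `waxell_observe` internals:

```python
import time
import functools

def patch_create(client_cls, emit):
    """Illustrative sketch: wrap a client's create() so each call
    emits a span-like record, and return an un-patch hook."""
    original = client_cls.create

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        response = original(self, *args, **kwargs)
        emit({
            "model": kwargs.get("model"),
            "latency_s": time.perf_counter() - start,
        })
        return response

    client_cls.create = wrapper
    return lambda: setattr(client_cls, "create", original)

# Toy stand-in for an LLM client class
class FakeCompletions:
    def create(self, **kwargs):
        return {"choices": ["ok"]}

spans = []
unpatch = patch_create(FakeCompletions, spans.append)
FakeCompletions().create(model="gpt-4o", messages=[])
unpatch()  # restore the original method
```

The real SDK applies the same idea at the HTTP-client layer, which is how it captures token counts and costs from the raw responses.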
## The `init()` Function

```python
waxell_observe.init(
    api_key: str = "",                         # Waxell API key (wax_sk_...)
    api_url: str = "",                         # Waxell API URL
    capture_content: bool = False,             # Include prompt/response in traces
    instrument: list[str] | None = None,       # AI/ML library list (auto-detect if None)
    instrument_infra: bool = True,             # Auto-instrument infra (HTTP, DB, cache)
    infra_libraries: list[str] | None = None,  # Only these infra libs (None = all)
    infra_exclude: list[str] | None = None,    # Exclude these infra libs
    resource_attributes: dict | None = None,   # Custom OTel resource attributes
    debug: bool = False,                       # Enable debug logging
    prompt_guard: bool = False,                # Enable client-side prompt guard
    prompt_guard_server: bool = False,         # Also check server-side guard (ML-powered)
    prompt_guard_action: str = "block",        # "block", "warn", or "redact"
)
```
See Installation & Configuration for full parameter details.
## Configuration Priority

1. Explicit arguments to `init()`
2. Environment variables (`WAXELL_API_KEY`, `WAXELL_API_URL`)
3. CLI config file (`~/.waxell/config`)
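The priority order above can be sketched as a simple resolution function. The `resolve_setting` helper below is hypothetical, shown only to make the precedence concrete:

```python
import os

def resolve_setting(name, explicit=None, env_var=None, config_file_values=None):
    """Illustrative resolution order (helper is hypothetical):
    explicit argument > environment variable > config-file value."""
    if explicit:
        return explicit
    if env_var and os.environ.get(env_var):
        return os.environ[env_var]
    return (config_file_values or {}).get(name, "")

os.environ["WAXELL_API_KEY"] = "wax_sk_from_env"
file_cfg = {"api_key": "wax_sk_from_file"}

# Env var beats the config file when no explicit argument is passed
assert resolve_setting("api_key", env_var="WAXELL_API_KEY",
                       config_file_values=file_cfg) == "wax_sk_from_env"
# An explicit argument beats everything
assert resolve_setting("api_key", explicit="wax_sk_explicit",
                       env_var="WAXELL_API_KEY") == "wax_sk_explicit"
```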
## Environment Variables

```bash
export WAXELL_API_KEY="wax_sk_..."
export WAXELL_API_URL="https://waxell.dev"
export WAXELL_CAPTURE_CONTENT="true"   # Include prompts in traces
export WAXELL_DEBUG="true"             # Debug logging
```
## Supported Libraries

### LLM Providers
| Library | Key | Notes |
|---|---|---|
| OpenAI | openai | Chat, completions, embeddings |
| Anthropic | anthropic | Messages API |
| Google Gemini | gemini | Gemini API |
| AWS Bedrock | bedrock | Bedrock runtime |
| Mistral AI | mistral | Chat, embeddings |
| Cohere | cohere | Chat, embed, rerank |
| Groq | groq | Fast inference |
| LiteLLM | litellm | Unified multi-provider API |
| Ollama | ollama | Local model serving |
| Together AI | together | Together inference API |
| Vertex AI | vertex_ai | Google Cloud AI |
| HuggingFace | huggingface | Inference API |
### Vector Databases
| Library | Key | Notes |
|---|---|---|
| Pinecone | pinecone | Managed vector DB |
| ChromaDB | chroma | Embedded vector DB |
| Weaviate | weaviate | Vector search engine |
| Qdrant | qdrant | Vector similarity search |
| Milvus | milvus | Distributed vector DB |
| pgvector | pgvector | PostgreSQL vector extension |
| FAISS | faiss | Facebook AI similarity search |
| LanceDB | lancedb | Serverless vector DB |
### Agent Frameworks
| Library | Key | Notes |
|---|---|---|
| LangChain | langchain | Chain and agent orchestration |
| CrewAI | crewai | Multi-agent collaboration |
| OpenAI Agents SDK | openai_agents | OpenAI agent framework |
| AutoGen | autogen | Multi-agent conversations |
| LlamaIndex | llamaindex | Data framework for LLMs |
| Haystack | haystack | NLP pipeline framework |
| PydanticAI | pydanticai | Type-safe AI agents |
| DSPy | dspy | Programming with foundation models |
| Google ADK | google_adk | Google Agent Development Kit |
| Claude Agent SDK | claude_agents | Anthropic agent framework |
### Safety & Guardrails
| Library | Key | Notes |
|---|---|---|
| Guardrails AI | guardrails_ai | Output validation |
| NeMo Guardrails | nemo_guardrails | Programmable guardrails |
| LLM Guard | llm_guard | Input/output scanning |
The tables above highlight the most commonly used libraries. The SDK supports 100+ libraries in total across additional categories including embeddings/rerankers, evaluation frameworks, voice/speech, RAG frameworks, local inference engines, and more. The full registry is defined in the instrumentor source.
## Selective Instrumentation

To instrument only specific libraries:

```python
waxell_observe.init(
    api_key="wax_sk_...",
    api_url="https://waxell.dev",
    instrument=["openai", "anthropic"],  # Only these two
)
```
## Drop-in Imports

An alternative to `init()` -- import pre-instrumented modules:

```python
# Instead of: from openai import OpenAI
from waxell_observe.openai import openai

client = openai.OpenAI()
response = client.chat.completions.create(...)  # Auto-traced

# Instead of: import anthropic
from waxell_observe.anthropic import anthropic

client = anthropic.Anthropic()
response = client.messages.create(...)  # Auto-traced
```
## Import Order Matters

Auto-instrumentation patches LLM SDKs when they're imported. You must call `init()` before importing the SDK:

```python
# CORRECT
import waxell_observe
waxell_observe.init(api_key="...")
from openai import OpenAI  # Patched!

# WRONG - OpenAI already imported, won't be patched
from openai import OpenAI
import waxell_observe
waxell_observe.init(api_key="...")  # Too late!
```
## Adding Structure to Auto-Instrumented Calls
Auto-instrumentation captures LLM calls automatically. Add structure with decorators or context managers to group calls into runs, record behaviors, and enrich traces.
### Decorators + Auto-Instrumentation (Recommended)

The simplest way to add structure -- decorators handle run tracking and behavior recording while `init()` handles LLM capture:

```python
import waxell_observe as waxell

# Auto-instrument LLM SDKs
waxell.init(api_key="wax_sk_...", api_url="https://waxell.dev")

from openai import AsyncOpenAI

client = AsyncOpenAI()

@waxell.decision(name="classify_intent", options=["question", "action", "chitchat"])
async def classify(query: str) -> dict:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Classify: {query}"}],
    )
    # LLM call auto-captured; return value recorded as decision
    return {"chosen": "question", "reasoning": response.choices[0].message.content}

@waxell.observe(agent_name="support-bot")
async def handle_query(query: str) -> str:
    # Auto-instrumented LLM calls + decorator-recorded behaviors
    classification = await classify(query)
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    )
    answer = response.choices[0].message.content

    # Enrich with scores and tags
    waxell.score("helpfulness", 0.9)
    waxell.tag("intent", classification["chosen"])
    return answer
```
### Context Manager + Auto-Instrumentation (Alternative)

Use the context manager when you need maximum control over recording -- for example, multiple policy checks or explicit run lifecycle management:

```python
import waxell_observe

waxell_observe.init(api_key="wax_sk_...")

from waxell_observe import WaxellContext
from openai import OpenAI

client = OpenAI()

# Note: `async with` must run inside an async function
async with WaxellContext(agent_name="my-agent") as ctx:
    # LLM calls are auto-traced AND linked to this context
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )

    # Explicit context methods for fine-grained control
    ctx.set_tag("user_type", "premium")
    ctx.record_step("process", output={"status": "complete"})
    await ctx.check_policy()  # Mid-execution policy check
```
## Kill Switch

Disable all observability without changing code:

```bash
export WAXELL_OBSERVE="false"  # or "0" or "no"
```
When disabled:

- `init()` becomes a no-op
- Context managers pass through without recording
- Decorators execute functions without wrapping
- No network calls to Waxell servers
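The kill-switch check amounts to parsing one environment variable. A minimal sketch, assuming the disabling values are exactly the three listed above (the SDK's actual parsing may accept more):

```python
import os

FALSY = {"false", "0", "no"}

def observability_enabled(env=None):
    """Illustrative check: WAXELL_OBSERVE set to "false", "0", or "no"
    (case-insensitive) disables observability; anything else enables it."""
    env = os.environ if env is None else env
    return env.get("WAXELL_OBSERVE", "true").strip().lower() not in FALSY

assert observability_enabled({}) is True                           # default: enabled
assert observability_enabled({"WAXELL_OBSERVE": "false"}) is False
assert observability_enabled({"WAXELL_OBSERVE": "0"}) is False
assert observability_enabled({"WAXELL_OBSERVE": "yes"}) is True
```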
## Shutdown

Gracefully flush pending traces before exit:

```python
waxell_observe.shutdown()
```
This is called automatically on process exit, but explicit shutdown ensures all data is flushed in:
- Serverless functions (Lambda, Cloud Functions)
- Short-lived scripts
- Test suites
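The reason explicit shutdown matters in these environments: a batching exporter holds spans in memory until flushed, and atexit hooks may never fire when a Lambda sandbox is frozen or a process is killed. A stdlib-only sketch with a toy `BufferedExporter` (not the SDK's real exporter):

```python
import atexit

class BufferedExporter:
    """Toy stand-in for a span exporter that batches before sending."""
    def __init__(self):
        self.buffer = []
        self.sent = []

    def record(self, span):
        self.buffer.append(span)

    def flush(self):
        self.sent.extend(self.buffer)
        self.buffer.clear()

exporter = BufferedExporter()
atexit.register(exporter.flush)  # the normal exit path

exporter.record({"name": "llm_call"})
# In a frozen Lambda sandbox or a killed short-lived script, atexit may
# never run -- flushing explicitly guarantees delivery:
exporter.flush()
```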
## Programmatic Control

### Manual Instrument/Uninstrument

```python
from waxell_observe.instrumentors import instrument_all, uninstrument_all

# Instrument all detected libraries
results = instrument_all()
# {"openai": True, "anthropic": True, ...}

# Restore original behavior
uninstrument_all()
```
## What Gets Captured

For each LLM call, auto-instrumentation records:

| Field | Description |
|---|---|
| `model` | Model name (gpt-4o, claude-sonnet-4, etc.) |
| `tokens_in` | Input/prompt token count |
| `tokens_out` | Output/completion token count |
| `cost` | Estimated USD cost |
| `latency` | Request duration |
| `provider` | openai, anthropic, etc. |
| `prompt_preview` | First 500 chars of prompt (if `capture_content=True`) |
| `response_preview` | First 500 chars of response (if `capture_content=True`) |
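A cost estimate like the `cost` field above is typically derived from the token counts and a per-model price table. The prices below are placeholders, not the SDK's actual pricing registry:

```python
# Hypothetical per-million-token prices -- real pricing changes over time;
# these numbers are illustrative only.
PRICES_PER_MTOK = {
    "gpt-4o": {"in": 2.50, "out": 10.00},
}

def estimate_cost_usd(model, tokens_in, tokens_out):
    """Sketch of deriving an estimated USD cost from token counts."""
    p = PRICES_PER_MTOK.get(model)
    if p is None:
        return None  # unknown model: no estimate
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

cost = estimate_cost_usd("gpt-4o", tokens_in=1000, tokens_out=500)
# 1000 * $2.50/MTok + 500 * $10.00/MTok = $0.0075
```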
## Conversation Tracking (Automatic)
When auto-instrumentation is active, waxell automatically captures:
- User messages — extracted from the messages array sent to the LLM
- Agent responses — the final text response (not tool-calling intermediaries)
- Context window metrics — message count, turn count, token utilization
- System prompt tracking — detects system prompt changes across calls
This works across all 13+ supported providers with zero code changes. User messages appear as io:user_message spans and agent responses as io:agent_response spans in the trace timeline.
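Extraction from an OpenAI-style messages array might look like the sketch below. The `extract_conversation` helper is illustrative, not the SDK's exact logic:

```python
def extract_conversation(messages):
    """Illustrative extraction: pull the system prompt, the latest user
    message, and a turn count from an OpenAI-style messages array."""
    system = next((m["content"] for m in messages if m["role"] == "system"), None)
    last_user = next((m["content"] for m in reversed(messages)
                      if m["role"] == "user"), None)
    turns = sum(1 for m in messages if m["role"] in ("user", "assistant"))
    return {"system_prompt": system, "user_message": last_user, "turn_count": turns}

msgs = [
    {"role": "system", "content": "You are a support bot."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Reset my password"},
]
info = extract_conversation(msgs)
```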
### Deduplication
If you also call waxell.user_message() or waxell.agent_response() manually for the same content that was auto-captured, the duplicate is automatically suppressed.
See Conversation Tracking for full details.
## Next Steps
- OpenAI Integration -- Detailed OpenAI patterns
- Anthropic Integration -- Anthropic-specific setup
- Context Manager -- Fine-grained control
- Decorator Pattern -- Function-level tracing
- Conversation Tracking -- Auto-captured conversation data