Common Mistakes
1. Importing LLM SDKs before init()
Wrong:
from openai import OpenAI # Gets un-patched OpenAI
import waxell_observe as waxell
waxell.init() # Too late -- OpenAI already imported
client = OpenAI()
client.chat.completions.create(...) # NOT auto-instrumented
Right:
import waxell_observe as waxell
waxell.init() # Patches OpenAI module
from openai import OpenAI # Gets patched version
client = OpenAI()
client.chat.completions.create(...) # Auto-instrumented
Alternative -- use drop-in imports (order doesn't matter):
from waxell_observe.openai import openai
client = openai.OpenAI()
client.chat.completions.create(...) # Always auto-instrumented
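The ordering rule follows from how Python binds names: `from module import Name` stores a reference to whatever object the module holds at that moment, so a patch that later replaces the module attribute is invisible through the earlier name. The sketch below reproduces that effect with a throwaway module. `fake_sdk`, `Client`, and `PatchedClient` are hypothetical stand-ins, and swapping a module attribute is only an assumed patching strategy for illustration, not a description of how waxell patches OpenAI.

```python
import sys
import types

# Throwaway module standing in for an LLM SDK (hypothetical name).
fake_sdk = types.ModuleType("fake_sdk")


class Client:
    def create(self) -> str:
        return "plain"


fake_sdk.Client = Client
sys.modules["fake_sdk"] = fake_sdk

# Early import: the name is bound to the original class object.
from fake_sdk import Client as EarlyClient


# Simulated init(): replace the module attribute with an instrumented subclass.
class PatchedClient(Client):
    def create(self) -> str:
        return "instrumented:" + super().create()


fake_sdk.Client = PatchedClient

# Late import: the attribute lookup happens after the patch.
from fake_sdk import Client as LateClient

early_result = EarlyClient().create()  # still the original behavior
late_result = LateClient().create()    # goes through the patched class
print(early_result)
print(late_result)
```

In this toy model the early import keeps calling the original class, which mirrors the "NOT auto-instrumented" case above.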
2. Using behavior decorators without @observe
Behavior decorators (@tool, @decision, @retrieval, etc.) only record data when called inside an @observe or WaxellContext scope. Without a parent scope, they're silent no-ops.
Wrong -- no trace is created:
@waxell.tool(tool_type="api")
async def search(query: str):
    return await api.search(query)

@waxell.decision(name="route")
async def route(query: str):
    return {"chosen": "search"}

# These run fine but nothing is recorded
await route("test")
await search("test")
Right -- wrap the entry point with @observe:
@waxell.tool(tool_type="api")
async def search(query: str):
    return await api.search(query)

@waxell.decision(name="route")
async def route(query: str):
    return {"chosen": "search"}

@waxell.observe(agent_name="my-agent") # Creates the run scope
async def run_agent(query: str):
    decision = await route(query)
    results = await search(query)
    return results
3. Recording LLM calls manually when auto-instrumentation handles it
If you called waxell.init(), LLM calls are captured automatically. Manually recording them creates duplicates.
Wrong -- double-counted:
waxell.init() # Auto-instruments OpenAI

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str, waxell_ctx=None):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    ) # Already auto-captured
    # This creates a DUPLICATE record
    if waxell_ctx:
        waxell_ctx.record_llm_call(model="gpt-4o", tokens_in=100, tokens_out=50)
    return response.choices[0].message.content
Right -- let auto-instrumentation handle it:
waxell.init()

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    ) # Auto-captured with accurate token counts and cost
    waxell.score("quality", 0.9) # Add enrichment, not duplicate LLM records
    return response.choices[0].message.content
Only use ctx.record_llm_call() when auto-instrumentation isn't available (e.g., custom HTTP calls to LLM endpoints, or unsupported providers).
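For the manual case, one possible shape is a small helper that reads token usage from the provider's response and forwards it once. This is a sketch under assumptions: `StubContext` is a stand-in that only mimics the `record_llm_call(model=..., tokens_in=..., tokens_out=...)` call shown above, and the `usage`, `prompt_tokens`, and `completion_tokens` fields are a hypothetical response layout for an unsupported provider.

```python
from typing import Any, Optional


class StubContext:
    """Stand-in for the injected waxell_ctx; mimics only the call shape used above."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    def record_llm_call(self, *, model: str, tokens_in: int, tokens_out: int) -> None:
        self.calls.append(
            {"model": model, "tokens_in": tokens_in, "tokens_out": tokens_out}
        )


def record_usage(ctx: Optional[Any], model: str, payload: dict[str, Any]) -> None:
    """Forward token counts from a raw provider response to the context, if any."""
    if ctx is None:  # outside an observed run: nothing to record into
        return
    usage = payload.get("usage", {})  # hypothetical response layout
    ctx.record_llm_call(
        model=model,
        tokens_in=int(usage.get("prompt_tokens", 0)),
        tokens_out=int(usage.get("completion_tokens", 0)),
    )


ctx = StubContext()
response_json = {"usage": {"prompt_tokens": 12, "completion_tokens": 34}}
record_usage(ctx, "custom-model", response_json)
record_usage(None, "custom-model", response_json)  # safe without a context
print(ctx.calls)
```

Taking the counts from the actual response, rather than hard-coding them, keeps the manual record as close as possible to what auto-instrumentation would have captured.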
4. Forgetting to guard waxell_ctx in unit tests
If your function uses the injected waxell_ctx, it will be None when called without @observe (e.g., in tests).
Wrong -- crashes in tests:
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str, waxell_ctx=None):
    response = await call_llm(query)
    waxell_ctx.record_score("quality", 0.9) # AttributeError: 'NoneType' object has no attribute 'record_score'
    return response
Right -- guard or use convenience functions:
# Option 1: Guard with if
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str, waxell_ctx=None):
    response = await call_llm(query)
    if waxell_ctx:
        waxell_ctx.record_score("quality", 0.9)
    return response

# Option 2: Use convenience functions (preferred -- they're no-ops outside a context)
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    response = await call_llm(query)
    waxell.score("quality", 0.9) # Safe everywhere
    return response
5. Blocking on waxell.flush() in sync code
waxell.flush() is async. Wrapping it in asyncio.run() from sync code raises RuntimeError whenever an event loop is already running in the current thread, for example when the sync function is called from async code.
Wrong:
import asyncio

@waxell.observe(agent_name="my-agent")
def sync_agent(query: str):
    result = process(query)
    asyncio.run(waxell.flush()) # RuntimeError if event loop is running
    return result
Right -- use the sync variant:
@waxell.observe(agent_name="my-agent")
def sync_agent(query: str):
    result = process(query)
    waxell.flush_sync() # Works in sync code
    return result
Note: you usually don't need to flush manually. The context flushes automatically on exit.
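The failure mode is a property of `asyncio.run()` itself and can be reproduced with the standard library alone. In the sketch below, `flush` is a hypothetical stand-in coroutine, not the waxell function; the same sync helper succeeds when no loop is running and fails when it is called from inside a running loop.

```python
import asyncio


async def flush() -> str:
    """Hypothetical stand-in for an async flush."""
    return "flushed"


def sync_step() -> str:
    """Sync function that tries to drive a coroutine with asyncio.run()."""
    coro = flush()
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        return f"RuntimeError: {exc}"


async def async_caller() -> str:
    # sync_step() runs while this coroutine's event loop is active.
    return sync_step()


outside_loop = sync_step()                 # no loop running: works
inside_loop = asyncio.run(async_caller())  # loop running: asyncio.run() refuses
print(outside_loop)
print(inside_loop)
```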
6. Using WaxellContext when @observe would suffice
WaxellContext is powerful but verbose. For single-function agents, @observe is cleaner.
Verbose:
async def my_agent(query: str):
    async with WaxellContext(
        agent_name="my-agent",
        inputs={"query": query},
    ) as ctx:
        response = await call_llm(query)
        ctx.record_score("quality", 0.9)
        ctx.set_tag("type", "qa")
        ctx.set_result({"answer": response})
        return response
Clean:
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    response = await call_llm(query)
    waxell.score("quality", 0.9)
    waxell.tag("type", "qa")
    return response # Auto-captured as result
Reserve WaxellContext for cases where decorators don't fit: multi-step orchestration, batch loops, conditional context creation, or explicit lifecycle control.
7. Nesting @observe decorators unintentionally
Each @observe creates a separate run. Nesting them creates parent-child runs, which may not be what you want.
Creates 2 runs per call:
@waxell.observe(agent_name="outer")
async def outer(query: str):
    return await inner(query)

@waxell.observe(agent_name="inner")
async def inner(query: str):
    return await call_llm(query)
If you want a single run with sub-steps, use @observe on the outer function and behavior decorators on inner functions:
Creates 1 run with a tool span:
@waxell.observe(agent_name="my-agent")
async def outer(query: str):
    return await inner(query)

@waxell.tool(tool_type="function")
async def inner(query: str):
    return await call_llm(query)
Nested @observe is correct for multi-agent architectures where each agent is a distinct run with its own lifecycle and policy checks.
8. Expecting @observe to work without configuration
@observe requires a configured client to create runs on the control plane. Without configuration, it logs a warning and runs your function without observability.
Silent failure:
# No init(), no env vars, no config file
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    return await call_llm(query)

await my_agent("test")
# WARNING: Client not configured, skipping run start
# Function runs fine, but no data in dashboard
Explicit setup:
import waxell_observe as waxell
waxell.init(api_key="wax_sk_...", api_url="https://acme.waxell.dev")

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    return await call_llm(query)
9. Mixing up waxell.tag() and waxell.metadata()
Tags are string-only and are searchable in the dashboard and Grafana. Metadata accepts any JSON type and is intended for contextual information.
Wrong -- complex values as tags:
waxell.tag("config", '{"temperature": 0.7}') # Stored as a string, not queryable
Right:
waxell.tag("environment", "production") # String tag -- queryable
waxell.tag("model", "gpt-4o") # String tag -- queryable
waxell.metadata("config", {"temperature": 0.7}) # Structured data -- contextual
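When attributes arrive as one mixed dictionary, a small helper can route each entry to the right call. The sketch below is an illustration rather than part of the library: `split_attributes` is a hypothetical helper that sends plain strings to the tag bucket and everything else to the metadata bucket, checking up front that metadata values are JSON-serializable.

```python
import json
from typing import Any


def split_attributes(attrs: dict[str, Any]) -> tuple[dict[str, str], dict[str, Any]]:
    """Separate string values (tag candidates) from structured values (metadata)."""
    tags: dict[str, str] = {}
    metadata: dict[str, Any] = {}
    for key, value in attrs.items():
        if isinstance(value, str):
            tags[key] = value
        else:
            json.dumps(value)  # raises TypeError early if not JSON-serializable
            metadata[key] = value
    return tags, metadata


tags, metadata = split_attributes(
    {
        "environment": "production",
        "model": "gpt-4o",
        "config": {"temperature": 0.7},
    }
)
print(tags)
print(metadata)
# Each tags entry would then go to waxell.tag(key, value),
# and each metadata entry to waxell.metadata(key, value).
```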
10. Not calling handler.flush_sync() with LangChain
The WaxellLangChainHandler buffers data during chain execution. If you don't flush, the run is never completed.
Wrong -- incomplete run:
handler = WaxellLangChainHandler(agent_name="my-chain")
result = chain.invoke(input, config={"callbacks": [handler]})
# Run is started but never completed -- shows as "running" forever in dashboard
Right:
handler = WaxellLangChainHandler(agent_name="my-chain")
result = chain.invoke(input, config={"callbacks": [handler]})
handler.flush_sync(result={"output": result.content})
# Run is completed with result