Common Mistakes

1. Importing LLM SDKs before init()

Wrong:

from openai import OpenAI          # Gets un-patched OpenAI
import waxell_observe as waxell
waxell.init() # Too late -- OpenAI already imported

client = OpenAI()
client.chat.completions.create(...) # NOT auto-instrumented

Right:

import waxell_observe as waxell
waxell.init() # Patches OpenAI module

from openai import OpenAI # Gets patched version
client = OpenAI()
client.chat.completions.create(...) # Auto-instrumented

Alternative -- use drop-in imports (order doesn't matter):

from waxell_observe.openai import openai

client = openai.OpenAI()
client.chat.completions.create(...) # Always auto-instrumented
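
The import-order rule follows from how `from module import name` works in Python: it copies the current binding at import time. The sketch below illustrates this with a stand-in module. `fake_sdk`, `Client`, and this `init()` are hypothetical, and the sketch assumes the patch works by replacing an attribute on the module, which is not confirmed for waxell_observe itself.

```python
import sys
import types

# Build a stand-in "SDK" module (hypothetical; not the real openai package).
fake_sdk = types.ModuleType("fake_sdk")

class Client:
    def create(self):
        return "raw"

fake_sdk.Client = Client
sys.modules["fake_sdk"] = fake_sdk

# Import BEFORE patching: the name is bound to the original class.
from fake_sdk import Client as EarlyClient

def init():
    """Hypothetical patcher: swaps the module attribute for a wrapped subclass."""
    original = fake_sdk.Client

    class InstrumentedClient(original):
        def create(self):
            return "traced:" + super().create()

    fake_sdk.Client = InstrumentedClient

init()

# Import AFTER patching: the name is bound to the replacement class.
from fake_sdk import Client as LateClient

early_result = EarlyClient().create()
late_result = LateClient().create()
print(early_result)  # raw
print(late_result)   # traced:raw
```

The early import keeps pointing at the original class even after `init()` runs, which is the same failure mode as importing the SDK first.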

2. Using behavior decorators without @observe

Behavior decorators (@tool, @decision, @retrieval, etc.) only record data when called inside an @observe or WaxellContext scope. Without a parent scope, they're silent no-ops.

Wrong -- no trace is created:

@waxell.tool(tool_type="api")
async def search(query: str):
    return await api.search(query)

@waxell.decision(name="route")
async def route(query: str):
    return {"chosen": "search"}

# These run fine but nothing is recorded
await route("test")
await search("test")

Right -- wrap the entry point with @observe:

@waxell.tool(tool_type="api")
async def search(query: str):
    return await api.search(query)

@waxell.decision(name="route")
async def route(query: str):
    return {"chosen": "search"}

@waxell.observe(agent_name="my-agent") # Creates the run scope
async def run_agent(query: str):
    decision = await route(query)
    results = await search(query)
    return results

3. Recording LLM calls manually when auto-instrumentation handles it

Once you have called waxell.init(), LLM calls to supported providers are captured automatically. Recording them manually as well creates duplicates.

Wrong -- double-counted:

waxell.init()  # Auto-instruments OpenAI

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str, waxell_ctx=None):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    ) # Already auto-captured

    # This creates a DUPLICATE record
    if waxell_ctx:
        waxell_ctx.record_llm_call(model="gpt-4o", tokens_in=100, tokens_out=50)

    return response.choices[0].message.content

Right -- let auto-instrumentation handle it:

waxell.init()

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}],
    ) # Auto-captured with accurate token counts and cost

    waxell.score("quality", 0.9) # Add enrichment, not duplicate LLM records
    return response.choices[0].message.content

Only use waxell_ctx.record_llm_call() when auto-instrumentation isn't available (e.g., custom HTTP calls to LLM endpoints, or unsupported providers).


4. Forgetting to guard waxell_ctx in unit tests

If your function uses the injected waxell_ctx parameter, that parameter is None whenever the function is called without @observe (e.g., in tests).

Wrong -- crashes in tests:

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str, waxell_ctx=None):
    response = await call_llm(query)
    waxell_ctx.record_score("quality", 0.9) # AttributeError: 'NoneType' object has no attribute 'record_score'
    return response

Right -- guard or use convenience functions:

# Option 1: Guard with if
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str, waxell_ctx=None):
    response = await call_llm(query)
    if waxell_ctx:
        waxell_ctx.record_score("quality", 0.9)
    return response

# Option 2: Use convenience functions (preferred -- they're no-ops outside a context)
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    response = await call_llm(query)
    waxell.score("quality", 0.9) # Safe everywhere
    return response

5. Blocking on waxell.flush() in sync code

waxell.flush() is async. Wrapping it in asyncio.run() from sync code raises RuntimeError if an event loop is already running in that thread.

Wrong:

import asyncio

@waxell.observe(agent_name="my-agent")
def sync_agent(query: str):
    result = process(query)
    asyncio.run(waxell.flush()) # RuntimeError if event loop is running
    return result

Right -- use the sync variant:

@waxell.observe(agent_name="my-agent")
def sync_agent(query: str):
    result = process(query)
    waxell.flush_sync() # Works in sync code
    return result

Note: you usually don't need to flush manually. The context flushes automatically on exit.
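
The asyncio.run() failure can be reproduced with the standard library alone. In the sketch below, `flush()` is a stand-in coroutine, not the real waxell.flush(). The same sync helper succeeds when no loop is running and fails when it is called from inside one.

```python
import asyncio

async def flush():
    """Stand-in for an async flush; not the real waxell.flush()."""
    return "flushed"

def sync_step():
    """Sync code that tries to drive a coroutine with asyncio.run()."""
    coro = flush()
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"

async def main():
    # Sync helper called from inside a running loop, as happens
    # when an async framework invokes sync code.
    return sync_step()

outside_loop = sync_step()         # no loop running: asyncio.run() works
inside_loop = asyncio.run(main())  # loop running: asyncio.run() raises
print(outside_loop)
print(inside_loop)
```

This is why a dedicated sync variant is safer than wrapping the async call yourself: the caller rarely knows whether a loop is already running.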


6. Using WaxellContext when @observe would suffice

WaxellContext is powerful but verbose. For single-function agents, @observe is cleaner.

Verbose:

async def my_agent(query: str):
    async with WaxellContext(
        agent_name="my-agent",
        inputs={"query": query},
    ) as ctx:
        response = await call_llm(query)
        ctx.record_score("quality", 0.9)
        ctx.set_tag("type", "qa")
        ctx.set_result({"answer": response})
        return response

Clean:

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    response = await call_llm(query)
    waxell.score("quality", 0.9)
    waxell.tag("type", "qa")
    return response # Auto-captured as result

Reserve WaxellContext for cases where decorators don't fit: multi-step orchestration, batch loops, conditional context creation, or explicit lifecycle control.


7. Nesting @observe decorators unintentionally

Each @observe creates a separate run. Nesting them creates parent-child runs, which may not be what you want.

Creates 2 runs per call:

@waxell.observe(agent_name="outer")
async def outer(query: str):
    return await inner(query)

@waxell.observe(agent_name="inner")
async def inner(query: str):
    return await call_llm(query)

If you want a single run with sub-steps, use @observe on the outer function and behavior decorators on the inner functions.

Creates 1 run with a tool span:

@waxell.observe(agent_name="my-agent")
async def outer(query: str):
    return await inner(query)

@waxell.tool(tool_type="function")
async def inner(query: str):
    return await call_llm(query)

Nested @observe is correct for multi-agent architectures where each agent is a distinct run with its own lifecycle and policy checks.


8. Expecting @observe to work without configuration

@observe requires a configured client to create runs on the control plane. Without configuration, it logs a warning and runs your function without observability.

Silent failure:

# No init(), no env vars, no config file
@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    return await call_llm(query)

await my_agent("test")
# WARNING: Client not configured, skipping run start
# Function runs fine, but no data in dashboard

Explicit setup:

import waxell_observe as waxell
waxell.init(api_key="wax_sk_...", api_url="https://acme.waxell.dev")

@waxell.observe(agent_name="my-agent")
async def my_agent(query: str):
    return await call_llm(query)

9. Mixing up waxell.tag() and waxell.metadata()

Tags are string-only and searchable in the dashboard and Grafana. Metadata accepts any JSON type but is for contextual information.

Wrong -- complex values as tags:

waxell.tag("config", '{"temperature": 0.7}')  # Stored as one opaque string; the fields inside are not queryable

Right:

waxell.tag("environment", "production")        # String tag -- queryable
waxell.tag("model", "gpt-4o") # String tag -- queryable
waxell.metadata("config", {"temperature": 0.7}) # Structured data -- contextual
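
If attributes arrive as one mixed dictionary, a small helper can route them before anything is recorded. The sketch below is one possible approach and not part of waxell_observe: `split_attributes` is a hypothetical helper that treats string values as tag candidates and everything else as metadata. It cannot tell that a string holds serialized JSON, so values like the one in the wrong example above still need a manual check.

```python
import json

def split_attributes(attrs):
    """Separate string values (tag candidates) from everything else (metadata)."""
    tags, metadata = {}, {}
    for key, value in attrs.items():
        if isinstance(value, str):
            tags[key] = value
        else:
            json.dumps(value)  # raises TypeError if the value is not JSON-serializable
            metadata[key] = value
    return tags, metadata

tags, metadata = split_attributes({
    "environment": "production",
    "model": "gpt-4o",
    "config": {"temperature": 0.7},
    "retries": 3,
})
print(tags)
print(metadata)
```

Each entry in `tags` could then be passed to waxell.tag() and each entry in `metadata` to waxell.metadata().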

10. Not calling handler.flush_sync() with LangChain

The WaxellLangChainHandler buffers data during chain execution. If you don't flush, the run is never completed.

Wrong -- incomplete run:

handler = WaxellLangChainHandler(agent_name="my-chain")
result = chain.invoke(input, config={"callbacks": [handler]})
# Run is started but never completed -- shows as "running" forever in dashboard

Right:

handler = WaxellLangChainHandler(agent_name="my-chain")
result = chain.invoke(input, config={"callbacks": [handler]})
handler.flush_sync(result={"output": result.content})
# Run is completed with result