
Instrument OpenAI Directly

This tutorial covers four ways to add observability to OpenAI (or any LLM provider) calls, ordered from the simplest to the one that gives you the most control.

Start with Auto-Instrumentation

Most users only need Step 1 (auto-instrumentation). It captures every LLM call with zero code changes. Add decorators (Step 2) when you want named traces and enrichment.

Prerequisites

  • Python 3.10+
  • waxell-observe installed (pip install waxell-observe)
  • OpenAI API key set as OPENAI_API_KEY
  • Waxell API credentials configured

What You'll Learn

  • Four instrumentation approaches: auto-instrumentation, decorator, context manager, and manual
  • When to use each approach
  • How to add metadata, tags, and scores
  • How to view results in the LLM Calls explorer

Setup

Configure your Waxell credentials via environment variables:

export WAXELL_API_URL="https://acme.waxell.dev"
export WAXELL_API_KEY="wax_sk_..."
export OPENAI_API_KEY="sk-..."

Then initialize in your code:

import waxell_observe as waxell

waxell.init() # reads WAXELL_API_URL and WAXELL_API_KEY from env

Or pass credentials directly:

import waxell_observe as waxell

waxell.init(api_key="wax_sk_...", api_url="https://acme.waxell.dev")

Step 1: Auto-Instrumentation

The simplest approach: call init() before importing OpenAI and every call is captured automatically. No decorators, no context managers, no manual recording.

import waxell_observe as waxell

waxell.init() # must come before importing openai

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
# Automatically traced with model, tokens, cost, latency

That's it. Every OpenAI call is now captured with model name, token counts, cost estimates, and latency. LLM calls made outside any decorator or context manager are automatically buffered and flushed to auto-generated runs.

Step 2: @observe Decorator (Named Traces)

Add @observe when you want named traces with automatic IO capture, enrichment, and policy enforcement. Combined with init(), LLM calls are recorded automatically -- you just add structure and metadata.

import waxell_observe as waxell

waxell.init()

from openai import OpenAI

client = OpenAI()


@waxell.observe(agent_name="my-chatbot")
def chat(message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": message}],
    )

    text = response.choices[0].message.content

    # Enrich the trace with top-level convenience functions
    waxell.tag("domain", "general-knowledge")
    waxell.score("answer_length", len(text))
    waxell.metadata("model", "gpt-4o")

    return text


# Use it -- just call your function normally
result = chat("What is the capital of France?")
print(result)

The decorator handles:

  • Starting and completing the execution run
  • Capturing function inputs and outputs
  • Checking policies before execution (set enforce_policy=False to skip)
  • Creating an OTel trace span

LLM Call Recording

With init() active, OpenAI calls are recorded automatically. You don't need waxell_ctx.record_llm_call() unless you're using a provider that isn't auto-instrumented.
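For a provider that isn't auto-instrumented, you can assemble the same call record that the recording APIs expect, matching the keys shown in the Step 4 example. A minimal sketch — the helper name and the per-1K-token prices are assumptions, not part of the waxell-observe API:

```python
def build_llm_call_record(model: str, tokens_in: int, tokens_out: int,
                          price_in_per_1k: float, price_out_per_1k: float,
                          task: str = "chat") -> dict:
    """Assemble one LLM-call entry with the keys used in Step 4.

    The cost estimate from per-1K-token prices is an assumption;
    substitute your provider's actual pricing.
    """
    cost = (tokens_in * price_in_per_1k + tokens_out * price_out_per_1k) / 1000
    return {
        "model": model,
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "cost": round(cost, 6),
        "task": task,
    }


record = build_llm_call_record("custom-model-v2", 150, 80, 0.01, 0.03)
# e.g. await client.record_llm_calls(run_id=run_info.run_id, calls=[record])
```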

Async version:

from openai import AsyncOpenAI


@waxell.observe(agent_name="my-chatbot")
async def chat(message: str) -> str:
    response = await AsyncOpenAI().chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": message}],
    )
    return response.choices[0].message.content

Step 3: Context Manager (Advanced)

Use WaxellContext when you need explicit control over session IDs, user tracking, or when your agent logic spans multiple functions.

import waxell_observe as waxell
from waxell_observe import WaxellContext

waxell.init()

from openai import OpenAI

client = OpenAI()


def chat_with_context(message: str, session_id: str, user_id: str) -> str:
    with WaxellContext(
        agent_name="my-chatbot",
        workflow_name="chat",
        inputs={"message": message},
        session_id=session_id,
        user_id=user_id,
    ) as ctx:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": message}],
        )

        text = response.choices[0].message.content

        # LLM call recorded automatically by init()
        # Add structured metadata
        ctx.set_tag("environment", "production")
        ctx.set_result({"output": text})

        return text


result = chat_with_context(
    message="What is the capital of France?",
    session_id="sess_abc123",
    user_id="user_42",
)

The context manager gives you access to:

  • ctx.record_step() -- record execution steps
  • ctx.record_score() -- attach quality scores
  • ctx.set_tag() -- add searchable tags
  • ctx.set_metadata() -- add metadata to the trace
  • ctx.set_result() -- set the run result
  • ctx.run_id -- the assigned run ID
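These methods compose naturally inside the with block. A sketch of how they fit together — the exact signatures of record_step() and record_score() are assumptions based on the list above, and the length-based score is a toy example:

```python
def grade_and_finish(ctx, question: str, answer: str) -> dict:
    """Record a step, attach a score, tag, and set the run result
    on a WaxellContext (ctx) before the with-block exits."""
    ctx.record_step("generate_answer", {"question": question})

    # Toy quality score: longer answers score higher, capped at 1.0
    score = min(len(answer) / 100, 1.0)
    ctx.record_score("answer_length", score)

    ctx.set_tag("graded", "true")
    result = {"output": answer, "score": score}
    ctx.set_result(result)
    return result
```

You would call this from inside the `with WaxellContext(...) as ctx:` block shown above, passing the ctx it yields.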

Step 4: Manual Recording (Legacy / Custom Providers)

For custom LLM providers that aren't auto-instrumented, use WaxellObserveClient directly. This is the most verbose approach and is only recommended when the other patterns don't fit.

from waxell_observe import WaxellObserveClient

client = WaxellObserveClient(
api_url="https://acme.waxell.dev",
api_key="wax_sk_...",
)

async def chat_manual(message: str) -> str:
    # 1. Start a run
    run_info = await client.start_run(
        agent_name="my-chatbot",
        workflow_name="chat",
        inputs={"message": message},
    )

    try:
        # 2. Make your LLM call (any provider)
        response = call_custom_llm(message)  # your custom LLM call

        # 3. Record the LLM call
        await client.record_llm_calls(
            run_id=run_info.run_id,
            calls=[
                {
                    "model": "custom-model-v2",
                    "tokens_in": 150,
                    "tokens_out": 80,
                    "cost": 0.003,
                    "task": "chat",
                }
            ],
        )

        # 4. Complete the run
        await client.complete_run(
            run_id=run_info.run_id,
            result={"output": response},
            status="success",
        )

        return response

    except Exception as e:
        await client.complete_run(
            run_id=run_info.run_id,
            status="error",
            error=str(e),
        )
        raise

Compare Approaches

| Feature | Auto-Instrumentation | Decorator | Context Manager | Manual |
|---|---|---|---|---|
| Lines of code | 2 | Few | Moderate | Most |
| LLM call capture | Automatic | Automatic (with init()) | Automatic (with init()) | Manual |
| IO capture | -- | Automatic | Manual via set_result() | Manual |
| Session/user tracking | -- | Via params | Via params | Via metadata |
| Policy enforcement | -- | Automatic | Automatic | Call check_policy() yourself |
| OTel tracing | Automatic | Automatic | Automatic | Not included |
| Error handling | -- | Automatic | Automatic | Manual try/except |
| Custom providers | Supported providers only | Yes | Yes | Any provider |

When to use each:

  • Auto-instrumentation -- Best for most cases. Two lines and you're done. Start here.
  • Decorator -- Add when you want named agent traces, IO capture, or enrichment (tags, scores, metadata).
  • Context Manager -- Advanced. Use when you need explicit session IDs, user tracking, or multi-function workflows.
  • Manual -- Legacy / custom providers. Only when auto-instrumentation doesn't support your provider.

Viewing Results in the LLM Calls Explorer

After instrumenting your code, every recorded LLM call appears in the Observability > LLM Calls explorer. You can filter by:

  • Model -- See all calls to a specific model
  • Agent -- Filter to a specific agent
  • Cost range -- Find expensive calls
  • Token range -- Find high-token calls
  • Date range -- Narrow to a time window
  • Search -- Full-text search across model, task, prompt, and response previews

Each LLM call record includes:

  • Model name, task label
  • Input/output token counts and total
  • Estimated cost (using system or custom model pricing)
  • Prompt and response previews (first 500 characters)
  • Linked agent run and workflow
  • Timestamp

Next Steps