Scoring & Enrichment

A single example that exercises every convenience function: scores, tags, metadata, decisions, reasoning steps, and execution steps, each recorded with a one-line call.

Environment variables

This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL.
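
Checking for these variables before running the example can save a confusing failure later. The sketch below is not part of the Waxell SDK: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names introduced for illustration, and only the three variable names come from this page.

```python
import os

# Variable names taken from this page; the helper itself is hypothetical.
REQUIRED_VARS = ("OPENAI_API_KEY", "WAXELL_API_KEY", "WAXELL_API_URL")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
example_env = {"OPENAI_API_KEY": "sk-...", "WAXELL_API_URL": ""}
print(missing_env_vars(example_env))  # ['WAXELL_API_KEY', 'WAXELL_API_URL']
```

Calling `missing_env_vars()` with no arguments inspects `os.environ`, so it can run at the top of a script before `waxell.init()`.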

import asyncio

import waxell_observe as waxell

waxell.init()

from openai import OpenAI

client = OpenAI()


@waxell.observe(agent_name="enrichment-demo")
async def analyze_and_enrich(text: str, waxell_ctx=None) -> dict:
    # --- Tags (string key-value, searchable in Grafana TraceQL) ---
    waxell.tag("environment", "production")
    waxell.tag("pipeline", "content-analysis")

    # --- Metadata (any JSON-serializable value) ---
    waxell.metadata("input_length", len(text))
    waxell.metadata("config", {"temperature": 0.7, "max_tokens": 500})

    # --- Step (record an execution milestone) ---
    waxell.step("preprocess", output={"char_count": len(text)})

    # --- LLM call (auto-instrumented) ---
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Analyze the sentiment and quality of: {text}"}
        ],
    )
    analysis = response.choices[0].message.content

    # --- Scores (numeric, boolean, categorical) ---
    waxell.score("relevance", 0.92)
    waxell.score("coherence", 0.88, comment="auto-scored by pipeline")
    waxell.score("contains_pii", False, data_type="boolean")
    waxell.score("sentiment", "positive", data_type="categorical")

    # --- Decide (record a decision with options and reasoning) ---
    waxell.decide(
        "response_strategy",
        chosen="detailed",
        options=["brief", "detailed", "bullet_points"],
        reasoning="High relevance score warrants a detailed response",
        confidence=0.85,
    )

    # --- Reason (record a chain-of-thought step) ---
    waxell.reason(
        step="quality_check",
        thought="Analysis covers sentiment and style but not factual claims",
        evidence=["relevance=0.92", "coherence=0.88", "no PII detected"],
        conclusion="Content is safe and high quality",
    )

    return {"analysis": analysis}


asyncio.run(
    analyze_and_enrich("Waxell makes AI agent observability simple and powerful.")
)

What this demonstrates

  • waxell.score() -- record numeric (0.92), boolean (False), or categorical ("positive") scores with an optional comment.
  • waxell.tag() -- attach string key-value pairs searchable in Grafana TraceQL queries.
  • waxell.metadata() -- attach arbitrary JSON-serializable data (strings, numbers, dicts) for context.
  • waxell.step() -- mark an execution milestone with an optional output dict.
  • waxell.decide() -- record a decision: what was chosen, what the options were, why, and how confident.
  • waxell.reason() -- record a chain-of-thought step: the thought process, supporting evidence, and conclusion.
  • @waxell.observe -- named agent trace with automatic lifecycle management and injected waxell_ctx.
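
Because waxell.metadata() expects JSON-serializable values, it can be useful to test a value before attaching it. The helper below is a hypothetical sketch built on the standard `json` module; this page does not say whether the SDK performs its own validation, so treat it as an optional pre-check rather than a description of Waxell's behavior.

```python
import json


def is_json_serializable(value):
    """Return True if json.dumps can encode value with the default encoder."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


print(is_json_serializable({"temperature": 0.7, "max_tokens": 500}))  # True
print(is_json_serializable({"tags": {"a", "b"}}))  # False: sets are not JSON
```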

All six convenience functions are top-level calls that work anywhere inside an @waxell.observe or WaxellContext scope. They are no-ops if called outside a context, so they are safe to leave in production code.

Run it

export OPENAI_API_KEY="sk-..."
export WAXELL_API_KEY="your-waxell-api-key"
export WAXELL_API_URL="https://api.waxell.ai"

python scoring_enrichment.py