Scoring & Enrichment
A showcase of every convenience function: scores, tags, metadata, decisions, reasoning steps, and execution steps, each recorded with a one-line call.
Environment variables
This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL.
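A quick preflight check can catch a missing variable before the example makes any network call. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and is not part of the Waxell SDK.

```python
import os

# Variable names taken from the requirements listed above.
REQUIRED_VARS = ("OPENAI_API_KEY", "WAXELL_API_KEY", "WAXELL_API_URL")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dict instead of relying on `os.environ` makes the helper easy to exercise in tests.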
import asyncio

import waxell_observe as waxell

waxell.init()

from openai import OpenAI

client = OpenAI()


@waxell.observe(agent_name="enrichment-demo")
async def analyze_and_enrich(text: str, waxell_ctx=None) -> dict:
    # --- Tags (string key-value, searchable in Grafana TraceQL) ---
    waxell.tag("environment", "production")
    waxell.tag("pipeline", "content-analysis")

    # --- Metadata (any JSON-serializable value) ---
    waxell.metadata("input_length", len(text))
    waxell.metadata("config", {"temperature": 0.7, "max_tokens": 500})

    # --- Step (record an execution milestone) ---
    waxell.step("preprocess", output={"char_count": len(text)})

    # --- LLM call (auto-instrumented) ---
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": f"Analyze the sentiment and quality of: {text}"}
        ],
    )
    analysis = response.choices[0].message.content

    # --- Scores (numeric, boolean, categorical) ---
    waxell.score("relevance", 0.92)
    waxell.score("coherence", 0.88, comment="auto-scored by pipeline")
    waxell.score("contains_pii", False, data_type="boolean")
    waxell.score("sentiment", "positive", data_type="categorical")

    # --- Decide (record a decision with options and reasoning) ---
    waxell.decide(
        "response_strategy",
        chosen="detailed",
        options=["brief", "detailed", "bullet_points"],
        reasoning="High relevance score warrants a detailed response",
        confidence=0.85,
    )

    # --- Reason (record a chain-of-thought step) ---
    waxell.reason(
        step="quality_check",
        thought="Analysis covers sentiment and style but not factual claims",
        evidence=["relevance=0.92", "coherence=0.88", "no PII detected"],
        conclusion="Content is safe and high quality",
    )

    return {"analysis": analysis}


asyncio.run(
    analyze_and_enrich("Waxell makes AI agent observability simple and powerful.")
)
What this demonstrates
waxell.score() -- record numeric (0.92), boolean (False), or categorical ("positive") scores with an optional comment.
waxell.tag() -- attach string key-value pairs searchable in Grafana TraceQL queries.
waxell.metadata() -- attach arbitrary JSON-serializable data (strings, numbers, dicts) for context.
waxell.step() -- mark an execution milestone with an optional output dict.
waxell.decide() -- record a decision: what was chosen, what the options were, why, and how confident.
waxell.reason() -- record a chain-of-thought step: the thought process, supporting evidence, and conclusion.
@waxell.observe -- named agent trace with automatic lifecycle management and injected waxell_ctx.
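The score calls in the example pass data_type explicitly for boolean and categorical values and omit it for numeric ones. When score values are produced dynamically, a small helper can pick the matching keyword arguments. This is only a sketch: `score_kwargs` is a hypothetical name, and it assumes data_type accepts exactly the two strings shown in the example.

```python
def score_kwargs(value):
    """Return the extra keyword arguments to pass alongside a score value."""
    # bool must be checked before int/float: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"data_type": "boolean"}
    if isinstance(value, (int, float)):
        return {}  # numeric scores are passed without data_type in the example
    if isinstance(value, str):
        return {"data_type": "categorical"}
    raise TypeError(f"unsupported score type: {type(value).__name__}")
```

Inside an observed function this could be used as waxell.score(name, value, **score_kwargs(value)). Checking bool first matters because isinstance(False, int) is True, so a boolean would otherwise be treated as a numeric score.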
All six convenience functions are top-level calls that work anywhere inside an @waxell.observe or WaxellContext scope. They are no-ops if called outside a context, so they are safe to leave in production code.
Run it
export OPENAI_API_KEY="sk-..."
export WAXELL_API_KEY="your-waxell-api-key"
export WAXELL_API_URL="https://api.waxell.ai"
python scoring_enrichment.py