OpenAI Agents SDK
A multi-agent pipeline simulating OpenAI Agents SDK patterns with a Runner, triage agent, and specialist handoff. The orchestrator prepares a Runner configuration and classifies the request type via an LLM-powered decision. It then dispatches a runner child agent, which performs triage classification and specialist handoff, followed by an evaluator child agent, which assesses analysis quality and generates a structured security report.
This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys.
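The dry-run stub itself is not shown in this README. A minimal sketch of how a `--dry-run` client might work, assuming it only needs to mimic the `chat.completions.create` shape of the OpenAI async client (the class and field names below are illustrative, not the demo's actual code):

```python
# Hypothetical --dry-run stub: a fake client that mimics the OpenAI
# async chat interface without making any network calls or needing keys.
import asyncio
from types import SimpleNamespace

class DryRunChatClient:
    """Returns canned completions so the pipeline runs with no API keys."""
    def __init__(self, canned_text="[dry-run] stubbed completion"):
        self._text = canned_text
        # Mirror client.chat.completions.create(...)
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    async def _create(self, model, messages, **kwargs):
        # Shape the response like a ChatCompletion: .choices[0].message.content
        message = SimpleNamespace(role="assistant", content=self._text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)], model=model)

async def main():
    client = DryRunChatClient()
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "ping"}],
    )
    print(resp.choices[0].message.content)  # prints the canned stub text

asyncio.run(main())
```

Because the stub has the same attribute shape as the real client, the runner and evaluator code below can be passed either one unchanged.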
Architecture
Key Code
Runner with triage and handoff
The runner child agent simulates the OpenAI Agents SDK Runner.run pattern: triage classification, waxell.decide() for handoff, and specialist agent execution.
```python
@waxell.observe(agent_name="openai-agents-runner", workflow_name="openai-agents-execution")
async def run_agent_execution(query: str, openai_client, runner_config: dict, waxell_ctx=None):
    waxell.tag("agent_role", "runner")
    waxell.tag("framework", "openai_agents")

    # Triage agent classifies via @tool
    triage_result = triage_classify(query=query, available_agents=AVAILABLE_AGENTS)
    target_agent = triage_result["target"]

    # Triage LLM call
    response1 = await openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a triage agent."},
            {"role": "user", "content": query},
        ],
    )

    # Handoff decision via waxell.decide()
    waxell.decide(
        "agent_handoff",
        chosen=target_agent,
        options=AVAILABLE_AGENTS,
        reasoning=f"Triage classified with {triage_result.get('confidence', 0.9):.0%} confidence",
        confidence=triage_result.get("confidence", 0.9),
    )

    # Specialist agent execution
    response2 = await openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": f"You are a {target_agent.replace('_', ' ')}."}, ...],
    )
    return {
        "specialist_output": response2.choices[0].message.content,
        "agents_executed": ["triage_agent", target_agent],
        "handoffs": 1,  # one triage -> specialist handoff; consumed by the evaluator
    }
```
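The body of `triage_classify` is not shown above. A plausible sketch of its core logic as plain keyword routing (in the demo it would additionally carry the `@waxell.tool` decorator; the keywords, agent names, and confidence formula here are all assumptions):

```python
# Illustrative triage logic: route a query to a specialist by keyword match.
# The routing table and confidence heuristic are hypothetical, not the demo's.
AVAILABLE_AGENTS = ["security_analyst", "code_reviewer", "general_assistant"]

KEYWORDS = {
    "security_analyst": ("vulnerability", "exploit", "cve", "injection"),
    "code_reviewer": ("refactor", "review", "bug", "diff"),
}

def triage_classify(query: str, available_agents: list[str]) -> dict:
    q = query.lower()
    for agent, words in KEYWORDS.items():
        hits = sum(w in q for w in words)
        if agent in available_agents and hits:
            # Confidence grows with keyword hits, capped at 0.95
            return {"target": agent, "confidence": round(min(0.6 + 0.1 * hits, 0.95), 2)}
    return {"target": "general_assistant", "confidence": 0.5}

result = triage_classify("Check this SQL injection vulnerability", AVAILABLE_AGENTS)
# Two keyword hits -> {"target": "security_analyst", "confidence": 0.8}
```

A keyword fallback like this is also what makes the `--dry-run` path deterministic: the handoff target does not depend on a live LLM call.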
Evaluator with reasoning and report formatting
The evaluator child agent assesses quality with @reasoning and formats a structured report with @tool.
```python
@waxell.observe(agent_name="openai-agents-evaluator", workflow_name="openai-agents-evaluation")
async def run_agent_evaluation(query: str, runner_result: dict, openai_client, waxell_ctx=None):
    waxell.tag("agent_role", "evaluator")

    # Quality assessment via @reasoning
    quality = await evaluate_analysis_quality(
        analysis=runner_result["specialist_output"],
        agents_used=runner_result["agents_executed"],
        handoff_count=runner_result["handoffs"],
    )

    # Report formatting via @tool
    report = format_security_report(
        analysis=runner_result["specialist_output"],
        agents_used=runner_result["agents_executed"],
        handoff_count=runner_result["handoffs"],
    )

    waxell.score("analysis_depth", 0.90)
    waxell.score("handoff_efficiency", True, data_type="boolean")
    return {"quality": quality, "report": report}
```
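The `format_security_report` tool is not shown either. A minimal sketch of what it might produce, assuming a simple dict-shaped report (in the demo it would carry the `@waxell.tool` decorator; the report schema below is an assumption, not the demo's actual output):

```python
# Illustrative report formatter: the field names and section list are
# hypothetical, not the demo's actual schema.
def format_security_report(analysis: str, agents_used: list[str], handoff_count: int) -> dict:
    return {
        "title": "Security Analysis Report",
        "summary": analysis[:200],           # truncate long specialist output
        "agents": " -> ".join(agents_used),  # e.g. "triage_agent -> security_analyst"
        "handoffs": handoff_count,
        "sections": ["Findings", "Risk Assessment", "Recommendations"],
    }

report = format_security_report(
    analysis="No critical vulnerabilities found.",
    agents_used=["triage_agent", "security_analyst"],
    handoff_count=1,
)
```

Keeping the formatter a pure function of the runner's result dict means it needs no LLM call and works identically under `--dry-run`.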
What this demonstrates
- `@waxell.observe` -- parent orchestrator with 2 child agents
- `@waxell.step_dec` -- runner config preparation
- `@waxell.decision` -- LLM-powered request type classification
- `waxell.decide()` -- manual inline decision for agent handoff
- `@waxell.tool` -- triage classification and report formatting
- `@waxell.reasoning_dec` -- analysis quality assessment
- `waxell.score()` -- depth scores and boolean efficiency markers
- OpenAI Agents SDK patterns -- Runner, triage, handoff, and specialist agents
Run it
```shell
cd dev/waxell-dev
python -m app.demos.openai_agents_agent --dry-run
```