# Groq
A multi-provider pipeline combining Groq's ultra-fast Llama models with OpenAI function calling. The parent orchestrator coordinates two child agents: a Groq analyzer that classifies and synthesizes using llama-3.3-70b-versatile, and a function caller that uses OpenAI's gpt-4o-mini with tool definitions for web search and calculator operations. Demonstrates cross-provider tracing in a single session.
This example requires `GROQ_API_KEY`, `OPENAI_API_KEY`, `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to run without any API keys.
## Architecture
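The parent/child fan-out described above can be sketched without the waxell or provider SDKs. The child agent names follow the description, but the stub bodies and the waxell-free wiring are assumptions for illustration only:

```python
import asyncio

async def run_groq_analyzer(query: str) -> dict:
    # Stub for the Groq child agent (llama-3.3-70b-versatile in the real demo)
    return {"classification": "research", "summary": f"analysis of: {query}"}

async def run_function_caller(query: str) -> dict:
    # Stub for the OpenAI child agent (gpt-4o-mini with tools in the real demo)
    return {"answer": f"answer to: {query}", "tools_used": 2}

async def run_pipeline(query: str) -> dict:
    # Parent orchestrator: runs the analyzer first, then the function caller,
    # and returns both results so the session traces as one hierarchy
    analysis = await run_groq_analyzer(query)
    result = await run_function_caller(query)
    return {"analysis": analysis, "final": result}

print(asyncio.run(run_pipeline("What is new in AI safety?")))
```

In the real demo, each of these functions would carry a `@waxell.observe` decorator so the two provider calls land in a single trace.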
## Key Code

### Function calling with tool execution loop
The function-caller child agent sends a query with tool definitions, executes requested tools, and synthesizes the final answer with tool results.
```python
@waxell.observe(agent_name="function-caller", workflow_name="function-calling", capture_io=True)
async def run_function_caller(query: str, openai_client, *, dry_run=False, waxell_ctx=None) -> dict:
    waxell.tag("provider", "openai")
    waxell.tag("task", "function_calling")
    messages = [
        {"role": "system", "content": "You are a research assistant with access to web search and calculator tools."},
        {"role": "user", "content": query},
    ]
    # Call 1: send the query along with the tool definitions
    tool_response = await openai_client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=TOOL_DEFINITIONS,
    )
    assistant_msg = tool_response.choices[0].message
    tool_results = []
    if assistant_msg.tool_calls:
        messages.append(assistant_msg)  # keep the assistant turn that requested the tools
        for tc in assistant_msg.tool_calls:
            args = json.loads(tc.function.arguments)
            if tc.function.name == "web_search":
                result = web_search(**args)
            elif tc.function.name == "calculator":
                result = calculator(**args)
            else:
                result = {"error": f"unknown tool: {tc.function.name}"}
            tool_results.append(result)
            # Feed each result back as a tool message, keyed to the call that requested it
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)})
    # Call 2: final synthesis with the tool results now in context
    final_response = await openai_client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=TOOL_DEFINITIONS,
    )
    final_answer = final_response.choices[0].message.content
    waxell.score("function_calling_quality", 0.85)
    return {"answer": final_answer, "tools_used": len(tool_results)}
```
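`TOOL_DEFINITIONS` is referenced above but not shown. A plausible shape, following the standard OpenAI function-tool schema (the parameter descriptions here are assumptions matching the two tools below):

```python
TOOL_DEFINITIONS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for information on a topic.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "max_results": {"type": "integer", "default": 2},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Perform mathematical calculations.",
            "parameters": {
                "type": "object",
                "properties": {
                    "operation": {"type": "string", "enum": ["average", "add"]},
                    "values": {"type": "array", "items": {"type": "number"}},
                },
                "required": ["operation", "values"],
            },
        },
    },
]
```

The model fills `function.arguments` with a JSON string matching each `parameters` schema, which is why the loop above decodes it with `json.loads` before dispatching.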
### Tool definitions with `@waxell.tool`
Each tool is decorated for automatic span creation with timing and I/O capture.
```python
@waxell.tool(tool_type="function", name="web_search")
def web_search(query: str, max_results: int = 2) -> dict:
    """Search the web for information on a topic."""
    results = [
        {"title": "AI Safety Research Overview 2025", "url": "https://example.com/safety",
         "snippet": "Recent advances include RLHF improvements and automated red-teaming."},
    ]
    # Honor max_results and report a count consistent with what is returned
    return {"results": results[:max_results], "total": len(results)}
```
```python
@waxell.tool(tool_type="function", name="calculator")
def calculator(operation: str, values: list) -> dict:
    """Perform mathematical calculations."""
    if operation == "average":
        result = sum(values) / len(values) if values else 0
    elif operation == "add":
        result = sum(values)
    else:
        # Without this branch, an unknown operation would raise NameError on `result`
        raise ValueError(f"unsupported operation: {operation}")
    return {"result": result}
```
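The `if/elif` dispatch in the function caller can instead use a name-to-callable registry, which scales to more tools and handles unknown names gracefully. A minimal waxell-free sketch with stand-in tool bodies:

```python
import json

def web_search(query: str, max_results: int = 2) -> dict:
    return {"results": [], "total": 0}  # stand-in for the real tool

def calculator(operation: str, values: list) -> dict:
    return {"result": sum(values)}  # stand-in: handles "add" only

# Registry mapping the tool name the model emits to the local callable
TOOL_REGISTRY = {"web_search": web_search, "calculator": calculator}

def dispatch_tool(name: str, raw_arguments: str) -> dict:
    """Look up a tool by name and call it with JSON-decoded arguments."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    return tool(**json.loads(raw_arguments))

print(dispatch_tool("calculator", '{"operation": "add", "values": [1, 2, 3]}'))
# → {'result': 6}
```

Each entry in the registry corresponds to one entry in `TOOL_DEFINITIONS`, so adding a tool means adding one schema and one function.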
## What this demonstrates
- `@waxell.observe` -- parent-child hierarchy across two LLM providers
- `@waxell.tool` -- typed tool operations (`web_search`, `calculator`)
- `@waxell.step_dec` -- query preprocessing step
- `@waxell.decision` -- provider strategy and response style decisions
- `@waxell.reasoning_dec` -- answer quality assessment
- `waxell.tag()` -- multi-provider tagging (`groq`, `openai`)
- `waxell.score()` -- quality scores per stage
- Auto-instrumented LLM calls -- both Groq and OpenAI calls traced automatically
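The demo's `--dry-run` mode presumably substitutes canned responses for live API calls; the exact mechanism isn't shown here, but one common pattern is a stub client exposing the same `chat.completions.create` surface as the real SDK. A sketch, with all names and shapes assumed for illustration:

```python
import asyncio
from types import SimpleNamespace

class DryRunChatClient:
    """Stub with the same call shape as an async OpenAI-style chat client."""
    def __init__(self, canned: str):
        self._canned = canned
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    async def _create(self, *, model, messages, tools=None):
        # Return an object shaped like a chat completion: choices[0].message.content
        message = SimpleNamespace(content=self._canned, tool_calls=None)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

async def demo():
    client = DryRunChatClient("[dry-run] canned answer")
    resp = await client.chat.completions.create(model="gpt-4o-mini", messages=[])
    return resp.choices[0].message.content

print(asyncio.run(demo()))
# → [dry-run] canned answer
```

Because the stub matches the real client's attribute path, `run_function_caller` can receive either one unchanged, which keeps the traced code path identical with and without API keys.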
## Run it
```shell
cd dev/waxell-dev
python -m app.demos.groq_agent --dry-run
```