# Groq
A multi-provider pipeline combining Groq's ultra-fast Llama models with OpenAI function calling. The parent orchestrator coordinates two child agents: a Groq analyzer that classifies and synthesizes using llama-3.3-70b-versatile, and a function caller that uses OpenAI's gpt-4o-mini with tool definitions for web search and calculator operations. Demonstrates cross-provider tracing in a single session.
This example requires `GROQ_API_KEY`, `OPENAI_API_KEY`, `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to run without any API keys.
## Architecture
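The parent/child fan-out described above can be sketched without the waxell or provider SDKs. The child agent names follow the description, but the stub bodies and the waxell-free wiring are assumptions for illustration only:

```python
import asyncio

async def run_groq_analyzer(query: str) -> dict:
    # Stub for the Groq child agent (llama-3.3-70b-versatile in the real demo)
    return {"classification": "research", "summary": f"analysis of: {query}"}

async def run_function_caller(query: str) -> dict:
    # Stub for the OpenAI child agent (gpt-4o-mini with tools in the real demo)
    return {"answer": f"answer to: {query}", "tools_used": 2}

async def run_pipeline(query: str) -> dict:
    # Parent orchestrator: runs the analyzer first, then the function caller,
    # and returns both results so the session traces as one hierarchy
    analysis = await run_groq_analyzer(query)
    result = await run_function_caller(query)
    return {"analysis": analysis, "final": result}

print(asyncio.run(run_pipeline("What is new in AI safety?")))
```

In the real demo, each of these functions would carry a `@waxell.observe` decorator so the two provider calls land in a single trace.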
## Key Code

### Function calling with tool execution loop
The function-caller child agent sends a query with tool definitions, executes requested tools, and synthesizes the final answer with tool results.
```python
@waxell.observe(agent_name="function-caller", workflow_name="function-calling", capture_io=True)
async def run_function_caller(query: str, openai_client, *, dry_run=False, waxell_ctx=None) -> dict:
    waxell.tag("provider", "openai")
    waxell.tag("task", "function_calling")
    messages = [
        {"role": "system", "content": "You are a research assistant with access to web search and calculator tools."},
        {"role": "user", "content": query},
    ]
    # Call 1: send the query along with the tool definitions
    tool_response = await openai_client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=TOOL_DEFINITIONS,
    )
    assistant_msg = tool_response.choices[0].message
    tool_results = []
    if assistant_msg.tool_calls:
        messages.append(assistant_msg)  # keep the assistant turn that requested the tools
        for tc in assistant_msg.tool_calls:
            args = json.loads(tc.function.arguments)
            if tc.function.name == "web_search":
                result = web_search(**args)
            elif tc.function.name == "calculator":
                result = calculator(**args)
            else:
                result = {"error": f"unknown tool: {tc.function.name}"}
            tool_results.append(result)
            # Feed each result back as a tool message, keyed to the call that requested it
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)})
    # Call 2: final synthesis with the tool results now in context
    final_response = await openai_client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=TOOL_DEFINITIONS,
    )
    final_answer = final_response.choices[0].message.content
    waxell.score("function_calling_quality", 0.85)
    return {"answer": final_answer, "tools_used": len(tool_results)}
```
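`TOOL_DEFINITIONS` is referenced above but not shown. A plausible shape, following the standard OpenAI function-tool schema (the parameter descriptions here are assumptions matching the two tools below):

```python
TOOL_DEFINITIONS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for information on a topic.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "max_results": {"type": "integer", "default": 2},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Perform mathematical calculations.",
            "parameters": {
                "type": "object",
                "properties": {
                    "operation": {"type": "string", "enum": ["average", "add"]},
                    "values": {"type": "array", "items": {"type": "number"}},
                },
                "required": ["operation", "values"],
            },
        },
    },
]
```

The model fills `function.arguments` with a JSON string matching each `parameters` schema, which is why the loop above decodes it with `json.loads` before dispatching.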
### Tool definitions with `@waxell.tool`
Each tool is decorated for automatic span creation with timing and I/O capture.
```python
@waxell.tool(tool_type="function", name="web_search")
def web_search(query: str, max_results: int = 2) -> dict:
    """Search the web for information on a topic."""
    results = [
        {"title": "AI Safety Research Overview 2025", "url": "https://example.com/safety",
         "snippet": "Recent advances include RLHF improvements and automated red-teaming."},
    ]
    # Honor max_results and report a count consistent with what is returned
    return {"results": results[:max_results], "total": len(results)}
```
```python
@waxell.tool(tool_type="function", name="calculator")
def calculator(operation: str, values: list) -> dict:
    """Perform mathematical calculations."""
    if operation == "average":
        result = sum(values) / len(values) if values else 0
    elif operation == "add":
        result = sum(values)
    else:
        # Without this branch, an unknown operation would raise NameError on `result`
        raise ValueError(f"unsupported operation: {operation}")
    return {"result": result}
```
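The `if/elif` dispatch in the function caller can instead use a name-to-callable registry, which scales to more tools and handles unknown names gracefully. A minimal waxell-free sketch with stand-in tool bodies:

```python
import json

def web_search(query: str, max_results: int = 2) -> dict:
    return {"results": [], "total": 0}  # stand-in for the real tool

def calculator(operation: str, values: list) -> dict:
    return {"result": sum(values)}  # stand-in: handles "add" only

# Registry mapping the tool name the model emits to the local callable
TOOL_REGISTRY = {"web_search": web_search, "calculator": calculator}

def dispatch_tool(name: str, raw_arguments: str) -> dict:
    """Look up a tool by name and call it with JSON-decoded arguments."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    return tool(**json.loads(raw_arguments))

print(dispatch_tool("calculator", '{"operation": "add", "values": [1, 2, 3]}'))
# → {'result': 6}
```

Each entry in the registry corresponds to one entry in `TOOL_DEFINITIONS`, so adding a tool means adding one schema and one function.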
## What this demonstrates
- `@waxell.observe` -- parent-child hierarchy across two LLM providers
- `@waxell.tool` -- typed tool operations (`web_search`, `calculator`)
- `@waxell.step_dec` -- query preprocessing step
- `@waxell.decision` -- provider strategy and response style decisions
- `@waxell.reasoning_dec` -- answer quality assessment
- `waxell.tag()` -- multi-provider tagging (`groq`, `openai`)
- `waxell.score()` -- quality scores per stage
- Auto-instrumented LLM calls -- both Groq and OpenAI calls traced automatically
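The demo's `--dry-run` mode presumably substitutes canned responses for live API calls; the exact mechanism isn't shown here, but one common pattern is a stub client exposing the same `chat.completions.create` surface as the real SDK. A sketch, with all names and shapes assumed for illustration:

```python
import asyncio
from types import SimpleNamespace

class DryRunChatClient:
    """Stub with the same call shape as an async OpenAI-style chat client."""
    def __init__(self, canned: str):
        self._canned = canned
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    async def _create(self, *, model, messages, tools=None):
        # Return an object shaped like a chat completion: choices[0].message.content
        message = SimpleNamespace(content=self._canned, tool_calls=None)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

async def demo():
    client = DryRunChatClient("[dry-run] canned answer")
    resp = await client.chat.completions.create(model="gpt-4o-mini", messages=[])
    return resp.choices[0].message.content

print(asyncio.run(demo()))
# → [dry-run] canned answer
```

Because the stub matches the real client's attribute path, `run_function_caller` can receive either one unchanged, which keeps the traced code path identical with and without API keys.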
## Run it
```shell
cd dev/waxell-dev
python -m app.demos.groq_agent --dry-run
```