Google ADK

A Google Agent Development Kit (ADK) pipeline with sub-agents, tool execution, and multi-agent lineage across three agents. The orchestrator parses the research request and selects a strategy, the runner executes the web_search and document_fetcher tools and ranks the findings, and the evaluator produces a structured comparison with quality scores.

Environment variables

This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys.

Architecture

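The three-agent flow can be sketched as a sequential dependency chain: orchestrator output feeds the runner, whose findings feed the evaluator. This is a minimal illustrative sketch, not the demo's actual code; the function bodies, the "broad_search" strategy name, and the sample values are all hypothetical stand-ins.

```python
# Hypothetical sketch of the orchestrator -> runner -> evaluator chain.
def orchestrator(request: str) -> dict:
    # Parse the research request and pick a strategy (stubbed here).
    return {"query": request, "strategy": "broad_search"}

def runner(plan: dict) -> list:
    # Execute search/fetch tools and return ranked findings (stubbed here).
    return [{"content": plan["query"], "relevance_score": 1.0}]

def evaluator(findings: list) -> dict:
    # Produce a structured comparison with quality scores (stubbed here).
    return {"quality": 0.9, "n_findings": len(findings)}

plan = orchestrator("Compare AI governance frameworks")
report = evaluator(runner(plan))
```

Each stage consumes only the previous stage's output, which is what makes the lineage across the three agents easy to record automatically.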
Key Code

Runner with research tools and @retrieval ranking

The runner searches the web, fetches documents, and ranks findings by relevance.

@waxell.tool(tool_type="function")
def web_search(query: str, max_results: int = 5) -> list:
    """Search the web for relevant governance frameworks."""
    return MOCK_SEARCH_RESULTS[:max_results]

@waxell.tool(tool_type="function")
def document_fetcher(urls: list) -> list:
    """Fetch and parse documents from URLs."""
    return MOCK_DOCUMENTS

@waxell.retrieval(source="web_search")
def rank_findings(query: str, findings: list) -> list[dict]:
    """Rank research findings by relevance to the query."""
    for finding in findings:
        query_words = set(query.lower().split())
        content_words = set(finding.get("content", "").lower().split())
        finding["relevance_score"] = round(
            len(query_words & content_words) / max(len(query_words), 1), 4
        )
    return sorted(findings, key=lambda d: d["relevance_score"], reverse=True)
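The ranking is a simple word-overlap score: the fraction of query words that appear in each finding's content. A standalone sketch (decorator omitted, with illustrative findings rather than the demo's mock data):

```python
def rank_findings(query: str, findings: list) -> list:
    # Score each finding by the fraction of query words found in its content.
    for finding in findings:
        query_words = set(query.lower().split())
        content_words = set(finding.get("content", "").lower().split())
        finding["relevance_score"] = round(
            len(query_words & content_words) / max(len(query_words), 1), 4
        )
    # Highest-overlap findings first.
    return sorted(findings, key=lambda d: d["relevance_score"], reverse=True)

findings = [
    {"content": "EU AI Act risk tiers"},
    {"content": "NIST AI risk management framework governance"},
]
ranked = rank_findings("AI governance framework", findings)
# The NIST finding matches all three query words (score 1.0) and ranks first;
# the EU finding matches only "AI" (score 0.3333).
```

Note the score is normalized by query length, not content length, so longer documents are not automatically favored.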

Evaluator with @reasoning and structured comparison

The evaluator assesses analysis quality and generates a structured comparison.

@waxell.reasoning_dec(step="analysis_assessment")
async def assess_analysis(analysis_text: str, sources: list) -> dict:
    source_titles = [s.get("title", "unknown") for s in sources]
    referenced = sum(
        1 for t in source_titles
        if t.lower().replace(" ", "") in analysis_text.lower().replace(" ", "")
    )
    return {
        "thought": f"Analysis references {referenced}/{len(sources)} source documents.",
        "evidence": [f"Source: {t}" for t in source_titles],
        "conclusion": "Analysis provides adequate coverage" if referenced > 0
                      else "Analysis may need more source grounding",
    }
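The assessment counts how many source titles appear in the analysis text, comparing with case and whitespace normalized so "NIST AI RMF" matches "nist ai rmf". A standalone sketch (decorator omitted; the sample analysis text and source titles are illustrative):

```python
import asyncio

async def assess_analysis(analysis_text: str, sources: list) -> dict:
    source_titles = [s.get("title", "unknown") for s in sources]
    # A source counts as referenced if its title appears in the analysis,
    # ignoring case and whitespace.
    referenced = sum(
        1 for t in source_titles
        if t.lower().replace(" ", "") in analysis_text.lower().replace(" ", "")
    )
    return {
        "thought": f"Analysis references {referenced}/{len(sources)} source documents.",
        "evidence": [f"Source: {t}" for t in source_titles],
        "conclusion": "Analysis provides adequate coverage" if referenced > 0
                      else "Analysis may need more source grounding",
    }

result = asyncio.run(assess_analysis(
    "The NIST AI RMF defines four core functions for managing AI risk.",
    [{"title": "NIST AI RMF"}, {"title": "EU AI Act"}],
))
# Only "NIST AI RMF" is mentioned, so 1/2 sources are referenced.
```

Title containment is a coarse proxy for grounding; it cannot tell a substantive citation from a passing mention, which is why the demo pairs it with separate quality scores.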

waxell.score("analysis_quality", 0.90, comment="source coverage and structure")
waxell.score("source_coverage", len(documents) / max(len(MOCK_DOCUMENTS), 1),
             data_type="float", comment="fraction of available sources used")

What this demonstrates

  • @waxell.observe -- parent-child agent hierarchy with automatic lineage
  • @waxell.step_dec -- research request parsing recorded as a step
  • @waxell.tool -- web search and document fetching tools
  • @waxell.retrieval -- finding ranking with source="web_search"
  • @waxell.decision -- research strategy selection via OpenAI
  • waxell.decide() -- manual sub-agent routing decision
  • @waxell.reasoning_dec -- analysis quality assessment
  • waxell.score() -- analysis quality and source coverage scores
  • Auto-instrumented LLM calls -- research synthesis and comparison calls captured
  • Google ADK pattern -- orchestrator with sub-agents in sequential dependency chain

Run it

# Dry-run (no API keys needed)
cd dev/waxell-dev
python -m app.demos.google_adk_agent --dry-run

# Live (real OpenAI)
export OPENAI_API_KEY="sk-..."
python -m app.demos.google_adk_agent

# Custom query
python -m app.demos.google_adk_agent --query "Compare NIST and EU AI Act frameworks"

Source

dev/waxell-dev/app/demos/google_adk_agent.py