Research Agent

A multi-agent research pipeline that coordinates a research-searcher (web search, document retrieval, relevance scoring, source reasoning) and a research-synthesizer (LLM synthesis, quality assessment, output format decision). Built with OpenAI and waxell-observe decorator patterns.

Environment variables

This example runs in dry-run mode by default (no API key needed). For live mode, set OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL.
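For live mode, the three variables can be exported before launching. The values below are placeholders; use your own keys, and point WAXELL_API_URL at your deployment:

```shell
export OPENAI_API_KEY=sk-...      # your OpenAI key
export WAXELL_API_KEY=...         # your waxell key
export WAXELL_API_URL=https://... # your waxell endpoint
```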

Architecture

Key Code

Retrieval and tool decorators

The searcher agent combines document retrieval with web search and computes relevance statistics.

@waxell.retrieval(source="document_store")
def retrieve_and_rank(query: str) -> list[dict]:
    """Retrieve and rank documents from the knowledge base."""
    retrieved = retrieve_documents(query)
    ranked = []
    for i, doc in enumerate(retrieved):
        score = round(0.95 - (i * 0.12), 2)
        ranked.append({"id": doc["id"], "title": doc["title"], "score": score})
    return ranked

@waxell.tool(tool_type="api")
def web_search(query: str, max_results: int = 5) -> dict:
    """Execute a web search for the given query."""
    return {"results": [...], "total_found": 3}

@waxell.tool(tool_type="function")
def calculate_relevance(scores: list) -> dict:
    """Calculate average relevance from a list of scores."""
    avg_score = round(sum(scores) / len(scores), 2)
    return {"average": avg_score, "count": len(scores)}

Three-step reasoning chain with inline decision

The searcher evaluates sources, checks consistency, identifies gaps, then makes a decision about whether additional research is needed using waxell.decide().

@waxell.reasoning_dec(step="evaluate_sources")
async def evaluate_sources(documents: list, web_results: list) -> dict:
    return {
        "thought": "Document 1 covers safety guardrails comprehensively...",
        "evidence": [f"doc-{d['id']}" for d in documents[:3]],
        "conclusion": "Strong foundation with multi-source corroboration",
    }

# Inline decision (no decorator -- uses top-level convenience function)
waxell.decide(
    "additional_research",
    chosen="sufficient",
    options=["sufficient", "expand_search", "expert_review"],
    reasoning="Source quality meets threshold (avg relevance 0.85)",
    confidence=0.88,
)

Synthesizer with quality assessment

The synthesizer generates a research answer via OpenAI (auto-instrumented), then assesses its quality and decides on output format.

@waxell.observe(agent_name="research-synthesizer", workflow_name="research-synthesis")
async def run_synthesizer(query, documents, web_results, openai_client, waxell_ctx=None):
    waxell.tag("agent_role", "synthesizer")
    waxell.tag("provider", "openai")

    response = await openai_client.chat.completions.create(...)  # auto-instrumented
    answer = response.choices[0].message.content

    quality = await assess_answer_quality(answer, documents, web_results)  # @reasoning
    format_result = choose_output_format(len(documents), len(web_results))  # @decision

    waxell.score("research_quality", 0.87, comment="Source coverage")
    waxell.score("factual_grounding", True, data_type="boolean")
    return {"answer": answer}

What this demonstrates

  • @waxell.decision -- query classification with LLM call (auto-instrumented OpenAI inside the decision decorator).
  • waxell.decide() -- inline decisions for research strategy and additional research, recorded without decorators.
  • @waxell.retrieval(source="document_store") -- document retrieval with ranking scores, recorded with source attribution.
  • @waxell.tool(tool_type="api") and @waxell.tool(tool_type="function") -- web search (API) and relevance calculation (pure function) both auto-recorded.
  • @waxell.reasoning_dec -- three-step reasoning chain (evaluate sources, check consistency, identify gaps) building structured evidence.
  • waxell.score() with data_type -- numeric quality score plus boolean factual grounding score.
  • Nested @waxell.observe -- orchestrator is parent; research-searcher and research-synthesizer are child agents with automatic lineage.
  • PolicyViolationError handling -- the orchestrator catches governance violations gracefully without crashing.
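The graceful governance handling in the last bullet can be sketched as a try/except around the pipeline entry point. `PolicyViolationError` is stood in by a local class here (the real one comes from the SDK), and `run_orchestrator` is a hypothetical stub that simulates a blocked query:

```python
import asyncio

class PolicyViolationError(Exception):
    """Local stand-in for the SDK's governance exception."""

async def run_orchestrator(query: str) -> dict:
    # Hypothetical stub: pretend governance blocks this query.
    raise PolicyViolationError("query touches a restricted topic")

async def run_pipeline(query: str) -> dict:
    try:
        return await run_orchestrator(query)
    except PolicyViolationError as exc:
        # Degrade gracefully instead of crashing the whole pipeline.
        return {"answer": None, "blocked": True, "reason": str(exc)}

result = asyncio.run(run_pipeline("restricted question"))  # blocked, no crash
```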

Run it

# Dry-run (no API key needed)
python -m app.demos.research_agent --dry-run

# Live mode with OpenAI
OPENAI_API_KEY=sk-... python -m app.demos.research_agent

Source

dev/waxell-dev/app/demos/research_agent.py