CrewAI Agent
A CrewAI-style multi-agent crew execution pipeline with a parent orchestrator coordinating two child agents -- a researcher and a writer -- using decorator-based observability. The crew follows a sequential process: the researcher gathers sources and assesses depth, then the writer transforms the findings into a polished article with style decisions and quality scoring.
This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys.
Architecture
Key Code
Orchestrator with crew kickoff and task ordering
The parent agent coordinates the sequential crew execution. Each child agent auto-links via WaxellContext lineage.
@waxell.observe(agent_name="crewai-orchestrator", workflow_name="crew-execution")
async def run_pipeline(query: str, dry_run: bool = False, waxell_ctx=None):
    waxell.tag("demo", "crewai")
    waxell.tag("crew", "governance-research-crew")
    waxell.metadata("num_agents", 2)
    waxell.metadata("process_type", "sequential")

    # @step -- crew kickoff
    kickoff = await crew_kickoff(crew_config=_CREW_CONFIG, query=query)

    # @decision -- assign execution order
    order = await assign_task_order(crew_config=_CREW_CONFIG, query=query)

    # Child agents execute sequentially (writer depends on researcher)
    research_result = await run_researcher(query=query, openai_client=openai_client)
    writer_result = await run_writer(
        query=query,
        research_output=research_result["output"],
        openai_client=openai_client,
    )
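The sequential dependency above can be sketched without the observability layer. This is a minimal sketch, assuming stand-in async stubs for the child agents (all names and return shapes here are hypothetical, mirroring the demo's structure):

```python
import asyncio

# Hypothetical stand-ins for the real child agents; in the demo these wrap
# OpenAI calls, here they just transform the input deterministically.
async def run_researcher(query: str) -> dict:
    return {"output": f"findings for: {query}"}

async def run_writer(query: str, research_output: str) -> dict:
    # The writer consumes the researcher's output -- this await ordering
    # is what makes the crew process "sequential".
    return {"article": f"Article on {query}, based on [{research_output}]"}

async def run_pipeline(query: str) -> dict:
    research = await run_researcher(query)
    return await run_writer(query, research_output=research["output"])

result = asyncio.run(run_pipeline("AI governance"))
print(result["article"])
```

Because the second `await` only starts after the first resolves, the writer always sees completed researcher output, which is the property the crew's sequential mode guarantees.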
Researcher with @tool, @retrieval, and @reasoning
The researcher agent gathers sources, ranks them, and assesses depth before generating findings via an auto-instrumented LLM call.
@waxell.tool(tool_type="web_search")
def web_research(query: str, max_sources: int = 5) -> dict:
    """Simulate web research to gather sources on a topic."""
    return {"query": query, "sources_found": len(_RESEARCH_SOURCES)}

@waxell.retrieval(source="crewai-research")
def gather_sources(query: str, sources: list, top_k: int = 3) -> list[dict]:
    """Gather and rank research sources by relevance."""
    sorted_sources = sorted(sources, key=lambda s: s.get("relevance", 0),
                            reverse=True)[:top_k]
    return [{"id": s["id"], "title": s["title"], "relevance": s["relevance"]}
            for s in sorted_sources]

@waxell.reasoning_dec(step="research_depth_assessment")
async def assess_research_depth(query: str, sources: list) -> dict:
    avg_relevance = sum(s.get("relevance", 0) for s in sources) / max(len(sources), 1)
    return {
        "thought": f"Gathered {len(sources)} sources with avg relevance {avg_relevance:.2f}.",
        "evidence": [f"[{s.get('relevance', 0):.2f}] {s.get('title')}" for s in sources],
        "conclusion": "Research is comprehensive enough for writing",
    }
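Stripped of the decorators, the ranking and depth-assessment logic is plain Python. A minimal sketch, with an invented source fixture for illustration (the real `_RESEARCH_SOURCES` is defined elsewhere in the demo):

```python
# Hypothetical fixture mirroring the demo's source shape.
_RESEARCH_SOURCES = [
    {"id": "s1", "title": "AI Act overview", "relevance": 0.91},
    {"id": "s2", "title": "Model cards", "relevance": 0.72},
    {"id": "s3", "title": "Audit logs", "relevance": 0.85},
    {"id": "s4", "title": "Misc notes", "relevance": 0.40},
]

def gather_sources(sources: list, top_k: int = 3) -> list[dict]:
    # Highest relevance first, truncated to top_k -- same as the @retrieval body.
    ranked = sorted(sources, key=lambda s: s.get("relevance", 0), reverse=True)
    return ranked[:top_k]

def assess_depth(sources: list) -> float:
    # max(..., 1) guards against division by zero on an empty result set.
    return sum(s.get("relevance", 0) for s in sources) / max(len(sources), 1)

top = gather_sources(_RESEARCH_SOURCES)
print([s["id"] for s in top], round(assess_depth(top), 2))
# → ['s1', 's3', 's2'] 0.83
```

The `max(len(sources), 1)` denominator is why the assessment never raises on an empty retrieval result; it simply reports an average relevance of 0.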
Writer with @decision, decide(), @reasoning, and score()
The writer agent makes style decisions, generates content, evaluates quality, and attaches scores.
@waxell.decision(name="writing_style", options=["technical", "executive", "narrative"])
async def choose_writing_style(query: str, research_output: str) -> dict:
    if "governance" in query.lower():
        return {"chosen": "executive",
                "reasoning": "Governance topics benefit from executive-style writing"}
    return {"chosen": "narrative", "reasoning": "General topics suit narrative style"}

# Manual decide() for output format
waxell.decide(
    "output_format",
    chosen="structured_article",
    options=["structured_article", "bullet_summary", "executive_brief"],
    reasoning="'executive' style pairs best with structured article format",
    confidence=0.87,
)

# Quality scores
waxell.score("content_quality", 0.90, comment="Based on source coverage")
waxell.score("factual_accuracy", 0.88, comment="Cross-referenced with research")
waxell.score("readability", 0.92, comment="Style matches target audience")
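The decision branch and the recorded scores can be exercised in isolation. A minimal sketch of that logic; the overall-score aggregation is a hypothetical addition for illustration, since the demo only records the three scores individually:

```python
def choose_writing_style(query: str) -> dict:
    # Same branch as the @decision body: governance topics get executive style.
    if "governance" in query.lower():
        return {"chosen": "executive",
                "reasoning": "Governance topics benefit from executive-style writing"}
    return {"chosen": "narrative", "reasoning": "General topics suit narrative style"}

# The three values recorded via waxell.score() in the demo.
scores = {"content_quality": 0.90, "factual_accuracy": 0.88, "readability": 0.92}
# Hypothetical aggregation into one overall figure (not part of the demo).
overall = sum(scores.values()) / len(scores)

print(choose_writing_style("AI governance frameworks")["chosen"], round(overall, 2))
# → executive 0.9
```

Note the branch is keyed on a simple substring match, so any query mentioning "governance" (in any case) selects the executive style; everything else falls through to narrative.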
What this demonstrates
- @waxell.observe -- parent-child crew hierarchy with automatic lineage
- @waxell.step_dec -- crew kickoff recorded as an execution step
- @waxell.tool -- web research operation recorded with tool_type="web_search"
- @waxell.retrieval -- source gathering recorded with source="crewai-research"
- @waxell.reasoning_dec -- research depth and article quality assessments
- @waxell.decision -- task execution order and writing style selection
- waxell.decide() -- manual output format decision with options and confidence
- waxell.score() -- content quality, factual accuracy, and readability scores
- waxell.tag() / waxell.metadata() -- crew name, agent roles, and process metadata
- Auto-instrumented LLM calls -- two OpenAI gpt-4o-mini calls captured automatically
- Sequential crew process -- writer depends on researcher output, mimicking CrewAI's sequential mode
Run it
# Dry-run (no API keys needed)
cd dev/waxell-dev
python -m app.demos.crewai_agent --dry-run
# Live (real OpenAI)
export OPENAI_API_KEY="sk-..."
python -m app.demos.crewai_agent
# Custom query
python -m app.demos.crewai_agent --query "Write about LLM cost optimization"