CrewAI Agent
A CrewAI-style multi-agent crew execution pipeline with a parent orchestrator coordinating two child agents -- a researcher and a writer -- using decorator-based observability. The crew follows a sequential process: the researcher gathers sources and assesses depth, then the writer transforms the findings into a polished article with style decisions and quality scoring.
This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys.
Architecture
Key Code
Orchestrator with crew kickoff and task ordering
The parent agent coordinates the sequential crew execution. Each child agent auto-links via WaxellContext lineage.
@waxell.observe(agent_name="crewai-orchestrator", workflow_name="crew-execution")
async def run_pipeline(query: str, dry_run: bool = False, waxell_ctx=None):
    waxell.tag("demo", "crewai")
    waxell.tag("crew", "governance-research-crew")
    waxell.metadata("num_agents", 2)
    waxell.metadata("process_type", "sequential")

    # @step -- crew kickoff
    kickoff = await crew_kickoff(crew_config=_CREW_CONFIG, query=query)

    # @decision -- assign execution order
    order = await assign_task_order(crew_config=_CREW_CONFIG, query=query)

    # Child agents execute sequentially (writer depends on researcher)
    research_result = await run_researcher(query=query, openai_client=openai_client)
    writer_result = await run_writer(
        query=query,
        research_output=research_result["output"],
        openai_client=openai_client,
    )
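The sequential dependency above can be sketched without the observability layer. This is a minimal sketch, assuming stand-in async stubs for the child agents (all names and return shapes here are hypothetical, mirroring the demo's structure):

```python
import asyncio

# Hypothetical stand-ins for the real child agents; in the demo these wrap
# OpenAI calls, here they just transform the input deterministically.
async def run_researcher(query: str) -> dict:
    return {"output": f"findings for: {query}"}

async def run_writer(query: str, research_output: str) -> dict:
    # The writer consumes the researcher's output -- this await ordering
    # is what makes the crew process "sequential".
    return {"article": f"Article on {query}, based on [{research_output}]"}

async def run_pipeline(query: str) -> dict:
    research = await run_researcher(query)
    return await run_writer(query, research_output=research["output"])

result = asyncio.run(run_pipeline("AI governance"))
print(result["article"])
```

Because the second `await` only starts after the first resolves, the writer always sees completed researcher output, which is the property the crew's sequential mode guarantees.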
Researcher with @tool, @retrieval, and @reasoning
The researcher agent gathers sources, ranks them, and assesses depth before generating findings via an auto-instrumented LLM call.
@waxell.tool(tool_type="web_search")
def web_research(query: str, max_sources: int = 5) -> dict:
    """Simulate web research to gather sources on a topic."""
    return {"query": query, "sources_found": len(_RESEARCH_SOURCES)}

@waxell.retrieval(source="crewai-research")
def gather_sources(query: str, sources: list, top_k: int = 3) -> list[dict]:
    """Gather and rank research sources by relevance."""
    sorted_sources = sorted(sources, key=lambda s: s.get("relevance", 0),
                            reverse=True)[:top_k]
    return [{"id": s["id"], "title": s["title"], "relevance": s["relevance"]}
            for s in sorted_sources]

@waxell.reasoning_dec(step="research_depth_assessment")
async def assess_research_depth(query: str, sources: list) -> dict:
    avg_relevance = sum(s.get("relevance", 0) for s in sources) / max(len(sources), 1)
    return {
        "thought": f"Gathered {len(sources)} sources with avg relevance {avg_relevance:.2f}.",
        "evidence": [f"[{s.get('relevance', 0):.2f}] {s.get('title')}" for s in sources],
        "conclusion": "Research is comprehensive enough for writing",
    }
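Stripped of the decorators, the ranking and depth-assessment logic is plain Python. A minimal sketch, with an invented source fixture for illustration (the real `_RESEARCH_SOURCES` is defined elsewhere in the demo):

```python
# Hypothetical fixture mirroring the demo's source shape.
_RESEARCH_SOURCES = [
    {"id": "s1", "title": "AI Act overview", "relevance": 0.91},
    {"id": "s2", "title": "Model cards", "relevance": 0.72},
    {"id": "s3", "title": "Audit logs", "relevance": 0.85},
    {"id": "s4", "title": "Misc notes", "relevance": 0.40},
]

def gather_sources(sources: list, top_k: int = 3) -> list[dict]:
    # Highest relevance first, truncated to top_k -- same as the @retrieval body.
    ranked = sorted(sources, key=lambda s: s.get("relevance", 0), reverse=True)
    return ranked[:top_k]

def assess_depth(sources: list) -> float:
    # max(..., 1) guards against division by zero on an empty result set.
    return sum(s.get("relevance", 0) for s in sources) / max(len(sources), 1)

top = gather_sources(_RESEARCH_SOURCES)
print([s["id"] for s in top], round(assess_depth(top), 2))
# → ['s1', 's3', 's2'] 0.83
```

The `max(len(sources), 1)` denominator is why the assessment never raises on an empty retrieval result; it simply reports an average relevance of 0.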
Writer with @decision, decide(), @reasoning, and score()
The writer agent makes style decisions, generates content, evaluates quality, and attaches scores.
@waxell.decision(name="writing_style", options=["technical", "executive", "narrative"])
async def choose_writing_style(query: str, research_output: str) -> dict:
    if "governance" in query.lower():
        return {"chosen": "executive",
                "reasoning": "Governance topics benefit from executive-style writing"}
    return {"chosen": "narrative", "reasoning": "General topics suit narrative style"}

# Manual decide() for output format
waxell.decide(
    "output_format",
    chosen="structured_article",
    options=["structured_article", "bullet_summary", "executive_brief"],
    reasoning="'executive' style pairs best with structured article format",
    confidence=0.87,
)

# Quality scores
waxell.score("content_quality", 0.90, comment="Based on source coverage")
waxell.score("factual_accuracy", 0.88, comment="Cross-referenced with research")
waxell.score("readability", 0.92, comment="Style matches target audience")
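The decision branch and the recorded scores can be exercised in isolation. A minimal sketch of that logic; the overall-score aggregation is a hypothetical addition for illustration, since the demo only records the three scores individually:

```python
def choose_writing_style(query: str) -> dict:
    # Same branch as the @decision body: governance topics get executive style.
    if "governance" in query.lower():
        return {"chosen": "executive",
                "reasoning": "Governance topics benefit from executive-style writing"}
    return {"chosen": "narrative", "reasoning": "General topics suit narrative style"}

# The three values recorded via waxell.score() in the demo.
scores = {"content_quality": 0.90, "factual_accuracy": 0.88, "readability": 0.92}
# Hypothetical aggregation into one overall figure (not part of the demo).
overall = sum(scores.values()) / len(scores)

print(choose_writing_style("AI governance frameworks")["chosen"], round(overall, 2))
# → executive 0.9
```

Note the branch is keyed on a simple substring match, so any query mentioning "governance" (in any case) selects the executive style; everything else falls through to narrative.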
What this demonstrates
- @waxell.observe -- parent-child crew hierarchy with automatic lineage
- @waxell.step_dec -- crew kickoff recorded as an execution step
- @waxell.tool -- web research operation recorded with tool_type="web_search"
- @waxell.retrieval -- source gathering recorded with source="crewai-research"
- @waxell.reasoning_dec -- research depth and article quality assessments
- @waxell.decision -- task execution order and writing style selection
- waxell.decide() -- manual output format decision with options and confidence
- waxell.score() -- content quality, factual accuracy, and readability scores
- waxell.tag() / waxell.metadata() -- crew name, agent roles, and process metadata
- Auto-instrumented LLM calls -- two OpenAI gpt-4o-mini calls captured automatically
- Sequential crew process -- writer depends on researcher output, mimicking CrewAI's sequential mode
Run it
# Dry-run (no API keys needed)
cd dev/waxell-dev
python -m app.demos.crewai_agent --dry-run
# Live (real OpenAI)
export OPENAI_API_KEY="sk-..."
python -m app.demos.crewai_agent
# Custom query
python -m app.demos.crewai_agent --query "Write about LLM cost optimization"