# CrewAI vs Waxell
This page compares three approaches to building the same agent: CrewAI alone, CrewAI enhanced with Waxell Observe, and a fully native Waxell implementation. The use case is a research agent that searches for information and produces a summary report.
## A) CrewAI Alone
A standard CrewAI setup with an Agent, Task, and Crew. CrewAI provides a high-level abstraction for multi-agent orchestration, but leaves observability, cost tracking, and governance to you.
```python
from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool

# Define tools
search_tool = SerperDevTool()

# Define the research agent
researcher = Agent(
    role="Research Analyst",
    goal="Find comprehensive information about the given topic",
    backstory="You are an expert research analyst skilled at finding and synthesizing information.",
    tools=[search_tool],
    llm="gpt-4o",
    verbose=True,
)

# Define the summarizer agent
summarizer = Agent(
    role="Report Writer",
    goal="Produce a clear, well-structured summary report",
    backstory="You are a skilled technical writer who distills complex research into actionable summaries.",
    llm="gpt-4o",
    verbose=True,
)

# Define tasks
research_task = Task(
    description="Research the topic: {topic}. Find key facts, recent developments, and expert opinions.",
    expected_output="A detailed research brief with sources",
    agent=researcher,
)
summary_task = Task(
    description="Using the research brief, write a concise executive summary with key findings and recommendations.",
    expected_output="A structured executive summary in markdown",
    agent=summarizer,
)

# Build and run the crew
crew = Crew(
    agents=[researcher, summarizer],
    tasks=[research_task, summary_task],
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "AI governance trends 2025"})
print(result)
```
What is missing:
- No centralized tracking of LLM calls across agents
- No cost visibility (how much did this crew run cost?)
- No policy enforcement (no budget limits, no content filtering)
- No audit trail beyond verbose console output
- No durability (a crash means re-running the entire crew from scratch)
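To make the cost gap concrete: with plain CrewAI, cost visibility is glue code you write yourself. A minimal sketch of that glue follows; the per-million-token prices are illustrative placeholders (assumptions, not official rates), and `estimate_cost` is a hypothetical helper, not part of CrewAI.

```python
# Hand-rolled cost estimation -- the kind of bookkeeping plain CrewAI
# leaves to you. Prices are illustrative placeholders in USD per 1M tokens.
PRICES = {"gpt-4o": {"in": 2.50, "out": 10.00}}

def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate the dollar cost of a run from its token counts."""
    p = PRICES[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000
```

You would still have to thread this through every agent and every run yourself, which is exactly the bookkeeping the next two approaches absorb.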
## B) CrewAI + Waxell Observe
The same CrewAI workflow wrapped with WaxellContext for full observability. Your CrewAI code stays the same -- you just wrap the execution in a context manager.
```python
from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool
from waxell_observe import WaxellContext

search_tool = SerperDevTool()

researcher = Agent(
    role="Research Analyst",
    goal="Find comprehensive information about the given topic",
    backstory="You are an expert research analyst skilled at finding and synthesizing information.",
    tools=[search_tool],
    llm="gpt-4o",
    verbose=True,
)
summarizer = Agent(
    role="Report Writer",
    goal="Produce a clear, well-structured summary report",
    backstory="You are a skilled technical writer who distills complex research into actionable summaries.",
    llm="gpt-4o",
    verbose=True,
)
research_task = Task(
    description="Research the topic: {topic}. Find key facts, recent developments, and expert opinions.",
    expected_output="A detailed research brief with sources",
    agent=researcher,
)
summary_task = Task(
    description="Using the research brief, write a concise executive summary with key findings and recommendations.",
    expected_output="A structured executive summary in markdown",
    agent=summarizer,
)
crew = Crew(
    agents=[researcher, summarizer],
    tasks=[research_task, summary_task],
    verbose=True,
)

# Wrap execution with Waxell Observe
async with WaxellContext(agent_name="research-crew") as ctx:
    result = crew.kickoff(inputs={"topic": "AI governance trends 2025"})

    # Record LLM calls (CrewAI exposes token usage in its output)
    if hasattr(result, "token_usage"):
        ctx.record_llm_call(
            model="gpt-4o",
            tokens_in=result.token_usage.get("prompt_tokens", 0),
            tokens_out=result.token_usage.get("completion_tokens", 0),
            task="research_crew",
        )

    # Record execution steps
    ctx.record_step("research", output={"status": "complete"})
    ctx.record_step("summarize", output={"status": "complete"})
    ctx.set_result({"summary": str(result)})
```
You can also use the `@waxell_agent` decorator for simpler CrewAI setups:
```python
from waxell_observe import waxell_agent

@waxell_agent(agent_name="research-crew")
async def run_research(topic: str, waxell_ctx=None) -> str:
    crew = Crew(agents=[researcher, summarizer], tasks=[research_task, summary_task])
    result = crew.kickoff(inputs={"topic": topic})
    if waxell_ctx and hasattr(result, "token_usage"):
        waxell_ctx.record_llm_call(
            model="gpt-4o",
            tokens_in=result.token_usage.get("prompt_tokens", 0),
            tokens_out=result.token_usage.get("completion_tokens", 0),
        )
    return str(result)
```
What you now get -- with minimal changes to your CrewAI code:
- Every crew execution tracked as a run with inputs, outputs, and status
- LLM call recording with cost estimates
- Step-by-step execution trail for debugging
- Pre-execution policy checks (budget enforcement, rate limiting)
- Full audit trail visible in the Waxell dashboard
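One practical caveat when recording token usage: the shape of CrewAI's `token_usage` has varied across versions (a plain dict in some, an object with attributes in others), so `.get(...)` alone can break. A defensive helper like the following sketch tolerates both shapes; the helper name and fallback-to-zero behavior are assumptions for illustration, not part of either library.

```python
def extract_token_usage(result) -> tuple[int, int]:
    """Return (prompt_tokens, completion_tokens) from a CrewAI result,
    tolerating both dict-shaped and attribute-shaped usage objects."""
    usage = getattr(result, "token_usage", None)
    if usage is None:
        return 0, 0
    if isinstance(usage, dict):
        return usage.get("prompt_tokens", 0), usage.get("completion_tokens", 0)
    # Object-shaped usage (e.g. a metrics model with attributes)
    return getattr(usage, "prompt_tokens", 0), getattr(usage, "completion_tokens", 0)
```

Feeding its output into `ctx.record_llm_call(...)` keeps the recording code working across CrewAI upgrades.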
## C) Waxell Native
The same research-and-summarize use case built natively with the Waxell SDK. The workflow, tools, and LLM calls are all governed and tracked by default.
```python
from waxell_sdk import agent, workflow, tool, WorkflowContext

@agent(
    name="research-agent",
    description="Researches topics and produces executive summaries",
    signals=["research_request"],
    domains=["search"],
)
class ResearchAgent:
    @tool
    async def web_search(self, ctx: WorkflowContext, query: str) -> dict:
        """Search the web for information on a topic."""
        return await ctx.domain("search", "web_search", query=query)

    @workflow("research_and_summarize")
    async def research_and_summarize(self, ctx: WorkflowContext, topic: str) -> dict:
        """Research a topic and produce an executive summary."""
        # Step 1: Search for information
        search_results = await ctx.tool("web_search", query=topic)
        ctx.log_step("research_complete", {"results_count": len(search_results)})

        # Step 2: Synthesize research into a brief
        research_brief = await ctx.llm.generate(
            prompt=(
                f"Synthesize these search results into a research brief:\n"
                f"Topic: {topic}\n"
                f"Results: {search_results}"
            ),
            output_format="text",
            task="research_synthesis",
        )

        # Step 3: Generate executive summary
        summary = await ctx.llm.generate(
            prompt=(
                f"Write a concise executive summary with key findings "
                f"and recommendations based on this research brief:\n\n"
                f"{research_brief}"
            ),
            output_format="json",
            task="executive_summary",
        )
        return summary
```
What you gain with native Waxell:
- Single-agent simplicity: No need to define separate "agents" for research and summarization -- workflows handle orchestration
- Durable execution: Each step is a checkpoint. If the process crashes after the search completes, it resumes at the synthesis step
- LLM routing: `ctx.llm.generate()` automatically selects models based on task type, applies rate limiting, and handles fallbacks
- Domain abstraction: `ctx.domain("search", "web_search", ...)` routes through a governed endpoint, not a hardcoded API call
- Zero instrumentation: Every LLM call, tool invocation, and step is tracked in the control plane automatically
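To make "each step is a checkpoint" concrete, here is what checkpoint/resume means mechanically, as a plain-Python sketch. This illustrates the idea only, not Waxell's internals; the file-based state store and the `run_with_checkpoints` helper are assumptions for illustration.

```python
import json
import os

def run_with_checkpoints(steps, state_file="run_state.json"):
    """Run named steps in order, persisting each result so that a
    restart after a crash skips steps that already completed.

    steps: list of (name, fn) pairs, where fn(state) returns the
    step's JSON-serializable output.
    """
    state = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            state = json.load(f)  # resume from the last checkpoint
    for name, fn in steps:
        if name in state:
            continue  # completed before the crash; do not re-run
        state[name] = fn(state)
        with open(state_file, "w") as f:
            json.dump(state, f)  # checkpoint after each step
    return state
```

In the research workflow above, this is why a crash after the search step resumes at synthesis instead of re-paying for the search.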
## Comparison Table
| Capability | CrewAI | CrewAI + Observe | Waxell Native |
|---|---|---|---|
| Agent definition | Agent/Task/Crew classes | Unchanged | Declarative (@agent, @workflow, @tool) |
| Multi-agent orchestration | Built-in (Crew) | Unchanged | Workflows with steps |
| Observability | Verbose console output | Run tracking via WaxellContext | Built-in, zero instrumentation |
| LLM cost tracking | Not included | Manual recording with auto-estimation | Built-in with tenant-level overrides |
| Policy enforcement | Not included | Pre-execution checks | Full lifecycle governance |
| Budget limits | Not included | Supported via policies | Built-in with tenant/agent scoping |
| Durable workflows | Not included | Not included | Checkpoint/resume with WorkflowEnvelope |
| Approval workflows | Not included | Not included | Built-in with pause/resume |
| Multi-tenancy | Not included | Tenant-scoped via control plane | Native tenant isolation |
| Audit logging | Not included | Run-level audit trail | Full execution trace with agent_trace |
## Which Approach Should You Choose?
If you already have CrewAI agents running, start with CrewAI + Observe. Wrap your `crew.kickoff()` calls with `WaxellContext` and you immediately get cost tracking, policy enforcement, and audit trails. You can migrate to native Waxell later, when you need durable workflows.
See the Progressive Migration guide for a phased approach to adopting Waxell.