CrewAI vs Waxell

This page compares three approaches to building the same agent: CrewAI alone, CrewAI enhanced with Waxell Observe, and a fully native Waxell implementation. The use case is a research agent that searches for information and produces a summary report.


A) CrewAI Alone

A standard CrewAI setup with an Agent, Task, and Crew. CrewAI provides a high-level abstraction for multi-agent orchestration, but leaves observability, cost tracking, and governance to you.

from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool

# Define tools
search_tool = SerperDevTool()

# Define the research agent
researcher = Agent(
    role="Research Analyst",
    goal="Find comprehensive information about the given topic",
    backstory="You are an expert research analyst skilled at finding and synthesizing information.",
    tools=[search_tool],
    llm="gpt-4o",
    verbose=True,
)

# Define the summarizer agent
summarizer = Agent(
    role="Report Writer",
    goal="Produce a clear, well-structured summary report",
    backstory="You are a skilled technical writer who distills complex research into actionable summaries.",
    llm="gpt-4o",
    verbose=True,
)

# Define tasks
research_task = Task(
    description="Research the topic: {topic}. Find key facts, recent developments, and expert opinions.",
    expected_output="A detailed research brief with sources",
    agent=researcher,
)

summary_task = Task(
    description="Using the research brief, write a concise executive summary with key findings and recommendations.",
    expected_output="A structured executive summary in markdown",
    agent=summarizer,
)

# Build and run the crew
crew = Crew(
    agents=[researcher, summarizer],
    tasks=[research_task, summary_task],
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "AI governance trends 2025"})
print(result)

What is missing:

  • No centralized tracking of LLM calls across agents
  • No cost visibility (how much did this crew run cost?)
  • No policy enforcement (no budget limits, no content filtering)
  • No audit trail beyond verbose console output
  • No durability (a crash means re-running the entire crew from scratch)
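To make the cost-visibility gap concrete: without a control plane, you end up stitching cost accounting together yourself. A minimal sketch in plain Python of that manual bookkeeping (the per-1K-token prices are illustrative placeholders, not actual OpenAI rates):

```python
# A minimal sketch of the manual cost accounting you would otherwise
# have to write yourself. The per-1K-token prices are illustrative
# placeholders, not real gpt-4o pricing.
PRICES_PER_1K = {"gpt-4o": {"in": 0.0025, "out": 0.01}}

def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate a run's cost from prompt/completion token counts."""
    p = PRICES_PER_1K[model]
    return (tokens_in / 1000) * p["in"] + (tokens_out / 1000) * p["out"]

# Example: a crew run that used 12,000 prompt and 3,000 completion tokens
cost = estimate_cost("gpt-4o", 12_000, 3_000)
print(f"${cost:.4f}")  # -> $0.0600
```

This is exactly the kind of bookkeeping the Observe integration in the next section records for you.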

B) CrewAI + Waxell Observe

The same CrewAI workflow wrapped with WaxellContext for full observability. Your CrewAI code stays the same -- you just wrap the execution in a context manager.

import asyncio

from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool
from waxell_observe import WaxellContext

search_tool = SerperDevTool()

researcher = Agent(
    role="Research Analyst",
    goal="Find comprehensive information about the given topic",
    backstory="You are an expert research analyst skilled at finding and synthesizing information.",
    tools=[search_tool],
    llm="gpt-4o",
    verbose=True,
)

summarizer = Agent(
    role="Report Writer",
    goal="Produce a clear, well-structured summary report",
    backstory="You are a skilled technical writer who distills complex research into actionable summaries.",
    llm="gpt-4o",
    verbose=True,
)

research_task = Task(
    description="Research the topic: {topic}. Find key facts, recent developments, and expert opinions.",
    expected_output="A detailed research brief with sources",
    agent=researcher,
)

summary_task = Task(
    description="Using the research brief, write a concise executive summary with key findings and recommendations.",
    expected_output="A structured executive summary in markdown",
    agent=summarizer,
)

crew = Crew(
    agents=[researcher, summarizer],
    tasks=[research_task, summary_task],
    verbose=True,
)

# Wrap execution with Waxell Observe. An async context manager must run
# inside a coroutine, so drive it with asyncio.run().
async def main():
    async with WaxellContext(agent_name="research-crew") as ctx:
        result = crew.kickoff(inputs={"topic": "AI governance trends 2025"})

        # Record LLM calls (CrewAI exposes token usage in its output)
        if hasattr(result, "token_usage"):
            ctx.record_llm_call(
                model="gpt-4o",
                tokens_in=result.token_usage.get("prompt_tokens", 0),
                tokens_out=result.token_usage.get("completion_tokens", 0),
                task="research_crew",
            )

        # Record execution steps
        ctx.record_step("research", output={"status": "complete"})
        ctx.record_step("summarize", output={"status": "complete"})

        ctx.set_result({"summary": str(result)})

asyncio.run(main())

Decorator alternative

You can also use the @waxell_agent decorator for simpler CrewAI setups:

from waxell_observe import waxell_agent

@waxell_agent(agent_name="research-crew")
async def run_research(topic: str, waxell_ctx=None) -> str:
    crew = Crew(agents=[researcher, summarizer], tasks=[research_task, summary_task])
    result = crew.kickoff(inputs={"topic": topic})

    if waxell_ctx and hasattr(result, "token_usage"):
        waxell_ctx.record_llm_call(
            model="gpt-4o",
            tokens_in=result.token_usage.get("prompt_tokens", 0),
            tokens_out=result.token_usage.get("completion_tokens", 0),
        )

    return str(result)

What you now get -- with minimal changes to your CrewAI code:

  • Every crew execution tracked as a run with inputs, outputs, and status
  • LLM call recording with cost estimates
  • Step-by-step execution trail for debugging
  • Pre-execution policy checks (budget enforcement, rate limiting)
  • Full audit trail visible in the Waxell dashboard
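The pre-execution policy check is easiest to picture as a budget gate evaluated before the run starts. A hypothetical sketch in plain Python — not the actual Waxell policy API, where evaluation happens server-side in the control plane — showing the kind of decision being made:

```python
from dataclasses import dataclass

# Hypothetical sketch of a pre-execution budget check. In Waxell the
# real policy evaluation happens in the control plane, not client-side;
# this only illustrates the decision logic.
@dataclass
class BudgetPolicy:
    limit_usd: float        # maximum spend allowed for this agent
    spent_usd: float = 0.0  # spend recorded so far

    def allow_run(self, estimated_cost_usd: float) -> bool:
        """Deny the run if it would push spend past the budget."""
        return self.spent_usd + estimated_cost_usd <= self.limit_usd

policy = BudgetPolicy(limit_usd=10.0, spent_usd=9.50)
print(policy.allow_run(0.40))  # -> True  (9.90 <= 10.00)
print(policy.allow_run(0.75))  # -> False (10.25 > 10.00)
```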

C) Waxell Native

The same research-and-summarize use case built natively with the Waxell SDK. The workflow, tools, and LLM calls are all governed and tracked by default.

from waxell_sdk import agent, workflow, tool, WorkflowContext

@agent(
    name="research-agent",
    description="Researches topics and produces executive summaries",
    signals=["research_request"],
    domains=["search"],
)
class ResearchAgent:

    @tool
    async def web_search(self, ctx: WorkflowContext, query: str) -> dict:
        """Search the web for information on a topic."""
        return await ctx.domain("search", "web_search", query=query)

    @workflow("research_and_summarize")
    async def research_and_summarize(self, ctx: WorkflowContext, topic: str) -> dict:
        """Research a topic and produce an executive summary."""

        # Step 1: Search for information
        search_results = await ctx.tool("web_search", query=topic)
        ctx.log_step("research_complete", {"results_count": len(search_results)})

        # Step 2: Synthesize research into a brief
        research_brief = await ctx.llm.generate(
            prompt=(
                f"Synthesize these search results into a research brief:\n"
                f"Topic: {topic}\n"
                f"Results: {search_results}"
            ),
            output_format="text",
            task="research_synthesis",
        )

        # Step 3: Generate executive summary
        summary = await ctx.llm.generate(
            prompt=(
                f"Write a concise executive summary with key findings "
                f"and recommendations based on this research brief:\n\n"
                f"{research_brief}"
            ),
            output_format="json",
            task="executive_summary",
        )

        return summary

What you gain with native Waxell:

  • Single-agent simplicity: No need to define separate "agents" for research and summarization -- workflows handle orchestration
  • Durable execution: Each step is a checkpoint. If the process crashes after the search completes, it resumes at the synthesis step
  • LLM routing: ctx.llm.generate() automatically selects models based on task type, applies rate limiting, and handles fallbacks
  • Domain abstraction: ctx.domain("search", "web_search", ...) routes through a governed endpoint, not a hardcoded API call
  • Zero instrumentation: Every LLM call, tool invocation, and step is tracked in the control plane automatically
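Durable execution is easiest to picture as a store of completed step outputs keyed by step name: on restart, the workflow skips straight to the first unfinished step. A conceptual sketch in plain Python (an in-memory dict stands in for Waxell's persistent checkpoint storage):

```python
# Conceptual sketch of checkpoint/resume: each completed step's output
# is persisted, so a restarted workflow re-runs nothing it already
# finished. An in-memory dict stands in for durable storage.
checkpoints: dict[str, object] = {}

def run_step(name: str, fn):
    """Run a step once; on re-execution, return the saved output."""
    if name in checkpoints:
        return checkpoints[name]  # resume: skip already-completed work
    result = fn()
    checkpoints[name] = result    # checkpoint before moving on
    return result

calls = []

def search():
    calls.append("search")
    return ["result-1", "result-2"]

def summarize():
    calls.append("summarize")
    return "summary"

run_step("search", search)
run_step("summarize", summarize)

# Simulate a crash-and-restart: the second pass hits the checkpoints
# and executes neither step again.
run_step("search", search)
run_step("summarize", summarize)
print(calls)  # -> ['search', 'summarize']
```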

Comparison Table

| Capability | CrewAI | CrewAI + Observe | Waxell Native |
|---|---|---|---|
| Agent definition | Agent/Task/Crew classes | Unchanged | Declarative (@agent, @workflow, @tool) |
| Multi-agent orchestration | Built-in (Crew) | Unchanged | Workflows with steps |
| Observability | Verbose console output | Run tracking via WaxellContext | Built-in, zero instrumentation |
| LLM cost tracking | Not included | Manual recording with auto-estimation | Built-in with tenant-level overrides |
| Policy enforcement | Not included | Pre-execution checks | Full lifecycle governance |
| Budget limits | Not included | Supported via policies | Built-in with tenant/agent scoping |
| Durable workflows | Not included | Not included | Checkpoint/resume with WorkflowEnvelope |
| Approval workflows | Not included | Not included | Built-in with pause/resume |
| Multi-tenancy | Not included | Tenant-scoped via control plane | Native tenant isolation |
| Audit logging | Not included | Run-level audit trail | Full execution trace with agent_trace |

Which Approach Should You Choose?

Start with Observe

If you already have CrewAI agents running, start with CrewAI + Observe. Wrap your crew.kickoff() calls with WaxellContext and you immediately get cost tracking, policy enforcement, and audit trails. Migration to native Waxell can happen later when you need durable workflows.

See the Progressive Migration guide for a phased approach to adopting Waxell.