# Best Practices

This guide covers recommended patterns and practices for building production-ready Waxell agents.
## Agent Design

### Keep Agents Focused

Each agent should have a single, clear responsibility:

```python
# Good: Single responsibility
@agent(name="email-classifier")
class EmailClassifier:
    """Classifies incoming emails."""
    pass

@agent(name="email-responder")
class EmailResponder:
    """Generates email responses."""
    pass

# Avoid: Multiple responsibilities
@agent(name="email-handler")
class EmailHandler:
    """Classifies, responds, and archives emails."""  # Too broad
    pass
```
### Use Capabilities for Reuse

Extract common patterns into capabilities:

```python
@capability(name="classification")
class ClassificationCapability:
    @decision
    def classify(self, ctx, text: str, categories: list[str]):
        return ctx.llm.classify(text, categories=categories)

@agent(
    name="support-agent",
    capabilities=[ClassificationCapability]
)
class SupportAgent:
    pass
```
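Outside the Waxell decorators, the reuse pattern behind capabilities can be pictured as plain mixin composition. The sketch below is illustrative only: `ClassificationMixin` and its keyword matcher are hypothetical stand-ins for a capability backed by an LLM call.

```python
class ClassificationMixin:
    """Reusable behavior shared across agents, analogous to a capability."""

    def classify(self, text: str, categories: list[str]) -> str:
        # Hypothetical stand-in for an LLM call: return the first category
        # mentioned in the text, defaulting to the last category.
        lowered = text.lower()
        for category in categories:
            if category in lowered:
                return category
        return categories[-1]

class SupportAgent(ClassificationMixin):
    """Gains classify() by composition rather than reimplementing it."""
    pass

agent = SupportAgent()
agent.classify("question about billing", ["billing", "technical", "general"])
```

The design benefit is the same in either form: the shared logic lives in one place, and each agent opts into it without duplicating code.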
## Workflow Design

### Prefer Small Steps

Break workflows into small, checkpointable steps:

```python
@workflow
def process_order(self, ctx):
    # Good: Each step is independently checkpointable
    validated = ctx.call(self.validate)
    enriched = ctx.call(self.enrich, data=validated)
    result = ctx.call(self.process, data=enriched)
    return result
```
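The payoff of small steps is that a failed run can resume from the last completed step instead of starting over. A framework-agnostic sketch of that idea (the `checkpoints` dict and step names are illustrative, not Waxell API):

```python
def run_with_checkpoints(steps, checkpoints):
    """Run named steps in order, reusing any result already checkpointed."""
    result = None
    for name, step in steps:
        if name in checkpoints:
            result = checkpoints[name]  # resume: skip work already done
            continue
        result = step(result)
        checkpoints[name] = result  # persist after each successful step
    return result

steps = [
    ("validate", lambda _: {"valid": True}),
    ("enrich", lambda data: {**data, "enriched": True}),
    ("process", lambda data: {**data, "processed": True}),
]
checkpoints = {}  # in production this would be durable storage
final = run_with_checkpoints(steps, checkpoints)
```

If the process crashes after `enrich`, rerunning with the same `checkpoints` dict re-executes only `process`.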
### Handle Errors Gracefully

Catch specific error types at each step and route each to an appropriate recovery path:

```python
@workflow
def resilient_workflow(self, ctx):
    try:
        result = ctx.call(self.risky_operation)
    except ValidationError:
        return ctx.call(self.handle_validation_error)
    except ExternalServiceError:
        return ctx.call(self.retry_with_backoff)
    return result
```
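A `retry_with_backoff` recovery step typically combines exponential delays with jitter so concurrent retries spread out. A plain-Python sketch (the function name and defaults are illustrative, not part of the Waxell API):

```python
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Call `operation`, retrying failures with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter spreads retries out
```

In practice you would catch only the transient error types (timeouts, rate limits) rather than bare `Exception`.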
## Decision Design

### Provide Clear Prompts

Be explicit about what you want from the LLM:

```python
@decision
def classify_intent(self, ctx):
    # Good: Clear, specific prompt
    return ctx.llm.classify(
        text=ctx.input.message,
        categories=["billing", "technical", "general"],
        instructions="Classify based on the primary topic discussed"
    )
```
### Use Structured Output

Prefer structured output for reliable parsing:

```python
from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    confidence: float
    topics: list[str]

@decision
def analyze(self, ctx):
    return ctx.llm.generate(
        prompt=f"Analyze: {ctx.input.text}",
        response_model=Analysis  # Structured output
    )
```
## Governance

### Set Appropriate Rate Limits

Protect your resources with rate limits:

```python
@agent(
    name="email-sender",
    rate_limit={
        "requests_per_minute": 10,
        "tokens_per_minute": 50000
    }
)
class EmailSender:
    pass
```
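The semantics of a `requests_per_minute` limit can be pictured as a token bucket: requests spend tokens, and tokens refill at the configured rate. This is a general illustration of the concept, not how Waxell implements it internally:

```python
import time

class TokenBucket:
    """Allow `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last check.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# 10 requests/minute = 10/60 tokens per second, bursting up to 10 at once
bucket = TokenBucket(rate=10 / 60, capacity=10)
```

An 11th request in quick succession is denied until enough time passes for the bucket to refill.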
### Require Approval for Sensitive Operations

Add human oversight for risky actions:

```python
@tool(requires_approval=True)
def delete_account(self, ctx, user_id: str):
    """Delete user account - requires human approval."""
    pass
```
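Conceptually, an approval gate is a wrapper that consults a human-in-the-loop callback before running the operation. The sketch below is a plain-Python illustration with a hypothetical `approve` callback, not the Waxell mechanism:

```python
import functools

class ApprovalDenied(Exception):
    """Raised when a reviewer rejects a sensitive operation."""

def requires_human_approval(approve):
    """Gate a function behind a callback: approve(name, kwargs) -> bool."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(**kwargs):
            if not approve(fn.__name__, kwargs):
                raise ApprovalDenied(f"{fn.__name__} was not approved")
            return fn(**kwargs)
        return wrapper
    return decorator

# Illustrative approver: only proceeds when the call is explicitly confirmed.
@requires_human_approval(approve=lambda name, kwargs: kwargs.get("confirmed", False))
def delete_account(user_id: str, confirmed: bool = False):
    return f"deleted {user_id}"
```

The key property is that the sensitive function cannot run at all unless the approval check passes first.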
## Testing

### Test at Multiple Levels

- Unit tests: Test individual decisions and tools
- Integration tests: Test workflows end-to-end
- Simulation tests: Test with mock LLM responses

```python
from waxell_infra.testing import AgentTestHarness

def test_classification_workflow():
    harness = AgentTestHarness(ClassifierAgent)
    result = harness.run(
        workflow="classify",
        input={"message": "I can't log in"},
        mock_llm_responses={"classify": "technical"}
    )
    assert result["category"] == "technical"
```
## Monitoring

### Enable Observability

Use structured logging and tracing:

```python
@workflow
def monitored_workflow(self, ctx):
    ctx.log.info("Starting workflow", input_size=len(ctx.input))
    result = ctx.call(self.process)
    ctx.log.info("Workflow complete", result_status=result["status"])
    return result
```
### Set Up Alerts

Monitor key metrics:

- Execution success rate
- Latency percentiles
- Token consumption
- Error rates by type
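As a rough illustration of how such alert conditions can be evaluated from raw samples (the thresholds and the nearest-rank percentile method are illustrative choices, not prescribed by Waxell):

```python
def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

def check_alerts(latencies_ms, error_count, total_count,
                 p95_threshold_ms=2000, error_rate_threshold=0.05):
    """Return the names of alert conditions that currently fire."""
    alerts = []
    if percentile(latencies_ms, 95) > p95_threshold_ms:
        alerts.append("latency_p95")
    if total_count and error_count / total_count > error_rate_threshold:
        alerts.append("error_rate")
    return alerts
```

In production these checks belong in your monitoring stack rather than application code; the point is that each alert reduces to a simple predicate over aggregated metrics.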
## Next Steps

- Production Guide - Deploy with these practices
- Governance Tutorial - Implement governance