Agent Examples
Browse 108 demo agents that illustrate waxell-observe SDK patterns across LLM providers, vector databases, agent frameworks, and specialized pipelines. Each agent runs in dry-run mode by default and produces a complete observability trace.
LLM Providers
OpenAI Agents SDK
[intermediate] Runner, triage, and handoff patterns with OpenAI Agents SDK.
Anthropic
[beginner] Multi-step content analysis pipeline with Claude models.
Gemini
[beginner] Multi-agent multi-model pipeline with Google Gemini API.
Groq
[intermediate] Function calling with fast inference using Groq and OpenAI.
Mistral
[beginner] Multi-model pipeline with Mistral chat completion API.
Cohere
[beginner] Multi-model classify + generate pipeline with Cohere V2 API.
Together AI
[beginner] Multi-model inference with Together AI API.
HuggingFace
[beginner] Inference API integration with HuggingFace text generation.
AI21 Labs
[beginner] Jamba multi-model inference with AI21 Labs.
Azure OpenAI
[beginner] Azure-hosted OpenAI models with decorator-based observability.
AWS Bedrock
[beginner] Bedrock model invocation with Converse API and Nova models.
AWS Bedrock Agents
[intermediate] Bedrock Agents orchestration with action groups and knowledge base retrieval.
Vertex AI
[beginner] Vertex AI model pipeline with generate and chat modes.
Meta Llama
[intermediate] Llama ecosystem integration with Meta Llama and Llama Stack.
LiteLLM
[beginner] Multi-provider proxy through LiteLLM unified API.
Ollama
[beginner] Local inference with Ollama running Llama 3.2.
All Providers
[intermediate] All providers in one trace: OpenAI, Anthropic, and LiteLLM.
Cloud LLM Providers
[intermediate] Cloud LLM providers comparison: DashScope, WatsonX, Azure AI.
Vector Databases
FAISS RAG
[advanced] Gold-standard multi-agent RAG with FAISS, 3 LLM providers, and 5 child agents.
ChromaDB
[intermediate] Multi-agent document search pipeline with ChromaDB vector operations.
Pinecone
[intermediate] Multi-agent vector database pipeline with Pinecone.
Qdrant
[intermediate] Multi-agent vector database pipeline with Qdrant.
Weaviate
[intermediate] Semantic search pipeline with Weaviate v4.
Milvus
[intermediate] Multi-agent vector search pipeline with Milvus/Zilliz.
pgvector
[beginner] PostgreSQL vector search with pgvector extension.
Redis Vector
[beginner] Redis vector search with HNSW index and KNN search.
LanceDB
[intermediate] Serverless vector search pipeline with LanceDB.
MongoDB Vector
[beginner] MongoDB vector search with cosine similarity aggregation.
Elasticsearch
[beginner] Elasticsearch knn + hybrid search with dense_vector mapping.
Neo4j
[beginner] Graph DB + vector search with Cypher and Neo4j.
Cloud Vector
[advanced] Cloud vector platforms comparison: Turbopuffer, Vespa, Marqo, Cassandra, OpenSearch.
Managed Vector
[intermediate] Managed vector DB comparison: Supabase, SingleStore, Vectara.
Lightweight Vector
[advanced] Lightweight vector search comparison: Annoy, hnswlib, USearch, ScaNN, DuckDB.
Agent Frameworks
LangChain
[intermediate] Multi-agent pipeline with LangChain chains and auto-instrumented LLM.
LangGraph
[intermediate] Stateful graph with conditional edges using LangGraph.
LlamaIndex
[intermediate] Multi-agent RAG pipeline with LlamaIndex.
CrewAI
[intermediate] Multi-agent crew execution with researcher and writer agents.
AutoGen
[intermediate] Multi-agent group chat conversation with AutoGen.
Haystack
[intermediate] RAG pipeline with Haystack components.
DSPy
[intermediate] Module execution and optimization with DSPy.
Semantic Kernel
[intermediate] Multi-agent orchestration with Semantic Kernel plugins.
PydanticAI
[intermediate] Multi-agent pipeline with type safety using PydanticAI.
smolagents
[intermediate] Lightweight multi-agent with HuggingFace smolagents.
Strands Agents
[intermediate] Multi-agent orchestration with AWS Strands.
Agno
[intermediate] Framework with tool use and reasoning using Agno.
Letta (MemGPT)
[intermediate] Stateful agent with long-term memory management using Letta.
Google ADK
[intermediate] Agent Development Kit multi-agent with sub-agents and tools.
Claude Agents
[intermediate] Multi-agent with Claude and Anthropic tool use.
RAG & Retrieval
RAG Pipeline
[intermediate] Multi-agent RAG pipeline with retriever and synthesizer.
Full RAG Pipeline
[advanced] Full RAG stress test: 12-step pipeline across scrape, embed, index, query, rerank, eval.
Knowledge Graph RAG
[advanced] Graph + vector hybrid retrieval stress test with 12-step pipeline.
RAG Frameworks
[advanced] RAG frameworks comparison: GraphRAG, LightRAG, Pathway, RAGFlow, R2R.
Cohere Rerank
[intermediate] Cohere Embed + Rerank RAG pipeline with multi-agent lineage.
Voyage Rerank
[intermediate] Voyage AI Reranker RAG pipeline with token tracking.
Reranker Comparison
[intermediate] Reranking strategies comparison: Cross-encoder, Pinecone, FlashRank, ColBERT.
Embeddings
OpenAI Embeddings
[beginner] Multi-agent OpenAI Embeddings pipeline with batch processing and similarity.
Sentence Transformers
[beginner] Local embedding with sentence-transformers, zero-cost attribution.
FastEmbed
[beginner] Local embedding with FastEmbed ONNX-based inference.
Nomic AI
[intermediate] Multi-agent embedding pipeline with Nomic AI.
Voyage AI
[intermediate] Multi-agent embedding pipeline with Voyage AI and cost tracking.
Jina AI
[intermediate] Multi-agent reranking pipeline with Jina AI.
Embedding Models
[advanced] Embedding model comparison across 6 providers: BGE, E5, Instructor, TEI, Mixedbread, Transformers.
Safety & Governance
Governance
[intermediate] Governance and policy deep dive with record_events, check_policy, and sync wrappers.
Guardrails AI
[intermediate] Multi-agent guardrails validation with Guardrails AI.
LLM Guard
[intermediate] Multi-agent LLM Guard pipeline with input and output scanners.
NeMo Guardrails
[intermediate] NVIDIA NeMo Guardrails with Colang-based topical and safety rails.
Prompt Guard
[intermediate] Prompt guard showcase: block, warn, and redact modes for PII and injection.
Safety Guardrails
[advanced] Safety and content moderation comparison: Lakera Guard, Presidio, PolyGuard, Azure Content Safety.
Safety Gauntlet
[advanced] Safety gauntlet stress test: 5 input + 3 output safety systems in a 12-step pipeline.
OpenAI Moderation
[beginner] OpenAI Moderation API integration with per-category flagging.
Evaluation
DeepEval
[intermediate] Evaluation with DeepEval metrics: AnswerRelevancy, Faithfulness.
RAGAS
[intermediate] RAG evaluation with RAGAS metrics: faithfulness, answer_relevancy.
Eval Battery
[advanced] Evaluation battery stress test: 6 frameworks, 24 metrics, aggregate verdict.
Eval Frameworks
[advanced] LLM evaluation framework comparison: Braintrust, TruLens, Giskard, Inspect AI, PromptFoo.
Multi-Agent Patterns
Multi-Agent Coordinator
[intermediate] Coordinated agents with shared session: planner, researcher, executor.
Multi-Agent Coordination
[advanced] Coordination stress test: CrewAI + AutoGen + Agno with shared Zep memory.
Multi-Agent Swarm
[advanced] Collaboration frameworks comparison: Agency Swarm, SuperAGI, CAMEL.
Workflow Agents
[advanced] Workflow-oriented frameworks comparison: Julep, Langroid, ControlFlow.
Multi-Provider Shootout
[advanced] Multi-provider shootout stress test: 6 LLMs, 18 eval scores, rerank, winner.
Specialized Pipelines
Streaming
[beginner] Streaming capture comparison: OpenAI vs Anthropic.
Tool Use
[intermediate] Tool use and inter-agent communication: Computer Use, A2A, Composio.
Code Review
[intermediate] Code review agent pipeline with static analysis and Anthropic.
Code Sandbox
[intermediate] Sandboxed code execution with E2B Code Interpreter.
Research
[intermediate] Multi-agent research pipeline with agentic behavior tracking.
Customer Support
[intermediate] Customer support agent pipeline: classify, lookup, route, respond.
Data Ingestion
[advanced] Data ingestion pipeline stress test: scrape, embed, index, query, rerank, eval across 14 steps.
Enrichment
[beginner] SDK enrichment showcase: scores, tags, metadata across multi-agent pipeline.
Prompt Management
[intermediate] Prompt retrieval, rendering, background collector, and capture_content mode.
Sync Pipeline
[beginner] Batch ticket processing pipeline: classify, extract, route, respond.
MCP
[intermediate] MCP tool-calling integration with filesystem and search tools.
Web Scraping
[intermediate] AI-powered web scraping comparison: Crawl4AI, ScrapeGraphAI, Firecrawl.
Voice & Speech
Speech-to-Text
[intermediate] Multi-provider STT pipeline: Google Cloud, Azure, AWS, Faster Whisper, whisper.cpp, Deepgram, AssemblyAI.
Text-to-Speech
[intermediate] Multi-provider TTS pipeline: Google Cloud, Azure, AWS Polly, Cartesia, Coqui, ElevenLabs, PlayHT.
Voice AI
[intermediate] Voice AI agent frameworks: LiveKit Agents and Pipecat.
Voice Memory
[advanced] Voice-first AI with long-term memory: STT, memory, graph, LLM, TTS in 12-step pipeline.
Voice Platforms
[intermediate] Managed voice AI platforms comparison: Vapi and Retell.
Structured Generation & Inference
Structured Generation
[intermediate] Structured generation frameworks: Outlines, Guidance, LMQL.
Instructor
[intermediate] Structured extraction with Instructor and Pydantic models.
Image Generation
[intermediate] Image generation model comparison: Stable Diffusion, Flux, Fal AI.
BentoML
[intermediate] Model serving pipeline with BentoML runners.
Inference Servers
[advanced] Production inference servers: SGLang, TGI, TensorRT-LLM, Triton.
Local Inference
[advanced] Local inference engines comparison: llama.cpp, llamafile, LocalAI, ExLlamaV2.
vLLM
[beginner] vLLM local inference with PagedAttention optimization tracking.
LLM Wrappers
[intermediate] LLM wrapper libraries: Mirascope, Magentic, Marvin.