
RAG Frameworks Agent

A multi-agent demo that compares RAG frameworks. A parent orchestrator coordinates two child agents -- a runner that executes queries across 5 RAG frameworks (GraphRAG, LightRAG, Pathway, RAGFlow, R2R), and an evaluator that compares the results, assesses quality, and synthesizes a recommendation with an LLM. Each framework query exercises its dedicated auto-instrumentor.

Environment variables

This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to skip real API calls; the framework clients themselves are always mocked.
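For live mode, all three variables must be set before launching. A sketch -- every value below is a placeholder, not a real credential or endpoint:

```shell
# Placeholder values -- substitute your real credentials.
export OPENAI_API_KEY="sk-placeholder"               # used for the LLM synthesis step
export WAXELL_API_KEY="wx-placeholder"               # Waxell ingestion key (hypothetical value)
export WAXELL_API_URL="https://api.example.invalid"  # Waxell endpoint (hypothetical value)
```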

Architecture

Key Code

Framework-Specific Query Steps

Each RAG framework is exercised through its specific API, with auto-instrumentors recording the calls.

# GraphRAG -- LocalSearch and GlobalSearch
@waxell.step_dec(name="graphrag_query")
async def graphrag_query(local_search, global_search, query: str) -> dict:
    local_result = local_search.search(query)
    global_result = global_search.search(query)
    return {"local_entities": len(local_result.context_data), "global_response": global_result.response}

# LightRAG -- hybrid query with insert
@waxell.step_dec(name="lightrag_query")
async def lightrag_query(rag, query: str, documents: list) -> dict:
    rag.insert(documents)
    result = rag.query(query, param={"mode": "hybrid"})
    return {"answer": result, "mode": "hybrid"}

# Pathway -- real-time vector store query
@waxell.step_dec(name="pathway_query")
async def pathway_query(vs_client, query: str) -> dict:
    results = vs_client.query(query, k=5)
    return {"results": len(results), "source": "pathway"}

# RAGFlow -- create dataset and chat
@waxell.step_dec(name="ragflow_chat")
async def ragflow_chat(client, query: str) -> dict:
    dataset = client.create_dataset("demo")
    response = client.create_chat(query, dataset_ids=[dataset["id"]])
    return {"answer": response["answer"]}

# R2R -- search and ingest
@waxell.step_dec(name="r2r_search")
async def r2r_search(client, query: str) -> dict:
    results = client.search(query, limit=5)
    return {"results": len(results["results"]), "source": "r2r"}
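The runner child agent fans these steps out across all five frameworks. A minimal sketch of that fan-out, with the waxell decorators and framework clients replaced by hypothetical no-op stubs (not the real demo code) so the control flow stands alone:

```python
import asyncio

# Hypothetical stand-ins for the mocked framework query steps above.
async def graphrag_query(q): return {"source": "graphrag", "results": 3}
async def lightrag_query(q): return {"source": "lightrag", "results": 4}
async def pathway_query(q):  return {"source": "pathway", "results": 5}
async def ragflow_chat(q):   return {"source": "ragflow", "results": 2}
async def r2r_search(q):     return {"source": "r2r", "results": 5}

async def run_all_frameworks(query: str) -> dict:
    """Run every framework query step concurrently, keyed by framework name."""
    steps = {
        "graphrag": graphrag_query,
        "lightrag": lightrag_query,
        "pathway": pathway_query,
        "ragflow": ragflow_chat,
        "r2r": r2r_search,
    }
    outputs = await asyncio.gather(*(fn(query) for fn in steps.values()))
    return dict(zip(steps.keys(), outputs))

results = asyncio.run(run_all_frameworks("How do knowledge graphs improve RAG?"))
print(sorted(results))  # ['graphrag', 'lightrag', 'pathway', 'r2r', 'ragflow']
```

The keyed dict shape matters: it is what the evaluator consumes in the next section.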

Framework Evaluation with @reasoning

@waxell.reasoning_dec(step="evaluate_frameworks")
async def evaluate_frameworks(results: dict) -> dict:
    frameworks = list(results.keys())
    return {
        "thought": f"Compared {len(frameworks)} RAG frameworks: {', '.join(frameworks)}.",
        "evidence": [f"{fw}: {r.get('results', 'N/A')} results" for fw, r in results.items()],
        "conclusion": "Each framework has distinct strengths for different RAG patterns",
    }
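Stripped of the decorator, the evaluator is plain dict manipulation and can be exercised directly. A sketch with made-up per-framework results (the input values are invented for illustration):

```python
import asyncio

async def evaluate_frameworks(results: dict) -> dict:
    # Same body as above, minus the @waxell.reasoning_dec decorator.
    frameworks = list(results.keys())
    return {
        "thought": f"Compared {len(frameworks)} RAG frameworks: {', '.join(frameworks)}.",
        "evidence": [f"{fw}: {r.get('results', 'N/A')} results" for fw, r in results.items()],
        "conclusion": "Each framework has distinct strengths for different RAG patterns",
    }

verdict = asyncio.run(evaluate_frameworks({
    "pathway": {"results": 5},
    "ragflow": {"answer": "..."},  # no 'results' key, so it is reported as N/A
}))
print(verdict["thought"])    # Compared 2 RAG frameworks: pathway, ragflow.
print(verdict["evidence"])   # ['pathway: 5 results', 'ragflow: N/A results']
```

Frameworks that return an answer instead of a hit count (here RAGFlow's chat step) fall back to `N/A` via `dict.get`, so the evaluator never raises on heterogeneous result shapes.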

What this demonstrates

  • @waxell.observe -- parent-child agent hierarchy (orchestrator + 2 child agents) with automatic lineage
  • @waxell.step_dec -- 5 framework query steps, each exercising its auto-instrumentor
  • @waxell.retrieval -- per-framework retrieval recording
  • @waxell.decision -- framework strategy selection and output format choice
  • waxell.decide() -- manual framework selection decision
  • @waxell.reasoning_dec -- framework comparison evaluation and quality assessment
  • Auto-instrumentors -- GraphRAG, LightRAG, Pathway, RAGFlow, R2R instrumentors
  • Auto-instrumented LLM calls -- OpenAI synthesis captured without extra code
  • 5 RAG frameworks -- GraphRAG (local+global), LightRAG (hybrid), Pathway (real-time), RAGFlow (chat), R2R (search) compared

Run it

# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.rag_framework_agent --dry-run

# Live mode
export OPENAI_API_KEY="sk-..."
python -m app.demos.rag_framework_agent

# Custom query
python -m app.demos.rag_framework_agent --dry-run --query "How do knowledge graphs improve RAG?"

Source

dev/waxell-dev/app/demos/rag_framework_agent.py