LanceDB Agent
A multi-agent serverless vector search pipeline using LanceDB. A parent orchestrator coordinates 3 child agents -- an indexer that creates tables and adds vectors in Lance columnar format, a searcher that performs vector search and evaluates coverage, and a synthesizer that builds a synthesis prompt, generates an answer, and assesses quality. The pipeline demonstrates SDK primitives across LanceDB-specific operations including zero-copy table creation and distance-based search.
This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to skip real API calls.
Architecture
Key Code
LanceDB Table and Vector Operations with @tool
The indexer creates a table in Lance format and adds vectors. The searcher performs distance-based vector search.
@waxell.tool(tool_type="vector_db")
def create_table(table_name: str, num_rows: int,
storage_format: str = "lance") -> dict:
"""Create a LanceDB table with vectorized documents."""
return {"table": table_name, "rows": num_rows, "format": storage_format}
@waxell.tool(tool_type="vector_db")
def add_vectors(table_name: str, documents: list) -> dict:
"""Add vector embeddings to a LanceDB table."""
return {"table": table_name, "added": len(documents), "format": "lance"}
@waxell.tool(tool_type="vector_db")
def vector_search(table_name: str, query: str, limit: int = 3) -> dict:
"""Search a LanceDB table for similar vectors."""
return {"results": len(results), "top_distance": results[0]["distance"]}
Coverage Evaluation with @reasoning
The searcher evaluates how well results cover the query intent using category analysis.
@waxell.reasoning_dec(step="search_coverage_evaluation")
async def evaluate_search_coverage(results: list, query: str) -> dict:
categories = [r.get("metadata", {}).get("category", "unknown") for r in results]
avg_score = sum(r.get("score", 0) for r in results) / max(len(results), 1)
return {
"thought": f"LanceDB returned {len(results)} results with avg relevance {avg_score:.3f}. "
f"Categories covered: {', '.join(set(categories))}.",
"evidence": [f"{r['id']}: {r.get('metadata', {}).get('category', '?')} (score: {r.get('score', 0):.3f})"
for r in results],
"conclusion": "Good coverage" if avg_score > 0.5 else "Coverage may be insufficient",
}
What this demonstrates
@waxell.observe-- parent-child agent hierarchy (orchestrator + 3 child agents) with automatic lineage viaWaxellContext@waxell.tool(tool_type="vector_db")-- LanceDB operations (create table, add vectors, vector search) recorded as tool spans@waxell.retrieval(source="lancedb")-- distance-based result ranking recorded with LanceDB as the source@waxell.decision-- search strategy classification via OpenAI (semantic, hybrid, keyword)waxell.decide()-- manual table routing decision based on search strategy@waxell.reasoning_dec-- search coverage evaluation and answer quality assessment@waxell.step_dec-- query preprocessing and synthesis prompt constructionwaxell.score()-- answer quality and retrieval relevance scores attached to the tracewaxell.tag()/waxell.metadata()-- vector DB type, table name, storage format (Lance), and corpus size- Auto-instrumented LLM calls -- OpenAI calls captured without extra code
- Serverless vector search -- Lance columnar format with zero-copy operations
Run it
# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.lancedb_agent --dry-run
# Live mode
export OPENAI_API_KEY="sk-..."
python -m app.demos.lancedb_agent
# Custom query
python -m app.demos.lancedb_agent --dry-run --query "Find zero-copy vector patterns"