# Lightweight Vector Engines Agent
A multi-agent comparison pipeline across 5 lightweight/in-process vector search engines (Annoy, hnswlib, USearch, ScaNN, DuckDB). A parent orchestrator coordinates 3 child agents -- a builder that constructs all indices, a searcher that queries all engines and collects results, and an evaluator that reasons about quality trade-offs and synthesizes a recommendation with an LLM.
## Environment variables
This example requires `OPENAI_API_KEY`, `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to skip real API calls. All engine libraries are mocked for portability.
## Architecture
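The pipeline runs in three sequential phases: the builder constructs the indices, the searcher queries them, and the evaluator synthesizes a recommendation. A minimal sketch of that fan-out, with hypothetical child-agent names and bodies (the real demo wraps each agent in `@waxell.observe` for span lineage):

```python
import asyncio

# Illustrative stand-ins for the demo's three child agents; names and
# return payloads are assumptions, not the demo's actual code.
async def builder_agent() -> dict:
    return {"indices_built": 5}

async def searcher_agent(build: dict) -> dict:
    return {"engines_queried": build["indices_built"]}

async def evaluator_agent(search: dict) -> dict:
    return {"engines_compared": search["engines_queried"]}

async def orchestrator() -> dict:
    # The three phases run in sequence: build -> search -> evaluate.
    build = await builder_agent()
    search = await searcher_agent(build)
    return await evaluator_agent(search)

result = asyncio.run(orchestrator())  # {"engines_compared": 5}
```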
## Key Code
### Index Construction and Search with `@tool`
The builder constructs 5 different indices. The searcher queries all 5 engines, each recorded as a separate tool span.
```python
@waxell.tool(tool_type="vector_db")
def build_annoy_index(dim: int, n_docs: int, mock_vectors: list) -> dict:
    """Build an Annoy index from vectors."""
    idx = MockAnnoyIndex(f=dim, metric="angular")
    for i, vec in enumerate(mock_vectors):
        idx.add_item(i, vec)
    idx.build(10)
    return {"index": idx, "n_trees": 10, "n_items": idx.get_n_items()}

@waxell.tool(tool_type="vector_db")
def search_hnsw(index, query_vector: list, k: int = 3) -> dict:
    """Search an HNSW index for the k nearest neighbors."""
    labels, distances = index.knn_query([query_vector], k=k)
    return {"labels": labels[0], "distances": distances[0], "num_results": len(labels[0]), "latency_ms": 0}

@waxell.tool(tool_type="vector_db")
def search_duckdb(connection, query_vector: list, k: int = 3) -> dict:
    """Search DuckDB via the array_cosine_similarity SQL function."""
    # Bind the query vector and k as positional parameters; higher
    # similarity means a closer match, so order descending.
    result = connection.execute(
        "SELECT id, title, array_cosine_similarity(embedding, $1) AS similarity "
        "FROM documents ORDER BY similarity DESC LIMIT $2",
        [query_vector, k],
    )
    return {"rows": result.fetchall(), "sql_function": "array_cosine_similarity", "latency_ms": 5}
```
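All five engines answer the same angular nearest-neighbor query (the Annoy index above is built with `metric="angular"`). A brute-force cosine top-k baseline is useful for sanity-checking the mocked results; this sketch is standalone and not part of the demo:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def brute_force_search(vectors: list[list[float]], query: list[float], k: int = 3) -> list[tuple[int, float]]:
    # Score every vector and keep the k most similar -- the exact answer
    # each approximate engine tries to match.
    scored = [(i, cosine_similarity(v, query)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]

docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top = brute_force_search(docs, [1.0, 0.0], k=2)  # doc 0 is an exact match
```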
### Multi-Engine Collection and Evaluation
```python
@waxell.retrieval(source="multi-engine")
def collect_search_results(engine_results: dict) -> list[dict]:
    """Collect and normalize search results from all engines."""
    collected = [
        {"engine": name, "num_results": r["num_results"], "latency_ms": r["latency_ms"]}
        for name, r in engine_results.items()
    ]
    collected.sort(key=lambda x: x["latency_ms"])
    return collected

@waxell.reasoning_dec(step="engine_quality_assessment")
async def evaluate_engines(comparison: list[dict]) -> dict:
    """Assess engine quality from the latency-sorted comparison."""
    fastest = comparison[0]["engine"]  # comparison is pre-sorted by latency
    return {
        "thought": f"Compared {len(comparison)} engines. Fastest: {fastest}.",
        "evidence": [f"{c['engine']}: {c['num_results']} results in {c['latency_ms']}ms" for c in comparison],
        "conclusion": f"{fastest} best latency; graph-based engines best recall-latency trade-off",
    }
```
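The collection step only depends on each engine reporting `num_results` and `latency_ms`, so its logic can be exercised without the decorator. A decorator-free sketch (the input payload below is made up for illustration):

```python
def collect_results(engine_results: dict) -> list[dict]:
    # Normalize per-engine payloads and rank by latency, fastest first --
    # the same logic collect_search_results applies under @waxell.retrieval.
    collected = [
        {"engine": name, "num_results": r["num_results"], "latency_ms": r["latency_ms"]}
        for name, r in engine_results.items()
    ]
    collected.sort(key=lambda x: x["latency_ms"])
    return collected

ranked = collect_results({
    "duckdb": {"num_results": 3, "latency_ms": 5},
    "hnswlib": {"num_results": 3, "latency_ms": 0},
})  # hnswlib ranks first on latency
```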
## What this demonstrates
- `@waxell.observe` -- parent-child agent hierarchy (orchestrator + 3 child agents) with automatic lineage via `WaxellContext`
- `@waxell.tool(tool_type="vector_db")` -- 10 tool spans (5 build + 5 search) across Annoy, hnswlib, USearch, ScaNN, and DuckDB
- `@waxell.retrieval(source="multi-engine")` -- cross-engine result collection normalized by latency
- `@waxell.decision` -- engine family selection (all, tree-based, graph-based, sql-based)
- `waxell.decide()` -- engine recommendation with confidence score
- `@waxell.reasoning_dec` -- engine quality assessment comparing recall vs latency trade-offs
- `waxell.score()` -- comparison coverage and recommendation confidence scores
- Auto-instrumented LLM calls -- OpenAI synthesis captured without extra code
- 5 in-process engines -- Annoy (trees), hnswlib (HNSW), USearch (HNSW), ScaNN (hybrid), DuckDB (SQL) compared
## Run it
```bash
# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.lightweight_vector_agent --dry-run

# Live mode
export OPENAI_API_KEY="sk-..."
python -m app.demos.lightweight_vector_agent

# Custom query
python -m app.demos.lightweight_vector_agent --dry-run --query "Compare vector search engines"
```