Cloud Vector Platforms Agent

A multi-agent comparison pipeline across 5 cloud/distributed vector search platforms (Turbopuffer, Vespa, Marqo, Cassandra, OpenSearch). A parent orchestrator coordinates two child agents: a runner that executes upsert and query operations across all platforms and ranks results by latency, and an evaluator that reasons about platform trade-offs, recommends a platform, and synthesizes a comparison with an LLM.

Environment variables

This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to skip real API calls. All platform clients are mocked for portability.

Architecture

Key Code

Multi-Platform Operations with @tool

The runner child agent executes upsert and query operations across all 5 cloud platforms, with each operation recorded as a separate tool span.

@waxell.tool(tool_type="vector_db")
def turbopuffer_upsert(tpuf_ns, documents: list, vectors: list):
    """Upsert vectors into Turbopuffer namespace."""
    result = tpuf_ns.upsert(ids=[d["id"] for d in documents], vectors=vectors)
    return {"upserted": result.get("upserted", len(documents))}

@waxell.tool(tool_type="vector_db")
def vespa_query(vespa_app, yql: str):
    """Run a Vespa hybrid query with nearestNeighbor."""
    response = vespa_app.query(yql=yql)
    return {"hits": len(response.hits), "top_relevance": response.hits[0]["relevance"]}

@waxell.tool(tool_type="vector_db")
def cassandra_ann_search(cass_session):
    """Run ANN OF similarity search on Cassandra vector column."""
    result = cass_session.execute("SELECT ... ORDER BY embedding ANN OF ? LIMIT 3")
    return {"rows": len(result), "top_score": result.current_rows[0]["similarity_score"]}

@waxell.tool(tool_type="vector_db")
def opensearch_knn_search(os_client, query_vector: list, k: int = 3):
    """Run kNN vector search on OpenSearch."""
    response = os_client.search(body={"knn": {"embedding": {"vector": query_vector, "k": k}}})
    return {"hits": len(response["hits"]["hits"]), "took_ms": response["took"]}
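To make the ranking step concrete, here is a minimal, self-contained sketch of how a runner might time each platform call and collect per-platform results. The `mock_query` helper and its fixed scores are hypothetical stand-ins for the demo's mocked clients, and the `waxell` decorators are omitted so the snippet runs on its own:

```python
import time

def mock_query(platform: str) -> dict:
    # Hypothetical stand-in for a mocked platform client call;
    # scores are illustrative, not demo output.
    scores = {"turbopuffer": 0.91, "vespa": 0.88, "marqo": 0.84,
              "cassandra": 0.79, "opensearch": 0.86}
    return {"results": 3, "top_score": scores[platform]}

def run_all_platforms(platforms: list) -> dict:
    """Time each platform's query and collect results for ranking."""
    platform_results = {}
    for name in platforms:
        start = time.perf_counter()
        out = mock_query(name)
        elapsed_ms = (time.perf_counter() - start) * 1000
        platform_results[name] = {**out, "latency_ms": round(elapsed_ms, 3)}
    return platform_results

results = run_all_platforms(
    ["turbopuffer", "vespa", "marqo", "cassandra", "opensearch"])
```

The resulting `platform_results` dict has the shape consumed by `rank_platform_results` below.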

Platform Ranking with @retrieval and Evaluation with @reasoning

@waxell.retrieval(source="cloud-vector-platforms")
def rank_platform_results(platform_results: dict) -> list[dict]:
    """Rank results across all cloud vector platforms by latency."""
    ranked = [{"platform": name, "results": r["results"],
               "latency_ms": r["latency_ms"], "top_score": r["top_score"]}
              for name, r in platform_results.items()]
    ranked.sort(key=lambda x: x["latency_ms"])
    return ranked
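A quick usage sketch of the ranking logic, with the `@waxell.retrieval` decorator stripped so it runs standalone (the sample latencies and scores are made up for illustration):

```python
def rank_platform_results(platform_results: dict) -> list:
    """Rank platform results by latency, fastest first."""
    ranked = [{"platform": name, "results": r["results"],
               "latency_ms": r["latency_ms"], "top_score": r["top_score"]}
              for name, r in platform_results.items()]
    ranked.sort(key=lambda x: x["latency_ms"])
    return ranked

sample = {
    "vespa": {"results": 3, "latency_ms": 12.4, "top_score": 0.88},
    "turbopuffer": {"results": 3, "latency_ms": 8.1, "top_score": 0.91},
    "opensearch": {"results": 3, "latency_ms": 15.0, "top_score": 0.86},
}
ranked = rank_platform_results(sample)
# ranked[0]["platform"] == "turbopuffer" (lowest latency_ms)
```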

@waxell.reasoning_dec(step="platform_evaluation")
async def evaluate_platforms(comparison: list, query: str) -> dict:
    fastest = comparison[0]["platform"]
    best_quality = max(comparison, key=lambda x: x["top_score"])["platform"]
    return {
        "thought": f"Evaluated {len(comparison)} platforms. Fastest: {fastest}, best quality: {best_quality}.",
        "evidence": [f"{c['platform']}: {c['latency_ms']}ms, score={c['top_score']:.2f}" for c in comparison],
        "conclusion": f"{fastest} lowest latency; {best_quality} best relevance for production RAG.",
    }
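The evaluator expects the list already sorted by latency (as produced by `rank_platform_results`). A standalone sketch of running it, with the `@waxell.reasoning_dec` decorator omitted and illustrative sample numbers:

```python
import asyncio

async def evaluate_platforms(comparison: list, query: str) -> dict:
    fastest = comparison[0]["platform"]  # list arrives sorted by latency
    best_quality = max(comparison, key=lambda x: x["top_score"])["platform"]
    return {
        "thought": f"Evaluated {len(comparison)} platforms. Fastest: {fastest}, best quality: {best_quality}.",
        "evidence": [f"{c['platform']}: {c['latency_ms']}ms, score={c['top_score']:.2f}" for c in comparison],
        "conclusion": f"{fastest} lowest latency; {best_quality} best relevance for production RAG.",
    }

comparison = [
    {"platform": "turbopuffer", "latency_ms": 8.1, "top_score": 0.91},
    {"platform": "vespa", "latency_ms": 12.4, "top_score": 0.88},
]
verdict = asyncio.run(evaluate_platforms(comparison, "best platform for RAG"))
```

In the demo the returned `thought`/`evidence`/`conclusion` dict is what the reasoning span records.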

What this demonstrates

  • @waxell.observe -- parent-child agent hierarchy (orchestrator + 2 child agents) with automatic lineage via WaxellContext
  • @waxell.tool(tool_type="vector_db") -- 10 tool spans across 5 platforms (upsert + query each for Turbopuffer, Vespa, Marqo, Cassandra, OpenSearch)
  • @waxell.retrieval(source="cloud-vector-platforms") -- cross-platform result ranking by latency
  • @waxell.decision -- workload classification via OpenAI (latency_sensitive, throughput_heavy, hybrid, analytical)
  • waxell.decide() -- evaluation strategy and platform recommendation decisions
  • @waxell.reasoning_dec -- platform trade-off evaluation across latency, relevance, and scalability
  • @waxell.step_dec -- query preprocessing
  • waxell.score() -- query coverage and recommendation confidence scores
  • Auto-instrumented LLM calls -- OpenAI synthesis captured without extra code
  • 5 cloud platforms -- Turbopuffer, Vespa, Marqo, Cassandra, OpenSearch compared in one pipeline

Run it

# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.cloud_vector_agent --dry-run

# Live mode (requires the Waxell credentials listed above)
export OPENAI_API_KEY="sk-..."
export WAXELL_API_KEY="..."
export WAXELL_API_URL="..."
python -m app.demos.cloud_vector_agent

# Custom query
python -m app.demos.cloud_vector_agent --dry-run --query "Best cloud vector platform for RAG"

Source

dev/waxell-dev/app/demos/cloud_vector_agent.py