Weaviate Agent

A multi-agent semantic search pipeline built on Weaviate v4. A parent orchestrator coordinates three child agents: an indexer that adds objects to collections; a searcher that runs near_text semantic search, hybrid search (vector + keyword), and object fetch, then merges and ranks the results; and a synthesizer that generates answers with a quality assessment. The pipeline demonstrates SDK primitives across Weaviate-specific operations, including near_text, hybrid search with alpha tuning, and UUID-based fetch.

Environment variables

This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to skip real API calls.

Architecture

Key Code

Weaviate Search Operations with @tool

The searcher runs three Weaviate operations: near_text for pure semantic search, hybrid for combined vector + keyword scoring, and fetch for UUID-based object retrieval.

@waxell.tool(tool_type="vector_db")
def weaviate_near_text(concepts: list, collection_name: str = "Document",
                       limit: int = 3) -> dict:
    """Run near_text semantic search on a Weaviate collection."""
    return {"objects": [...], "total": len(results)}

@waxell.tool(tool_type="vector_db")
def weaviate_hybrid_search(query: str, collection_name: str = "Document",
                           alpha: float = 0.75, limit: int = 2) -> dict:
    """Run hybrid search (vector + keyword) on a Weaviate collection."""
    return {"objects": [...], "total": len(results)}

@waxell.tool(tool_type="vector_db")
def weaviate_fetch_object(uuid: str, collection_name: str = "Document",
                          include_vector: bool = True) -> dict:
    """Fetch a specific object by UUID from Weaviate."""
    return {"uuid": obj["uuid"], "properties": obj["properties"]}
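The alpha parameter on the hybrid search controls how much weight the vector side gets relative to the keyword side: alpha=1.0 is pure vector search and alpha=0.0 is pure keyword search. The sketch below illustrates that weighting with a plain linear blend. It is a simplification for intuition only: Weaviate's own fusion algorithms (ranked fusion and relative-score fusion) are more involved, and the function name, document names, and scores here are hypothetical.

```python
def blend_hybrid_score(vector_score: float, keyword_score: float,
                       alpha: float = 0.75) -> float:
    """Linear blend of two already-normalized scores (illustrative only)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


# Hypothetical normalized scores: doc-a is a strong semantic match,
# doc-b is a strong keyword match.
candidates = {
    "doc-a": {"vector": 0.9, "keyword": 0.2},
    "doc-b": {"vector": 0.4, "keyword": 1.0},
}


def rank(alpha: float) -> list:
    """Return document names ordered by blended score, best first."""
    scored = {
        name: blend_hybrid_score(s["vector"], s["keyword"], alpha)
        for name, s in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


print(rank(0.75))  # vector-leaning: the semantic match ranks first
print(rank(0.25))  # keyword-leaning: the keyword match ranks first
```

With the default alpha of 0.75 used by weaviate_hybrid_search above, the semantic match wins; lowering alpha flips the order in favor of the keyword match.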

Merge and Rank with @retrieval

Results from near_text and hybrid searches are merged, deduplicated by UUID, and sorted by score.

@waxell.retrieval(source="weaviate")
def rank_search_results(query: str, near_text_results: list,
                        hybrid_results: list) -> list[dict]:
    """Merge and rank results from near_text and hybrid searches."""
    seen_uuids = set()
    merged = []
    for r in near_text_results:
        if r["uuid"] not in seen_uuids:
            seen_uuids.add(r["uuid"])
            merged.append({
                "uuid": r["uuid"], "title": r["properties"]["title"],
                "content": r["properties"]["content"],
                "score": r["certainty"], "source": "near_text",
            })
    for r in hybrid_results:
        if r["uuid"] not in seen_uuids:
            seen_uuids.add(r["uuid"])
            merged.append({...})
    merged.sort(key=lambda x: x["score"], reverse=True)
    return merged
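To make the merge behavior concrete, here is a minimal standalone sketch of the same dedupe-then-sort logic, run on made-up results. It leaves out the @waxell.retrieval decorator so it runs without the SDK. The hybrid branch is elided in the listing above, so the "score" field read from hybrid results is an assumption, as are the sample UUIDs, titles, and values.

```python
def merge_and_rank(near_text_results: list, hybrid_results: list) -> list:
    """Merge two result lists, keep the first hit per UUID, sort by score."""
    seen_uuids = set()
    merged = []
    # near_text results are processed first, so they win on duplicate UUIDs.
    # Assumption: hybrid results expose their relevance under "score".
    tagged = [(r, "near_text", "certainty") for r in near_text_results]
    tagged += [(r, "hybrid", "score") for r in hybrid_results]
    for r, source, score_key in tagged:
        if r["uuid"] in seen_uuids:
            continue
        seen_uuids.add(r["uuid"])
        merged.append({
            "uuid": r["uuid"],
            "title": r["properties"]["title"],
            "score": r[score_key],
            "source": source,
        })
    merged.sort(key=lambda x: x["score"], reverse=True)
    return merged


# Hypothetical sample data: "b" appears in both result sets.
near_text = [
    {"uuid": "a", "properties": {"title": "Deploy pipelines"}, "certainty": 0.91},
    {"uuid": "b", "properties": {"title": "Rollback plans"}, "certainty": 0.78},
]
hybrid = [
    {"uuid": "b", "properties": {"title": "Rollback plans"}, "score": 0.95},
    {"uuid": "c", "properties": {"title": "CI runners"}, "score": 0.84},
]

ranked = merge_and_rank(near_text, hybrid)
for row in ranked:
    print(row["uuid"], row["source"], row["score"])
```

Because the first occurrence of a UUID wins, object "b" keeps its near_text certainty (0.78) and its higher hybrid score is dropped. Certainty and hybrid scores may also sit on different scales, so a pipeline that needs comparable scores might normalize them before sorting.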

What this demonstrates

  • @waxell.observe -- parent-child agent hierarchy (orchestrator + 3 child agents) with automatic lineage via WaxellContext
  • @waxell.tool(tool_type="vector_db") -- Weaviate operations (add_objects, near_text, hybrid_search, fetch_object) recorded as tool spans
  • @waxell.retrieval(source="weaviate") -- merged near_text and hybrid result ranking recorded with Weaviate as the source
  • @waxell.decision -- search mode selection via OpenAI (near_text, hybrid, keyword) and output format
  • waxell.decide() -- manual collection routing and merge strategy decisions
  • @waxell.reasoning_dec -- chain-of-thought quality assessment of synthesized answers
  • @waxell.step_dec -- query preprocessing with concept extraction
  • waxell.score() -- answer quality and relevance scores attached to the trace
  • waxell.tag() / waxell.metadata() -- vector DB type, Weaviate version (v4), collection name, and agent role
  • Auto-instrumented LLM calls -- OpenAI calls captured without extra code
  • Three search modes -- near_text semantic search, hybrid vector + keyword, and UUID fetch in one pipeline

Run it

# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.weaviate_agent --dry-run

# Live mode (WAXELL_API_KEY and WAXELL_API_URL must also be set; see Environment variables)
export OPENAI_API_KEY="sk-..."
python -m app.demos.weaviate_agent

# Custom query
python -m app.demos.weaviate_agent --dry-run --query "Find deployment automation patterns"
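For readers wiring up a similar entry point, the two flags used above could be parsed roughly as follows. This is a sketch using the standard library's argparse, not the demo's actual implementation; the build_parser name, help strings, and default query are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="weaviate_agent")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="skip real API calls (hypothetical help text)",
    )
    parser.add_argument(
        "--query",
        default="example query",  # hypothetical default, not from the demo
        help="natural-language query to search for",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(
    ["--dry-run", "--query", "Find deployment automation patterns"]
)
print(args.dry_run, args.query)
```

argparse exposes --dry-run as args.dry_run, which the pipeline could check before deciding whether to make real calls.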

Source

dev/waxell-dev/app/demos/weaviate_agent.py