Voyage Rerank Agent

A multi-agent RAG pipeline with Voyage AI reranking and token tracking. A parent orchestrator coordinates three child agents: a retriever that performs dense retrieval and ranks the initial candidates; a reranker that rescores them with Voyage AI's rerank-2 model, tracks token usage, and applies top-k filtering; and a synthesizer that generates the answer and assesses its quality.

Environment variables

Live mode requires OPENAI_API_KEY, VOYAGE_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Pass --dry-run to skip real API calls; no API keys are needed in that mode.
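As a rough illustration of the requirement above, a pre-flight check could report which variables are missing before a live run. This is a minimal sketch and not part of the demo: the helper name missing_env_vars and its dry_run parameter are hypothetical, and only the four variable names come from this page.

```python
import os

# Variable names taken from this page; everything else here is illustrative.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "VOYAGE_API_KEY",
    "WAXELL_API_KEY",
    "WAXELL_API_URL",
)


def missing_env_vars(environ=None, dry_run=False):
    """Return the required variables that are unset or empty.

    In dry-run mode nothing is required, so the result is always empty.
    """
    if dry_run:
        return []
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_env_vars() with no arguments inspects the current process environment; passing a plain dict makes the check easy to exercise in isolation.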

Architecture

Key Code

Dense Retrieval and Voyage Reranking

The retriever performs initial dense vector retrieval. The reranker rescores using Voyage AI's cross-encoder model with token tracking.

@waxell.tool(tool_type="vector_db")
def dense_retrieval(query_embedding: list, documents: list, k: int = 8) -> dict:
    """Retrieve top-k candidates by dense vector similarity."""
    # Similarity scoring that builds ranked_candidates is elided in this excerpt.
    return {"candidates": ranked_candidates, "total_considered": len(documents)}

@waxell.tool(tool_type="reranking")
def voyage_rerank(query: str, documents: list,
                  model: str = "rerank-2", top_n: int = 5) -> dict:
    """Rerank documents using Voyage AI rerank-2 model."""
    # The Voyage call that produces reranked_docs and token_count is elided in this excerpt.
    return {
        "reranked": reranked_docs,
        "top_score": reranked_docs[0]["relevance_score"],
        "model": model,
        "tokens": token_count,
    }
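The excerpts above omit the function bodies, so the following is a minimal sketch of how the two stages might be filled in, not the demo's actual implementation. It assumes each document is a dict with "text" and "embedding" keys (hypothetical), scores candidates by cosine similarity, and takes the reranking client as a parameter. The client is assumed to expose rerank(query, documents, model=..., top_k=...) and to return an object with results (each carrying document and relevance_score) and total_tokens, which is the shape the Voyage AI Python client is understood to use; verify against the client version you install.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_retrieval_sketch(query_embedding, documents, k=8):
    """Rank documents by cosine similarity to the query embedding.

    Assumption: each document is a dict with "text" and "embedding" keys.
    """
    scored = [
        {"text": d["text"], "score": cosine(query_embedding, d["embedding"])}
        for d in documents
    ]
    scored.sort(key=lambda d: d["score"], reverse=True)
    return {"candidates": scored[:k], "total_considered": len(documents)}


def rerank_sketch(client, query, texts, model="rerank-2", top_n=5):
    """Rerank texts through an injected client and mirror voyage_rerank's output shape.

    Assumption: client.rerank(...) returns an object with .results and .total_tokens.
    """
    response = client.rerank(query, texts, model=model, top_k=top_n)
    reranked = [
        {"text": r.document, "relevance_score": r.relevance_score}
        for r in response.results
    ]
    return {
        "reranked": reranked,
        # An empty result list has no top score to report.
        "top_score": reranked[0]["relevance_score"] if reranked else None,
        "model": model,
        "tokens": response.total_tokens,
    }
```

Passing the client in, rather than constructing it inside the function, keeps the sketch runnable without network access: a stub object with the same rerank method can stand in for the real client in tests or dry runs.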

Top-K Filtering and Score Threshold Selection

@waxell.step_dec(name="filter_top_k")
async def filter_top_k(reranked: list, threshold: float = 0.7) -> dict:
    """Filter reranked results by score threshold."""
    filtered = [d for d in reranked if d["relevance_score"] >= threshold]
    return {"filtered": filtered, "removed": len(reranked) - len(filtered)}

@waxell.decision(name="select_score_threshold", options=["strict_0.8", "moderate_0.7", "lenient_0.5"])
def select_score_threshold(num_candidates: int) -> dict:
    """Choose score threshold based on candidate count."""
    if num_candidates > 5:
        return {"chosen": "strict_0.8", "reasoning": "Many candidates, apply strict filtering"}
    return {"chosen": "moderate_0.7", "reasoning": "Few candidates, moderate filtering"}

@waxell.retrieval(source="voyage")
def collect_reranked_results(filtered_docs: list) -> list[dict]:
    """Collect the filtered documents as text/score records."""
    return [{"text": d["text"], "score": d["relevance_score"]} for d in filtered_docs]
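The snippets above show the threshold decision and the filter separately. The sketch below shows one way they could be connected, using plain Python without the waxell decorators. The THRESHOLDS mapping from option label to numeric cutoff and the select_and_filter helper are assumptions made for illustration; this page does not show how the demo actually wires the two together.

```python
# Hypothetical mapping from the decision's option labels to numeric cutoffs.
THRESHOLDS = {"strict_0.8": 0.8, "moderate_0.7": 0.7, "lenient_0.5": 0.5}


def choose_threshold(num_candidates):
    """Follow the rule shown above: strict for more than five candidates, else moderate."""
    label = "strict_0.8" if num_candidates > 5 else "moderate_0.7"
    return THRESHOLDS[label]


def filter_by_score(reranked, threshold):
    """Keep documents whose relevance_score meets the threshold."""
    kept = [d for d in reranked if d["relevance_score"] >= threshold]
    return {"filtered": kept, "removed": len(reranked) - len(kept)}


def select_and_filter(reranked):
    """Pick a threshold from the candidate count, then apply it."""
    threshold = choose_threshold(len(reranked))
    result = filter_by_score(reranked, threshold)
    result["threshold"] = threshold
    return result
```

With four reranked documents the moderate 0.7 cutoff applies; with six or more, the strict 0.8 cutoff applies, so a larger candidate pool is trimmed more aggressively.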

What this demonstrates

@waxell.observe -- parent-child agent hierarchy (orchestrator + 3 child agents) with automatic lineage
@waxell.tool(tool_type="vector_db") -- dense retrieval recorded as vector DB tool span
@waxell.tool(tool_type="reranking") -- Voyage AI rerank-2 call with relevance rescoring and token tracking
@waxell.retrieval(source="voyage") -- initial ranking and reranked result collection
@waxell.decision -- retrieval strategy and score threshold selection
@waxell.reasoning_dec -- answer quality assessment
@waxell.step_dec -- query preprocessing and top-k filtering
Token tracking -- Voyage rerank token count recorded in tool output
Auto-instrumented LLM calls -- OpenAI synthesis captured without extra code
Multilingual reranking -- Voyage rerank-2 can rerank documents written in multiple languages

Run it

# Dry-run mode (no API keys needed)
cd dev/waxell-dev
python -m app.demos.voyage_rerank_agent --dry-run

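# Live mode (all four environment variables must be set)
export VOYAGE_API_KEY="..."
export OPENAI_API_KEY="sk-..."
export WAXELL_API_KEY="..."
export WAXELL_API_URL="..."
python -m app.demos.voyage_rerank_agent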

# Custom query
python -m app.demos.voyage_rerank_agent --dry-run --query "Best practices for RAG retrieval"

Source

dev/waxell-dev/app/demos/voyage_rerank_agent.py
