Jina AI

A multi-agent Jina AI reranking pipeline with four agents: a parent orchestrator, a retriever for initial embedding retrieval, a reranker using jina-reranker-v2-base-multilingual, and a synthesizer for LLM answer generation. It demonstrates the two-stage retrieval pattern: fast embedding recall followed by more accurate cross-encoder reranking.

Environment variables

A live run requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Pass --dry-run to run without any API keys. Jina operations run in simulated mode, so no Jina API key is needed.
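The requirement above can be pictured as a small pre-flight check. This is only a sketch: the demo's actual startup logic is not shown on this page, and the helper names `missing_vars` and `resolve_mode` are hypothetical.

```python
import os
import sys

# The three variables named above; a dry run needs none of them.
REQUIRED_VARS = ("OPENAI_API_KEY", "WAXELL_API_KEY", "WAXELL_API_URL")


def missing_vars(env) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def resolve_mode(argv: list[str], env) -> str:
    """Hypothetical helper: pick 'dry-run' or 'live', failing early if keys are missing."""
    if "--dry-run" in argv:
        return "dry-run"
    missing = missing_vars(env)
    if missing:
        raise SystemExit(
            f"Missing environment variables: {', '.join(missing)} (or pass --dry-run)"
        )
    return "live"


if __name__ == "__main__":
    print(resolve_mode(sys.argv[1:], os.environ))
```

Checking before any agent starts means a missing key produces one clear message rather than a failure partway through the pipeline.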

Architecture

Key Code

Two-stage retrieval: embedding recall then reranking

The retriever generates embeddings, then the reranker rescores candidates with a cross-encoder.

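# Simulated embedding call: returns metadata only, no vectors are computed.
@waxell.tool(tool_type="embedding")
def embed_documents(texts: list[str], model: str = "jina-embeddings-v3") -> dict:
    # Token count is a rough estimate of two tokens per whitespace-separated word.
    return {"model": model, "count": len(texts), "dimensions": 768,
            "tokens": sum(len(t.split()) * 2 for t in texts)}

# Simulated rerank call: MOCK_RERANK_SCORES is not defined in this excerpt and
# must provide at least one score per candidate document.
@waxell.tool(tool_type="reranking")
def rerank_documents(query: str, documents: list[dict],
                     model: str = "jina-reranker-v2-base-multilingual", top_n: int = 3) -> dict:
    scored = [{**doc, "rerank_score": MOCK_RERANK_SCORES[i]}
              for i, doc in enumerate(documents)]
    scored.sort(key=lambda d: d["rerank_score"], reverse=True)
    # Guard the empty case: scored[0] would raise IndexError with no candidates.
    return {"model": model, "candidates": len(documents),
            "results": scored[:top_n],
            "top_score": scored[0]["rerank_score"] if scored else None}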

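To see the two stages end to end without Waxell or Jina, here is a minimal, self-contained sketch. Everything in it is illustrative: the two-dimensional vectors, the sample documents, and `toy_overlap_score` (a word-overlap stand-in for a real cross-encoder) are assumptions, not part of the demo.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_by_embedding(query_vec: list[float], docs: list[dict], k: int) -> list[dict]:
    """Stage 1: cheap vector similarity over the whole corpus, keep the top k."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return ranked[:k]


def toy_overlap_score(query: str, text: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query words found in text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0


def rerank(query: str, candidates: list[dict], score_fn, top_n: int) -> list[dict]:
    """Stage 2: score each (query, document) pair jointly, then keep the top_n."""
    scored = [{**d, "rerank_score": score_fn(query, d["text"])} for d in candidates]
    scored.sort(key=lambda d: d["rerank_score"], reverse=True)
    return scored[:top_n]


DOCS = [
    {"id": "a", "text": "rerankers score query and document together", "vector": [0.8, 0.3]},
    {"id": "b", "text": "embeddings map text to vectors", "vector": [0.9, 0.1]},
    {"id": "c", "text": "bananas are yellow", "vector": [0.0, 1.0]},
]

# Stage 1 ranks "b" ahead of "a" by vector similarity and drops "c".
candidates = recall_by_embedding([1.0, 0.0], DOCS, k=2)
# Stage 2 reads the text and promotes "a" to the top.
top = rerank("how do rerankers score a document", candidates, toy_overlap_score, top_n=1)
print([d["id"] for d in candidates], top[0]["id"])
```

The sample data is chosen so the stages disagree: embedding similarity puts document "b" first, while the pairwise scorer promotes "a". That reordering is the reason to pay for a second, slower stage on a short candidate list.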
Reranker with threshold filtering and quality assessment

The reranker filters by score threshold and assesses whether reranking improved separation.

# Keep only documents whose rerank score meets the threshold.
# Documents without a rerank_score are treated as scoring 0.
@waxell.step_dec(name="filter_by_threshold")
async def filter_by_threshold(documents: list[dict], threshold: float) -> dict:
    passed = [d for d in documents if d.get("rerank_score", 0) >= threshold]
    return {"threshold": threshold, "passed": len(passed),
            "filtered_out": len(documents) - len(passed), "documents": passed}

# Record a reasoning step: spread = top score minus last score.
# Assumes reranked_docs is sorted by rerank_score, highest first.
@waxell.reasoning_dec(step="rerank_quality_assessment")
async def assess_rerank_quality(reranked_docs: list[dict], threshold: float) -> dict:
    top_score = reranked_docs[0]["rerank_score"] if reranked_docs else 0
    score_spread = (top_score - reranked_docs[-1]["rerank_score"]) if reranked_docs else 0
    return {
        "thought": f"{len(reranked_docs)} results, spread={score_spread:.2f}.",
        "evidence": [f"Doc {d['id']}: score={d['rerank_score']:.2f}" for d in reranked_docs],
        "conclusion": "Clear quality separation" if score_spread > 0.1
                      else "Scores clustered -- reranking may not add value",
    }
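The following sketch runs the same filtering and spread logic on sample data, with the Waxell decorators left out so it works standalone. The sample scores and the 0.5 threshold are made up for illustration, and `assess_rerank_quality` is trimmed to the fields needed here.

```python
import asyncio


async def filter_by_threshold(documents: list[dict], threshold: float) -> dict:
    """Keep documents scoring at or above the threshold (undecorated copy)."""
    passed = [d for d in documents if d.get("rerank_score", 0) >= threshold]
    return {"threshold": threshold, "passed": len(passed),
            "filtered_out": len(documents) - len(passed), "documents": passed}


async def assess_rerank_quality(reranked_docs: list[dict]) -> dict:
    """Trimmed-down assessment: spread between first and last score, plus a verdict."""
    spread = (reranked_docs[0]["rerank_score"] - reranked_docs[-1]["rerank_score"]
              if reranked_docs else 0.0)
    conclusion = ("Clear quality separation" if spread > 0.1
                  else "Scores clustered -- reranking may not add value")
    return {"spread": spread, "conclusion": conclusion}


# Illustrative scores, already sorted highest first.
SAMPLE = [
    {"id": 1, "rerank_score": 0.92},
    {"id": 2, "rerank_score": 0.61},
    {"id": 3, "rerank_score": 0.18},
]


async def main() -> tuple[dict, dict]:
    kept = await filter_by_threshold(SAMPLE, threshold=0.5)
    quality = await assess_rerank_quality(kept["documents"])
    return kept, quality


kept, quality = asyncio.run(main())
print(kept["passed"], kept["filtered_out"], quality["conclusion"])
```

With these numbers, two documents pass and one is filtered out. The surviving scores differ by about 0.31, which is above the 0.1 cutoff, so the verdict is "Clear quality separation".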

What this demonstrates

@waxell.observe -- 4-agent hierarchy (orchestrator, retriever, reranker, synthesizer)
@waxell.step_dec -- query preprocessing and threshold filtering steps
@waxell.tool -- embedding with tool_type="embedding" and reranking with tool_type="reranking"
@waxell.retrieval -- initial and reranked retrieval with source="jina"
@waxell.decision -- rerank model selection (v2 multilingual, v1, cross-encoder)
@waxell.reasoning_dec -- rerank quality and answer quality assessments
waxell.score() -- answer quality and rerank precision scores
Two-stage retrieval -- fast embedding recall followed by cross-encoder reranking
Threshold filtering -- configurable relevance threshold to prune low-quality results
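This page does not show how the rerank precision score is computed. One common definition is precision@k, sketched below as an assumption rather than the demo's actual formula; the document IDs and relevance labels are invented for the example.

```python
def precision_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Hypothetical relevance labels and two orderings of the same four candidates.
relevant = {"d1", "d2"}
before_rerank = ["d3", "d1", "d7", "d2"]   # order from embedding recall
after_rerank = ["d1", "d2", "d3", "d7"]    # order after reranking

print(precision_at_k(before_rerank, relevant, k=2))
print(precision_at_k(after_rerank, relevant, k=2))
```

Comparing the metric before and after reranking, on the same candidates and the same k, gives a single number for how much the second stage improved the top of the list.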

Run it

# Dry-run (no API keys needed)
cd dev/waxell-dev
python -m app.demos.jina_agent --dry-run

# Live (real OpenAI calls; Jina operations stay simulated)
export OPENAI_API_KEY="sk-..."
export WAXELL_API_KEY="..."
export WAXELL_API_URL="..."
python -m app.demos.jina_agent

Source

dev/waxell-dev/app/demos/jina_agent.py
