# Nomic

A multi-agent Nomic AI embedding pipeline with a parent orchestrator coordinating two child agents -- an embedder and an analyzer. Demonstrates `nomic-embed-text-v1.5` with task-type hints (`search_query` / `search_document`), Matryoshka dimensionality-reduction strategies, and a mock-only mode (no `NOMIC_API_KEY` required).
## Environment variables

This example requires `OPENAI_API_KEY` (for LLM synthesis), `WAXELL_API_KEY`, and `WAXELL_API_URL`. Use `--dry-run` to run without any API keys; Nomic embeddings run in mock mode.
## Architecture
## Key Code
### Nomic embeddings with task-type hints

Nomic models use different task types for queries vs. documents to improve retrieval quality.
```python
@waxell.tool(tool_type="embedding")
def embed_query_nomic(query: str, model: str = "nomic-embed-text-v1.5",
                      task_type: str = "search_query") -> dict:
    """Embed a query using Nomic embeddings."""
    dim = 768
    tokens = len(query.split()) * 2
    return {"model": model, "task_type": task_type, "dimensions": dim, "tokens": tokens}


@waxell.tool(tool_type="embedding")
def embed_documents_nomic(texts: list[str], model: str = "nomic-embed-text-v1.5",
                          task_type: str = "search_document") -> dict:
    """Embed documents using Nomic embeddings."""
    tokens = sum(len(t.split()) * 2 for t in texts)
    return {"model": model, "task_type": task_type, "count": len(texts), "tokens": tokens}
```
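The mock tools above only report metadata. In a live setup, `nomic-embed-text-v1.5` consumes its task-type hint as a literal prefix prepended to each input text (e.g. `search_query: ...`). A minimal sketch of that prefixing; `apply_task_prefix` is a hypothetical helper, not part of the demo:

```python
# Sketch only: nomic-embed-text-v1.5 reads task-type hints as a
# "<task_type>: " prefix on each input text before embedding.
# `apply_task_prefix` is our own illustrative helper, not a demo function.
def apply_task_prefix(text: str, task_type: str) -> str:
    """Prepend the Nomic task-type prefix to a piece of text."""
    return f"{task_type}: {text}"

# Queries and documents get different prefixes, which is what lets the
# model embed them asymmetrically for retrieval.
query = apply_task_prefix("how does Matryoshka truncation work?", "search_query")
docs = [apply_task_prefix(d, "search_document") for d in ["Doc one.", "Doc two."]]
```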
### Strategy selection with Matryoshka dimensionality reduction

The orchestrator selects between full 768-d, reduced 256-d, or Matryoshka 128-d embeddings.
```python
@waxell.decision(name="select_embedding_strategy",
                 options=["full_768", "reduced_256", "matryoshka_128"])
async def select_embedding_strategy(query: str, corpus_size: int) -> dict:
    if corpus_size > 100:
        return {"chosen": "reduced_256",
                "reasoning": f"Large corpus ({corpus_size}) -- reduced dims for speed"}
    elif "matryoshka" in query.lower():
        return {"chosen": "matryoshka_128",
                "reasoning": "Query mentions dimensionality -- demonstrate Matryoshka"}
    return {"chosen": "full_768",
            "reasoning": f"Standard corpus ({corpus_size}) -- full 768d for best quality"}
```
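The `matryoshka_128` option relies on Matryoshka-trained embeddings staying usable when truncated to a prefix of their dimensions. A minimal sketch of that truncation, under the usual convention of L2-renormalizing after the slice (the helper name is ours, not the demo's):

```python
import math

def truncate_matryoshka(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and L2-renormalize.

    Matryoshka-trained models pack the most important information into the
    leading dimensions, so a 768-d vector can be cut to 256 or 128 dims
    with a modest quality loss instead of a catastrophic one.
    """
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0  # guard against zero vectors
    return [x / norm for x in head]

full = [3.0, 4.0, 0.12, -0.05]          # stand-in for a 768-d embedding
small = truncate_matryoshka(full, 2)    # -> [0.6, 0.8]
```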
## What this demonstrates
- `@waxell.observe` -- parent-child agent hierarchy with automatic lineage
- `@waxell.step_dec` -- query preprocessing and similarity computation steps
- `@waxell.tool` -- Nomic embedding generation with `tool_type="embedding"`
- `@waxell.retrieval` -- similarity-based retrieval with `source="nomic"`
- `@waxell.decision` -- dimensionality strategy and output format selection
- `@waxell.reasoning_dec` -- embedding quality analysis
- `waxell.score()` -- embedding quality and answer relevance scores
- Task-type hints -- `search_query` vs `search_document` for Nomic models
- Matryoshka training -- dimensionality reduction strategies (768, 256, 128)
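The similarity computation behind the `@waxell.retrieval` step can be sketched as plain cosine similarity between a query vector and each document vector (function names here are our own, not the demo's):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

def rank_documents(query_vec: list[float],
                   doc_vecs: list[list[float]]) -> list[tuple[int, float]]:
    """Return (doc_index, score) pairs sorted by descending similarity."""
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

ranking = rank_documents([1.0, 0.0], [[0.0, 1.0], [0.9, 0.1]])
# top hit is document 1, the vector pointing closest to the query's direction
```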
## Run it

```bash
# Dry-run (no API keys needed)
cd dev/waxell-dev
python -m app.demos.nomic_agent --dry-run

# Live (real OpenAI for synthesis)
export OPENAI_API_KEY="sk-..."
python -m app.demos.nomic_agent
```