Nomic

A multi-agent Nomic AI embedding pipeline with a parent orchestrator coordinating two child agents -- an embedder and an analyzer. Demonstrates nomic-embed-text-v1.5 with task-type hints (search_query / search_document), Matryoshka dimensionality-reduction strategies, and a mock-only mode (no NOMIC_API_KEY required).

Environment variables

This example requires OPENAI_API_KEY (for LLM synthesis), WAXELL_API_KEY, and WAXELL_API_URL. Use --dry-run to run without any API keys. Nomic embeddings run in mock mode either way.

Architecture

Key Code

Nomic embeddings with task-type hints

Nomic models use different task types for queries vs documents to improve retrieval quality.

@waxell.tool(tool_type="embedding")
def embed_query_nomic(query: str, model: str = "nomic-embed-text-v1.5",
                      task_type: str = "search_query") -> dict:
    """Embed a query using Nomic embeddings."""
    dim = 768
    tokens = len(query.split()) * 2  # rough token estimate
    return {"model": model, "task_type": task_type, "dimensions": dim, "tokens": tokens}

@waxell.tool(tool_type="embedding")
def embed_documents_nomic(texts: list[str], model: str = "nomic-embed-text-v1.5",
                          task_type: str = "search_document") -> dict:
    """Embed documents using Nomic embeddings."""
    tokens = sum(len(t.split()) * 2 for t in texts)
    return {"model": model, "task_type": task_type, "count": len(texts), "tokens": tokens}
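With the real nomic-embed-text-v1.5 model, task-type hints are applied as plain-text prefixes (e.g. "search_query: ") prepended to each input before embedding. A minimal sketch of that prefixing -- the helper name is hypothetical, and the mock tools above do not call a real API:

```python
def with_task_prefix(texts: list[str], task_type: str) -> list[str]:
    """Prepend the Nomic task-type prefix (e.g. 'search_query: ') to each input."""
    return [f"{task_type}: {t}" for t in texts]

# Queries and documents get different prefixes so the model embeds them
# into the same space from different "sides" of the retrieval task:
query_inputs = with_task_prefix(["what is matryoshka representation learning?"],
                                "search_query")
doc_inputs = with_task_prefix(["Matryoshka models nest smaller embeddings "
                               "inside larger ones."],
                              "search_document")
```

Mismatched prefixes (e.g. embedding a query as search_document) degrade retrieval quality, which is why the two tools above default to different task_type values.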

Strategy selection with Matryoshka dimensionality reduction

The orchestrator selects between full 768-d, reduced 256-d, or Matryoshka 128-d embeddings based on corpus size and query content.

@waxell.decision(name="select_embedding_strategy",
                 options=["full_768", "reduced_256", "matryoshka_128"])
async def select_embedding_strategy(query: str, corpus_size: int) -> dict:
    if corpus_size > 100:
        return {"chosen": "reduced_256",
                "reasoning": f"Large corpus ({corpus_size}) -- reduced dims for speed"}
    elif "matryoshka" in query.lower():
        return {"chosen": "matryoshka_128",
                "reasoning": "Query mentions dimensionality -- demonstrate Matryoshka"}
    return {"chosen": "full_768",
            "reasoning": f"Standard corpus ({corpus_size}) -- full 768d for best quality"}
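Matryoshka-trained embeddings make the reduced strategies cheap: a smaller embedding is just the leading slice of the full vector, re-normalized so cosine similarity still works. A minimal NumPy sketch, using a random 768-d vector as stand-in data rather than real model output:

```python
import numpy as np

def matryoshka_truncate(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and L2-renormalize.

    Matryoshka-trained models pack the most important information into
    the leading dimensions, so truncation preserves most retrieval quality.
    """
    reduced = vec[:dim]
    return reduced / np.linalg.norm(reduced)

# Stand-in for a full 768-d embedding from the model:
full = np.random.default_rng(0).normal(size=768)

# The three strategies above correspond to three truncation points:
reduced_256 = matryoshka_truncate(full, 256)
matryoshka_128 = matryoshka_truncate(full, 128)
```

No retraining or separate model call is needed to switch strategies -- the orchestrator can embed once at 768-d and truncate per query.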

What this demonstrates

  • @waxell.observe -- parent-child agent hierarchy with automatic lineage
  • @waxell.step_dec -- query preprocessing and similarity computation steps
  • @waxell.tool -- Nomic embedding generation with tool_type="embedding"
  • @waxell.retrieval -- similarity-based retrieval with source="nomic"
  • @waxell.decision -- dimensionality strategy and output format selection
  • @waxell.reasoning_dec -- embedding quality analysis
  • waxell.score() -- embedding quality and answer relevance scores
  • Task-type hints -- search_query vs search_document for Nomic models
  • Matryoshka training -- dimensionality reduction strategies (768, 256, 128)

Run it

# Dry-run (no API keys needed)
cd dev/waxell-dev
python -m app.demos.nomic_agent --dry-run

# Live (real OpenAI for synthesis)
export OPENAI_API_KEY="sk-..."
python -m app.demos.nomic_agent

Source

dev/waxell-dev/app/demos/nomic_agent.py