pgvector Agent

A single-agent pipeline using pgvector for PostgreSQL-native vector search. The agent creates a table with a vector column, inserts documents with real sentence-transformer embeddings, chooses a distance metric, performs cosine similarity search with the <=> operator, evaluates precision, and synthesizes an answer with an LLM. Demonstrates SDK primitives across real PostgreSQL + pgvector operations.

Environment variables

This example requires OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL. Requires PostgreSQL + pgvector on localhost:5433. Use --dry-run to skip real API calls.

Architecture

Key Code

Schema Setup and Similarity Search with `@tool`

The agent creates a pgvector-enabled table and performs cosine similarity search using PostgreSQL's <=> operator.

@waxell.tool(tool_type="database", name="pgvector_setup_schema")
def setup_schema(conn, dim: int) -> dict:
    """Create pgvector extension and table."""
    with conn.cursor() as cur:
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
        cur.execute(f"""
            CREATE TABLE {TABLE_NAME} (
                id SERIAL PRIMARY KEY, title TEXT NOT NULL,
                text TEXT NOT NULL, embedding vector({dim})
            )
        """)
    conn.commit()
    return {"table": TABLE_NAME, "dimensions": dim}

@waxell.tool(tool_type="vector_db", name="pgvector_similarity_search")
def similarity_search(conn, query_embedding, operator: str = "<=>",
                      limit: int = 3) -> dict:
    """Vector similarity search using pgvector operators."""
    with conn.cursor() as cur:
        cur.execute(
            f"SELECT id, title, text, embedding {operator} %s::vector AS distance "
            f"FROM {TABLE_NAME} ORDER BY embedding {operator} %s::vector LIMIT %s",
            (query_embedding.tolist(), query_embedding.tolist(), limit),
        )
        rows = cur.fetchall()
    return {"hits": [{"id": str(r[0]), "title": r[1], "score": round(1.0 - float(r[3]), 4)} for r in rows]}

Distance Metric Decision and Precision Evaluation

@waxell.decision(name="choose_distance_metric", options=["cosine", "l2", "inner_product"])
def choose_distance_metric(query: str) -> dict:
    """Choose distance metric based on use case."""
    if any(w in query.lower().split() for w in ["exact", "precise"]):
        return {"chosen": "l2", "reasoning": "Query asks for precise matching"}
    return {"chosen": "cosine", "reasoning": "Default to cosine for semantic similarity"}

@waxell.reasoning_dec(step="evaluate_search_precision")
def evaluate_search_precision(results: list[dict]) -> dict:
    scores = [r["score"] for r in results]
    avg = sum(scores) / len(scores) if scores else 0
    return {
        "thought": f"Evaluating {len(results)} pgvector results",
        "evidence": [f"Average similarity: {avg:.3f}", f"Score spread: {max(scores) - min(scores):.3f}"],
        "conclusion": "Good precision" if avg > 0.6 else "Moderate precision",
    }

What this demonstrates

@waxell.observe -- single agent with full lifecycle tracing
@waxell.tool(tool_type="database") -- PostgreSQL DDL operations (schema setup, cleanup) recorded as database tool spans
@waxell.tool(tool_type="vector_db") -- pgvector similarity search with <=> operator recorded as vector DB tool span
@waxell.retrieval(source="pgvector") -- result extraction recorded with pgvector as the source
@waxell.decision -- distance metric selection (cosine <=>, L2 <->, inner product <#>)
@waxell.reasoning_dec -- precision evaluation with average score and spread analysis
waxell.score() -- search precision and relevance scores attached to the trace
waxell.tag() / waxell.metadata() -- vector DB type, database host, and distance metric
Auto-instrumented LLM calls -- OpenAI synthesis captured without extra code
Real pgvector operations -- actual PostgreSQL queries with sentence-transformers embeddings

Run it

# Dry-run mode (no API key needed)
cd dev/waxell-dev
python -m app.demos.pgvector_agent --dry-run

# Live mode (requires PostgreSQL + pgvector on localhost:5433)
export OPENAI_API_KEY="sk-..."
python -m app.demos.pgvector_agent

# Custom query
python -m app.demos.pgvector_agent --dry-run --query "Find AI governance best practices"

Source

dev/waxell-dev/app/demos/pgvector_agent.py

Architecture​

Key Code​

Schema Setup and Similarity Search with @tool​

Distance Metric Decision and Precision Evaluation​

What this demonstrates​

Run it​

Source​