Retrieval Policy

The retrieval policy category governs the quality of documents retrieved in RAG pipelines. It checks relevance scores, source freshness, allowed collections, and source diversity — blocking or warning before stale or off-topic context contaminates your agent's answers.

Use it when you need to enforce quality standards on vector search, document retrieval, or any step where your agent fetches external context before generating a response.

Rules

| Rule | Type | Default | Description |
| --- | --- | --- | --- |
| `min_relevance_score` | number | `0.7` | Minimum cosine/dot-product similarity score required (0.0–1.0) |
| `max_source_age_days` | integer | `90` | Maximum age of a retrieved source document in days |
| `min_chunks` | integer | `1` | Minimum number of results required |
| `max_chunks` | integer | `10` | Maximum number of results allowed |
| `allowed_collections` | string[] | `[]` | If non-empty, only these vector collections may be queried |
| `blocked_sources` | string[] | `[]` | Source identifiers that are always blocked regardless of score |
| `require_source_diversity` | boolean | `false` | Enforce that no single source dominates the result set |
| `max_single_source_ratio` | number | `0.6` | Maximum fraction of results from a single source (when diversity is required) |
| `action_on_low_relevance` | string | `"warn"` | What to do when relevance is below threshold: `"warn"` or `"block"` |
| `action_on_stale_source` | string | `"block"` | What to do when a source exceeds max age: `"warn"` or `"block"` |
| `action_on_chunk_violation` | string | `"warn"` | What to do when chunk count falls below `min_chunks` or exceeds `max_chunks`: `"warn"` or `"block"` |

How It Works

The retrieval handler runs at two phases:

mid_execution — checks each retrieved document as it is recorded. Fires after every ctx.record_retrieval_result() call.
after_workflow — checks the full result set for source diversity violations after the agent finishes.

Evaluation Order (mid_execution)

Check chunk count: more than max_chunks or fewer than min_chunks → action_on_chunk_violation
For each result in retrieval_results:
- Relevance score below min_relevance_score → action_on_low_relevance
- Source in blocked_sources → always BLOCK
- Collection not in allowed_collections (if list is non-empty) → always BLOCK
- Source age exceeds max_source_age_days → action_on_stale_source
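
The per-document checks listed above can be modeled in plain Python. This is an illustrative sketch, not the handler's actual implementation: `Doc`, `evaluate_document`, and the short reason strings are hypothetical names, and returning at the first finding is an assumption based on the "stops at first violation" note under Common Gotchas.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical container mirroring the record_retrieval_result() arguments."""
    relevance_score: float
    source: str
    collection: str
    age_days: int


def evaluate_document(doc: Doc, rules: dict) -> tuple:
    """Return (action, reason) for one document, checking in the documented order."""
    # 1. Relevance below threshold: action is configurable.
    if doc.relevance_score < rules.get("min_relevance_score", 0.7):
        return rules.get("action_on_low_relevance", "warn"), "low relevance"
    # 2. Blocked sources are always blocked (exact string match).
    if doc.source in rules.get("blocked_sources", []):
        return "block", "blocked source"
    # 3. An empty allowlist means "allow all"; otherwise membership is required.
    allowed = rules.get("allowed_collections", [])
    if allowed and doc.collection not in allowed:
        return "block", "collection not allowed"
    # 4. Stale source: action is configurable.
    if doc.age_days > rules.get("max_source_age_days", 90):
        return rules.get("action_on_stale_source", "block"), "stale source"
    return "allow", "within policy"
```

For example, with `allowed_collections` set to `["knowledge_base"]`, a document from `"internal-hr"` returns `("block", "collection not allowed")` even if its relevance score is high.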

Evaluation Order (after_workflow)

If require_source_diversity is true, count how many chunks come from each source
If any single source accounts for more than max_single_source_ratio of results → WARN
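
The diversity check comes down to a per-source share calculation. A minimal sketch, assuming the ratio is the number of chunks from a source divided by the total number of chunks; `dominant_sources` is a hypothetical helper, not part of the SDK:

```python
from collections import Counter


def dominant_sources(sources: list, max_single_source_ratio: float = 0.6) -> dict:
    """Return {source: share} for every source whose share exceeds the limit."""
    if not sources:
        return {}
    total = len(sources)
    counts = Counter(sources)
    return {
        source: count / total
        for source, count in counts.items()
        if count / total > max_single_source_ratio
    }
```

With three of four chunks from the same source, the share is 0.75, which exceeds the default limit of 0.6 and would correspond to a WARN.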

Rule Matching Reference

| Check | Condition | Result |
| --- | --- | --- |
| Relevance 0.85, threshold 0.70 | 0.85 ≥ 0.70 | Pass |
| Relevance 0.60, threshold 0.70 | 0.60 < 0.70 | `action_on_low_relevance` |
| Source age 45d, max 90d | 45 ≤ 90 | Pass |
| Source age 200d, max 90d | 200 > 90 | `action_on_stale_source` |
| Collection "knowledge_base", allowed ["knowledge_base"] | In list | Pass |
| Collection "internal-hr", allowed ["knowledge_base"] | Not in list | BLOCK |
| Source "deprecated-kb.pdf", blocked list includes it | Exact match | BLOCK |
| 3 of 4 chunks from same source, max ratio 0.6 | 0.75 > 0.6 | WARN (after_workflow) |

Chunk Count Action Is Configurable

By default, min_chunks and max_chunks violations produce WARN. Set action_on_chunk_violation: "block" to hard-stop agents that retrieve too many or too few documents. This is useful in regulated environments where retrieval pipeline health must be enforced.
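
As a rough model of how the configurable action applies, the sketch below maps a chunk count to an action. `check_chunk_count` is a hypothetical function. The "below minimum" and "within policy" reason strings follow the examples in the Observability section; the "above maximum" wording is an assumption.

```python
def check_chunk_count(chunk_count: int, rules: dict) -> tuple:
    """Return (action, reason) for the number of chunks recorded so far."""
    min_chunks = rules.get("min_chunks", 1)
    max_chunks = rules.get("max_chunks", 10)
    # Defaults to "warn"; set to "block" to hard-stop the agent.
    action = rules.get("action_on_chunk_violation", "warn")

    if chunk_count < min_chunks:
        return action, f"Retrieved chunks ({chunk_count}) below minimum ({min_chunks})"
    if chunk_count > max_chunks:
        # Assumed wording; the real reason string is not documented here.
        return action, f"Retrieved chunks ({chunk_count}) above maximum ({max_chunks})"
    return "allow", f"Retrieval quality within policy ({chunk_count} chunks)"
```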

allowed_collections Is Checked Per Document

If allowed_collections is ["knowledge_base"] and you retrieve from both "knowledge_base" and "internal-hr", the first result from "knowledge_base" passes, but the result from "internal-hr" triggers a BLOCK. The check fires per document, not per query.

Example Policies

Standard RAG Quality Gate

Enforce minimum relevance and freshness for a customer-facing knowledge base:

{
  "min_relevance_score": 0.75,
  "max_source_age_days": 90,
  "min_chunks": 1,
  "max_chunks": 8,
  "allowed_collections": ["knowledge_base", "product-docs"],
  "action_on_low_relevance": "warn",
  "action_on_stale_source": "block"
}

Strict Compliance Retrieval

Block on any quality violation — suitable for regulated environments where stale or low-quality sources cannot be used:

{
  "min_relevance_score": 0.80,
  "max_source_age_days": 30,
  "min_chunks": 2,
  "max_chunks": 5,
  "allowed_collections": ["compliance-docs"],
  "blocked_sources": ["deprecated-policy-archive"],
  "require_source_diversity": true,
  "max_single_source_ratio": 0.5,
  "action_on_low_relevance": "block",
  "action_on_stale_source": "block",
  "action_on_chunk_violation": "block"
}

Lenient Monitoring

Warn on all quality issues without blocking — useful for observing retrieval quality before enforcing stricter rules:

{
  "min_relevance_score": 0.5,
  "max_source_age_days": 365,
  "action_on_low_relevance": "warn",
  "action_on_stale_source": "warn"
}
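
The Lenient Monitoring policy sets only four keys. Assuming omitted rules fall back to the defaults in the Rules table (which this page implies but does not state outright), the effective rule set can be sketched as a dictionary merge. `DEFAULT_RULES` and `effective_rules` are hypothetical names used for illustration, and rejecting unknown keys is a design choice of this sketch, not documented behavior.

```python
# Defaults copied from the Rules table on this page.
DEFAULT_RULES = {
    "min_relevance_score": 0.7,
    "max_source_age_days": 90,
    "min_chunks": 1,
    "max_chunks": 10,
    "allowed_collections": [],
    "blocked_sources": [],
    "require_source_diversity": False,
    "max_single_source_ratio": 0.6,
    "action_on_low_relevance": "warn",
    "action_on_stale_source": "block",
    "action_on_chunk_violation": "warn",
}


def effective_rules(policy: dict) -> dict:
    """Overlay a policy on the defaults, rejecting misspelled rule names."""
    unknown = set(policy) - set(DEFAULT_RULES)
    if unknown:
        raise ValueError(f"Unknown retrieval rules: {sorted(unknown)}")
    return {**DEFAULT_RULES, **policy}


lenient = effective_rules({
    "min_relevance_score": 0.5,
    "max_source_age_days": 365,
    "action_on_low_relevance": "warn",
    "action_on_stale_source": "warn",
})
```

Under that assumption, the lenient policy still carries `max_chunks` of 10 and an empty `allowed_collections` list, so every collection is allowed.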

SDK Integration

Recording Retrieval Results

Call ctx.record_retrieval_result() once per retrieved document. Each call sends the data to the control plane and immediately triggers mid_execution governance.

import waxell_observe as waxell
from waxell_observe.errors import PolicyViolationError

async with waxell.WaxellContext(
    agent_name="rag-agent",
    enforce_policy=True,
) as ctx:
    # Record each retrieved document
    ctx.record_retrieval_result(
        relevance_score=0.92,
        source="product-manual.pdf",
        collection="knowledge_base",
        age_days=14,
    )
    ctx.record_retrieval_result(
        relevance_score=0.85,
        source="faq.pdf",
        collection="knowledge_base",
        age_days=30,
    )
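    # mid_execution governance has already run after each call above;
    # a violation whose action is "block" raises PolicyViolationError.

    # context_docs and query come from your own retrieval step (not shown)
    answer = synthesize_answer(context_docs, query)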
    ctx.set_result({"answer": answer})

Catching Policy Violations

try:
    async with waxell.WaxellContext(
        agent_name="rag-agent",
        enforce_policy=True,
    ) as ctx:
        for doc in retrieved_docs:
            ctx.record_retrieval_result(
                relevance_score=doc.score,
                source=doc.id,
                collection=doc.collection,
                age_days=doc.age_days,
            )
        answer = synthesize(retrieved_docs, query)
        ctx.set_result({"answer": answer})

except PolicyViolationError as e:
    # e.g. "Retrieved from blocked source 'deprecated-kb'"
    # or   "Source age (400 days) exceeds max (90 days)"
    # or   "Collection 'internal-hr' not in allowed list"
    return fallback_response(query)

Using the Decorator

@waxell.observe(
    agent_name="rag-agent",
    enforce_policy=True,
)
async def run_rag(query: str, docs: list[dict]):
    ctx = waxell.get_current_context()
    for doc in docs:
        ctx.record_retrieval_result(
            relevance_score=doc["score"],
            source=doc["source"],
            collection=doc["collection"],
            age_days=doc["age_days"],
        )
    return synthesize(docs, query)

Enforcement Flow

Agent runs vector search, retrieves N documents
    │
    ├── before_workflow governance runs
    │   └── Retrieval rules stored in context._retrieval_rules
    │
    ├── For each document retrieved:
    │   └── ctx.record_retrieval_result(relevance_score, source, collection, age_days)
    │       │
    │       └── mid_execution governance fires
    │           ├── chunk count within [min_chunks, max_chunks]?
    │           │   └── No: action_on_chunk_violation (warn or block)
    │           ├── relevance_score >= min_relevance_score?
    │           │   └── No: action_on_low_relevance (warn or block)
    │           ├── source in blocked_sources?
    │           │   └── Yes: BLOCK
    │           ├── collection in allowed_collections?
    │           │   └── No: BLOCK
    │           └── age_days <= max_source_age_days?
    │               └── No: action_on_stale_source (warn or block)
    │
    ├── Agent synthesizes answer from retrieved context
    │
    └── after_workflow governance fires
        └── require_source_diversity?
            └── Yes: check max_single_source_ratio across all sources
                └── Any source dominates? → WARN

Creating via Dashboard

Navigate to Governance > Policies
Click New Policy
Select category Retrieval
Configure relevance threshold, max age, and collection allowlist
Set scope to target specific agents (e.g., rag-agent)
Enable

Creating via API

curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://acme.waxell.dev/waxell/v1/policies/ \
  -d '{
    "name": "RAG Quality Guard",
    "category": "retrieval",
    "rules": {
      "min_relevance_score": 0.75,
      "max_source_age_days": 90,
      "min_chunks": 1,
      "max_chunks": 8,
      "allowed_collections": ["knowledge_base"],
      "action_on_low_relevance": "warn",
      "action_on_stale_source": "block"
    },
    "scope": {
      "agents": ["rag-agent"]
    },
    "enabled": true
  }'
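
The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above: `build_policy_request` is a hypothetical helper, the token value is a placeholder, and the response format is not documented on this page, so the example stops short of sending the request.

```python
import json
import urllib.request


def build_policy_request(base_url: str, token: str, policy: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a policy."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/waxell/v1/policies/",
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


policy = {
    "name": "RAG Quality Guard",
    "category": "retrieval",
    "rules": {
        "min_relevance_score": 0.75,
        "max_source_age_days": 90,
        "min_chunks": 1,
        "max_chunks": 8,
        "allowed_collections": ["knowledge_base"],
        "action_on_low_relevance": "warn",
        "action_on_stale_source": "block",
    },
    "scope": {"agents": ["rag-agent"]},
    "enabled": True,
}

# "example-token" is a placeholder; read the real token from your environment.
request = build_policy_request("https://acme.waxell.dev", "example-token", policy)
```

To send it, pass `request` to `urllib.request.urlopen()` inside a `with` block and handle `urllib.error.HTTPError` for non-2xx responses.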

Observability

Governance Tab

Retrieval evaluations appear with:

| Field | Example |
| --- | --- |
| Policy name | RAG Quality Guard |
| Phase | `mid_execution` or `after_workflow` |
| Action | `allow`, `warn`, or `block` |
| Category | `retrieval` |
| Reason | "Retrieval quality within policy (3 chunks)" |
| Metadata | `{"chunk_count": 3}` |

For violations:

| Violation | Reason | Metadata |
| --- | --- | --- |
| Low relevance | "Retrieval relevance (0.41) below threshold (0.70)" | `{"relevance_score": 0.41, "threshold": 0.70}` |
| Stale source | "Source age (400 days) exceeds max (90 days)" | `{"age_days": 400, "max_age": 90}` |
| Blocked collection | "Collection 'internal-hr' not in allowed list" | `{"collection": "internal-hr", "allowed": ["knowledge_base"]}` |
| Blocked source | "Retrieved from blocked source 'deprecated-kb'" | `{"blocked_source": "deprecated-kb"}` |
| Low chunk count | "Retrieved chunks (0) below minimum (1)" | `{"chunk_count": 0, "limit": 1}` |
| Source domination | "Source 'doc.pdf' dominates at 75% (max 60%)" | `{"warnings": ["..."]}` |

Trace Tab

Each ctx.record_retrieval_result() call produces a span under the parent vector_search tool span. You can inspect per-document metadata including relevance score, source, collection, and age.

Combining with Other Policies

The retrieval policy is commonly paired with:

Content policy — scan the synthesized answer for PII or harmful content after retrieval
Grounding policy — verify that the final answer stays grounded in the retrieved documents
Quality policy — score answer quality (coherence, factuality) after synthesis
Compliance policy — require all three of the above for regulated use cases

Common Gotchas

Empty allowed_collections means allow all. A non-configured allowlist is not an implicit block-all. Set at least one collection name to restrict access.
blocked_sources uses exact match. The source string must exactly match the value passed to record_retrieval_result(). Use consistent naming conventions across your retrieval pipeline.
mid_execution fires per document, stops at first violation. If your first retrieved document fails the collection check, the second document is never evaluated. You will only see one governance event per run.
Source diversity is after_workflow only. The require_source_diversity check runs after the agent completes, not during retrieval. It will not stop the agent mid-execution — it produces a warning in the governance audit.
age_days must be computed by your application. The SDK does not calculate document age automatically. You must compute (today - document_created_date).days and pass it to record_retrieval_result().
min_chunks and max_chunks default to WARN but can BLOCK. Set action_on_chunk_violation: "block" to enforce chunk count limits as hard stops. The default "warn" behavior records a governance event without stopping the agent.
action_on_low_relevance: "warn" still produces a governance event. The agent continues running but the warning is recorded in the trace and governance tab. Use this for monitoring before enforcing stricter rules.
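
For the age_days gotcha above, the computation can be sketched with the standard library. `age_in_days` is a hypothetical helper, and clamping future-dated documents to 0 is a choice of this sketch rather than documented SDK behavior.

```python
from datetime import date, datetime, timezone
from typing import Optional


def age_in_days(created: date, today: Optional[date] = None) -> int:
    """Whole days between a document's creation date and today (UTC by default)."""
    if today is None:
        today = datetime.now(timezone.utc).date()
    # Clamp to 0 so a document dated in the future never yields a negative age.
    return max((today - created).days, 0)
```

The result can be passed directly as `age_days` to `record_retrieval_result()`. If your metadata stores timestamps, convert them to dates first (for example with `.date()`), since mixing `date` and `datetime` in the subtraction raises a `TypeError`.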

Next Steps

Policy & Governance — How policy enforcement works
Grounding Policy — Verify answers stay grounded in retrieved context
Content Policy — Scan synthesized answers for harmful content
Quality Policy — Score answer quality after synthesis
Policy Categories & Templates — All 26 categories
