Guardrails Agent

A multi-agent guardrails validation pipeline with two cooperating agents: a guardrails-runner (input validation, output validation, and PII detection via @waxell.tool(tool_type="guardrail_validator")) and a guardrails-evaluator (reasoning over validation results, fix-strategy decisions, and scoring). Built with OpenAI and waxell-observe decorator patterns.

Environment variables

This example runs in dry-run mode by default (no API key needed). For live mode, set OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL.
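The mode switch can be made explicit; a minimal sketch (resolve_mode is a hypothetical helper, not part of the demo), assuming live mode requires all three variables to be set:

```python
def resolve_mode(env: dict) -> str:
    """Hypothetical helper: pick live mode only when every required key is present."""
    required = ("OPENAI_API_KEY", "WAXELL_API_KEY", "WAXELL_API_URL")
    return "live" if all(env.get(k) for k in required) else "dry-run"

import os
print(resolve_mode(dict(os.environ)))
```

Passing the environment as a plain dict keeps the helper trivially testable without mutating os.environ.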

Key Code

Guardrail validator tool decorators

The tool_type="guardrail_validator" value marks these as guardrails validation calls in the trace, showing which validators ran and whether they passed or required fixes.

@waxell.tool(tool_type="guardrail_validator")
def validate_input(query: str, validators: list) -> dict:
    """Run input validators against the query."""
    results = [{"validator": v, "passed": True, "action": "pass"} for v in validators]
    return {"framework": "guardrails_ai", "stage": "input", "all_passed": True, "results": results}

@waxell.tool(tool_type="guardrail_validator")
def validate_output(answer: str, validators: list) -> dict:
    """Run output validators against the generated answer."""
    results = [{"validator": v, "passed": True, "action": "pass"} for v in validators]
    return {"framework": "guardrails_ai", "stage": "output", "all_passed": True, "results": results}

@waxell.tool(tool_type="guardrail_validator")
def check_pii(text: str) -> dict:
    """Check text for PII and redact if found."""
    has_pii = "@" in text or "email" in text.lower()
    return {
        "framework": "guardrails_ai",
        "validator": "pii_detection",
        "passed": not has_pii,
        "action": "fix" if has_pii else "pass",
        "pii_found": ["email"] if has_pii else [],
    }
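When check_pii returns action "fix", the pipeline applies auto-redaction. One way to sketch that step in plain Python (redact_emails and its regex are illustrative, not the demo's implementation):

```python
import re

# Rough email pattern for illustration; real PII detection is more involved.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_emails(text: str) -> str:
    """Hypothetical fix step: mask anything that looks like an email address."""
    return EMAIL_RE.sub("<EMAIL_REDACTED>", text)

print(redact_emails("Reach me at alice@example.com today."))
# -> Reach me at <EMAIL_REDACTED> today.
```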

Validation level decision

The orchestrator decides how strict validation should be based on query content, recorded as a decision span.

@waxell.decision(
    name="select_validation_level",
    options=["strict", "standard", "relaxed"],
)
async def select_validation_level(query: str) -> dict:
    has_sensitive_terms = any(w in query.lower() for w in ["pii", "personal", "email"])
    has_safety_terms = any(w in query.lower() for w in ["safety", "guardrail", "protect"])

    if has_sensitive_terms:
        return {"chosen": "strict", "reasoning": "Query references sensitive data"}
    elif has_safety_terms:
        return {"chosen": "standard", "reasoning": "Safety-related query"}
    else:
        return {"chosen": "relaxed", "reasoning": "General query"}
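The branch order matters: sensitive-data terms take priority over safety terms, so a query mentioning both gets strict validation. Stripped of the decorator, the same logic can be exercised directly:

```python
def select_validation_level(query: str) -> dict:
    # Same branching as the demo's decision span, decorator omitted.
    has_sensitive = any(w in query.lower() for w in ("pii", "personal", "email"))
    has_safety = any(w in query.lower() for w in ("safety", "guardrail", "protect"))
    if has_sensitive:
        return {"chosen": "strict", "reasoning": "Query references sensitive data"}
    if has_safety:
        return {"chosen": "standard", "reasoning": "Safety-related query"}
    return {"chosen": "relaxed", "reasoning": "General query"}

# "protect" alone would be standard, but "PII" forces strict.
print(select_validation_level("How do you protect PII?")["chosen"])  # strict
```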

Evaluator with reasoning and inline decision

The evaluator reasons about combined validation results, then decides on a fix strategy using the inline waxell.decide() convenience function.

@waxell.observe(agent_name="guardrails-evaluator", workflow_name="guardrails-evaluation")
async def run_guardrails_evaluator(validation_results, answer, waxell_ctx=None):
    waxell.tag("agent_role", "evaluator")
    waxell.tag("framework", "guardrails_ai")

    # input_result, output_result, pii_result, and fixes_applied are
    # derived from validation_results earlier in the function (elided here).
    quality = await evaluate_validation_results(  # @reasoning
        input_result, output_result, pii_result
    )

    waxell.decide(
        "fix_strategy",
        chosen="auto_redact" if fixes_applied > 0 else "no_action",
        options=["auto_redact", "reject_output", "no_action"],
        reasoning=f"{fixes_applied} PII items found -- auto-redaction applied",
        confidence=0.92,
    )

    waxell.score("validation_pass_rate", 1.0, comment="All 7 validators passed or fixed")
    waxell.score("pii_clean", pii_result["passed"], data_type="boolean")
    return {"answer": answer, "validations_run": 7, "all_passed": True}
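End to end, the runner and evaluator reduce to a small decision chain. A plain-Python sketch of that flow under the demo's stub heuristics, with no waxell or LLM calls (run_pipeline is illustrative, not the demo's code):

```python
def run_pipeline(query: str, answer: str) -> dict:
    """Sketch of runner -> evaluator: stub validators pass, PII check drives the fix."""
    # Runner stage: same PII heuristic as the check_pii stub above.
    pii_found = "@" in answer or "email" in answer.lower()
    fixes_applied = 1 if pii_found else 0
    # Evaluator stage: same branching as the waxell.decide() call.
    strategy = "auto_redact" if fixes_applied > 0 else "no_action"
    return {"fix_strategy": strategy, "fixes_applied": fixes_applied}

print(run_pipeline("contact question", "Mail me at a@b.com"))
# {'fix_strategy': 'auto_redact', 'fixes_applied': 1}
```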

What this demonstrates

  • @waxell.tool(tool_type="guardrail_validator") -- three guardrails validation tools (input, output, PII) recorded with the guardrail_validator tool type for clear trace attribution.
  • @waxell.decision -- validation level selection (strict/standard/relaxed) based on query analysis, recorded with options and reasoning.
  • @waxell.reasoning_dec -- combined evaluation of all validation stages (input, output, PII) with structured thought, evidence, and conclusion.
  • waxell.decide() -- inline fix strategy decision (auto_redact/reject_output/no_action) using the top-level convenience function.
  • @waxell.step_dec -- validation config preparation recorded as an execution step.
  • waxell.score() with mixed types -- numeric pass rate plus boolean PII cleanliness score.
  • Auto-instrumented LLM calls -- OpenAI response generation captured automatically via waxell.init().
  • Nested @waxell.observe -- orchestrator is parent; guardrails-runner and guardrails-evaluator are child agents with automatic lineage.
  • Guardrails AI integration pattern -- shows how to wrap Guardrails AI validators with waxell-observe for full validation observability.

Run it

# Dry-run (no API key needed)
python -m app.demos.guardrails_agent --dry-run

# Live mode with OpenAI
OPENAI_API_KEY=sk-... python -m app.demos.guardrails_agent

Source

dev/waxell-dev/app/demos/guardrails_agent.py