NeMo Guardrails Agent

A multi-agent NVIDIA NeMo Guardrails pipeline that coordinates two agents: a nemo-runner, which executes Colang 2.0 input and output rails via @waxell.tool(tool_type="guardrail_rail"), and a nemo-evaluator, which reasons about the rail results, chooses an allow, block, or rephrase action, and scores the rail pass rate. The example demonstrates topical safety, jailbreak detection, fact checking, and hallucination detection rails.

Environment variables

This example runs in dry-run mode by default (no API key needed). For live mode, set OPENAI_API_KEY, WAXELL_API_KEY, and WAXELL_API_URL.
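
As a minimal sketch of the live-mode requirement above, the check below reports which of the three variables are still unset. The variable names come from the text; the helper functions `missing_live_vars` and `can_run_live` are hypothetical and are not part of the demo or of waxell-observe.

```python
import os

# Variable names are taken from the paragraph above; the helpers are hypothetical.
LIVE_MODE_VARS = ("OPENAI_API_KEY", "WAXELL_API_KEY", "WAXELL_API_URL")


def missing_live_vars(env=None):
    """Return the live-mode variables that are unset or empty, in declaration order."""
    env = os.environ if env is None else env
    return [name for name in LIVE_MODE_VARS if not env.get(name)]


def can_run_live(env=None):
    """True only when every live-mode variable has a non-empty value."""
    return not missing_live_vars(env)
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.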

Architecture

Key Code

NeMo rail execution tools

The tool_type="guardrail_rail" argument distinguishes NeMo rail operations from other guardrail frameworks in the trace.

@waxell.tool(tool_type="guardrail_rail")
def run_input_rail(rail_name: str, passed: bool, action: str, description: str) -> dict:
    """Execute a single NeMo input rail."""
    return {
        "framework": "nemo_guardrails",
        "rail_type": "input",
        "rail": rail_name,
        "passed": passed,
        "action": action,
        "description": description,
    }

@waxell.tool(tool_type="guardrail_rail")
def run_output_rail(rail_name: str, passed: bool, action: str, description: str) -> dict:
    """Execute a single NeMo output rail."""
    return {
        "framework": "nemo_guardrails",
        "rail_type": "output",
        "rail": rail_name,
        "passed": passed,
        "action": action,
        # Include the description so output rails match the input rail record.
        "description": description,
    }
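
To illustrate the shape of the records these tools return, here is a small sketch that builds two results and picks out the failed rails. It uses a plain, undecorated stand-in (`rail_result`, hypothetical) so that it runs without waxell-observe installed; the rail names, actions, and descriptions are illustrative values, not the demo's actual data.

```python
def rail_result(rail_type, rail_name, passed, action, description):
    """Plain stand-in mirroring the dict shape returned by the rail tools."""
    return {
        "framework": "nemo_guardrails",
        "rail_type": rail_type,
        "rail": rail_name,
        "passed": passed,
        "action": action,
        "description": description,
    }


# Illustrative values only; the real demo defines its own rails.
results = [
    rail_result("input", "jailbreak_detection", True, "allow", "No jailbreak pattern found"),
    rail_result("output", "fact_checking", False, "block", "Claim not supported by context"),
]

# Names of the rails that did not pass, in execution order.
failed = [r["rail"] for r in results if not r["passed"]]
```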

Rail evaluation with reasoning and decision

The evaluator assesses combined rail results and decides the final action for the content.

@waxell.reasoning_dec(step="rail_evaluation")
async def evaluate_rail_results(input_results: list, output_results: list, total_rails: int) -> dict:
    input_passed = sum(1 for r in input_results if r["passed"])
    output_passed = sum(1 for r in output_results if r["passed"])
    blocked = sum(1 for r in input_results + output_results if not r["passed"])

    return {
        "thought": f"Evaluated {total_rails} NeMo rails (Colang 2.0). ...",
        "evidence": [
            f"Input rails: {', '.join(r['rail'] for r in input_results)}",
            f"Output rails: {', '.join(r['rail'] for r in output_results)}",
        ],
        "conclusion": f"All {total_rails} rails passed" if blocked == 0 else f"{blocked} blocked",
    }
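
The excerpt does not show how the final allow/block/rephrase action is chosen. The sketch below shows one possible policy, stated here as an assumption rather than the demo's actual logic: a failed input rail blocks the request, a failed output rail asks for a rephrase, and anything else is allowed. The function name `decide_action` is hypothetical.

```python
def decide_action(input_results, output_results):
    """Pick a final action from rail results.

    Assumed policy (not taken from the demo source):
    failed input rail -> "block", failed output rail -> "rephrase", else "allow".
    """
    if any(not r["passed"] for r in input_results):
        return "block"
    if any(not r["passed"] for r in output_results):
        return "rephrase"
    return "allow"
```

Checking input rails first reflects the idea that a rejected prompt should never reach the rephrase path; a different ordering would be equally easy to express.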

What this demonstrates

  • @waxell.tool(tool_type="guardrail_rail") -- 5 NeMo rails (2 input, 3 output) recorded with per-rail pass/fail and descriptions.
  • @waxell.step_dec -- rails config preparation with Colang 2.0 metadata.
  • @waxell.decision -- rail mode selection (full_rails/input_only/output_only).
  • @waxell.reasoning_dec -- combined evaluation of all rail results.
  • waxell.decide() -- inline allow/block/rephrase action decision.
  • waxell.score() -- numeric rail pass rate plus boolean all_rails_passed.
  • Auto-instrumented LLM calls -- OpenAI response generation captured automatically.
  • Nested @waxell.observe -- orchestrator is parent; nemo-runner and nemo-evaluator are child agents.
  • NeMo Guardrails integration pattern -- shows how to wrap NeMo Colang rails with waxell-observe.
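
The pass-rate score and the all_rails_passed flag mentioned in the list above can be derived from the rail result dicts roughly as follows. This is a sketch in plain Python: the helper `rail_scores` and the key name `rail_pass_rate` are assumptions, and the call that records the values with waxell.score() is omitted because its signature is not shown here.

```python
def rail_scores(results):
    """Compute the fraction of rails that passed and an all-passed flag.

    An empty result list yields a rate of 0.0 and all_rails_passed=False,
    a design choice made for this sketch.
    """
    total = len(results)
    passed = sum(1 for r in results if r["passed"])
    return {
        "rail_pass_rate": passed / total if total else 0.0,
        "all_rails_passed": total > 0 and passed == total,
    }
```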

Run it

# Dry-run (no API key needed)
python -m app.demos.nemo_guardrails_agent --dry-run

# Live mode with OpenAI
OPENAI_API_KEY=sk-... python -m app.demos.nemo_guardrails_agent

Source

dev/waxell-dev/app/demos/nemo_guardrails_agent.py