Reasoning Policy

The reasoning policy category governs the quality and fairness of agent decisions. It enforces that decisions are explained, consider multiple alternatives, meet confidence thresholds, and show no bias with respect to protected attributes.

Use it when you need auditability for agent decision-making — routing, classification, prioritization, or any step where your agent makes a choice that affects users or downstream systems.

Rules

| Rule | Type | Default | Description |
|------|------|---------|-------------|
| require_explanation | boolean | false | Require a non-empty explanation for every recorded decision |
| explanation_min_length | integer | 50 | Minimum character length for decision explanations |
| require_alternatives_considered | boolean | false | Require that multiple options were evaluated |
| min_alternatives | integer | 2 | Minimum number of alternatives that must be listed |
| confidence_required | boolean | false | Require a confidence score on every decision |
| min_decision_confidence | number | 0.7 | Minimum acceptable confidence (0.0–1.0) |
| bias_detection.enabled | boolean | false | Enable bias detection on reasoning outputs |
| bias_detection.protected_attributes | string[] | [] | Attributes to monitor (e.g. ["gender", "race", "age"]) |
| bias_detection.action | string | "warn" | Action on bias detected: "warn" or "block" |
| decision_audit_trail | boolean | false | Warn if no decisions were recorded during the workflow |
| max_reasoning_depth | integer | 10 | Maximum reasoning chain depth |
| action_on_violation | string | "warn" | Default action for explanation/alternatives/confidence violations: "warn" or "block" |

How It Works

The reasoning handler runs at two phases:

  • mid_execution — checks recorded decisions. Governance evaluates buffered decision data at the next auto-instrumented event (LLM call, step) or at workflow completion.
  • after_workflow — checks the full session for bias flags and audit trail completeness after the agent finishes.

Evaluation Order (mid_execution)

  1. Check max_reasoning_depth against context.reasoning_depth
  2. For each decision in context.decisions:
    • require_explanation: explanation must exist and be at least explanation_min_length chars
    • require_alternatives_considered: options list must have at least min_alternatives entries
    • confidence_required: confidence must be present and >= min_decision_confidence
  3. Any violation → action_on_violation (warn or block)
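The evaluation order above can be sketched in plain Python. This is an illustrative reimplementation, not the SDK's actual handler; the rule names and violation messages mirror this page, but the function and its data shapes are assumptions.

```python
# Sketch of the mid_execution evaluation order (illustrative, not SDK code).
# `rules` is the policy's rules dict; `context` holds reasoning_depth and decisions.

def evaluate_mid_execution(rules, context):
    """Return (action, violations) following the documented check order."""
    violations = []

    # 1. Reasoning depth is checked once, before any decisions.
    if context.get("reasoning_depth", 0) > rules.get("max_reasoning_depth", 10):
        violations.append(("max_reasoning_depth", "Reasoning depth exceeded"))

    # 2. Each recorded decision is checked against the three decision rules.
    for d in context.get("decisions", []):
        if rules.get("require_explanation"):
            explanation = d.get("explanation") or ""
            min_len = rules.get("explanation_min_length", 50)
            if len(explanation) < min_len:
                violations.append((
                    "require_explanation",
                    f"Decision explanation too short ({len(explanation)}/{min_len} chars)",
                ))
        if rules.get("require_alternatives_considered"):
            alts = d.get("alternatives") or []
            min_alts = rules.get("min_alternatives", 2)
            if len(alts) < min_alts:
                violations.append((
                    "require_alternatives_considered",
                    f"Alternatives considered ({len(alts)}) below minimum ({min_alts})",
                ))
        if rules.get("confidence_required"):
            conf = d.get("confidence")
            threshold = rules.get("min_decision_confidence", 0.7)
            # A missing (None) confidence is skipped; only a present value
            # below the threshold violates.
            if conf is not None and conf < threshold:
                violations.append((
                    "confidence_required",
                    f"Decision confidence ({conf:.2f}) below threshold ({threshold:.2f})",
                ))

    # 3. Any violation resolves to the single action_on_violation setting.
    action = rules.get("action_on_violation", "warn") if violations else "allow"
    return action, violations
```

Note that one `action_on_violation` covers all three decision checks, matching the callout further down.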

Evaluation Order (after_workflow)

  1. bias_detection.enabled: if any bias_flags are recorded → bias_detection.action
  2. decision_audit_trail: if enabled and no decisions recorded → WARN
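The after_workflow checks are simpler and can be sketched the same way (again an illustrative assumption, not the real handler):

```python
# Sketch of the after_workflow checks (illustrative, not SDK code).
# Bias flags are checked first, then audit-trail completeness.

def evaluate_after_workflow(rules, context):
    """Return (action, reason) for the after_workflow phase."""
    bias = rules.get("bias_detection", {})
    if bias.get("enabled") and context.get("bias_flags"):
        # Bias findings resolve to bias_detection.action ("warn" or "block").
        flags = ", ".join(context["bias_flags"])
        return bias.get("action", "warn"), f"Bias detected: {flags}"
    if rules.get("decision_audit_trail") and not context.get("decisions"):
        # Audit-trail gaps are always a soft WARN, never a block.
        return "warn", "Decision audit trail enabled but no decisions recorded"
    return "allow", None
```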

What Gets Checked

The reasoning handler reads context.decisions — a list populated by ctx.record_decision(). Each entry looks like:

{
    "decision": "route_query",                          # decision name
    "explanation": "Query is factual...",               # maps from reasoning parameter
    "alternatives": ["search", "generate", "clarify"],  # maps from options
    "confidence": 0.9,
}

Note: record_decision(reasoning=...) populates explanation, and record_decision(options=...) populates alternatives. The parameter names differ from the internal field names checked by the handler.
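The mapping can be made concrete with a small helper. This function is hypothetical, written only to mirror the documented parameter-to-field mapping; the SDK performs the equivalent internally.

```python
# Illustrative only: how record_decision() arguments map onto the internal
# fields the reasoning handler checks. The real mapping lives inside the SDK.

def to_decision_entry(name, options, chosen, reasoning, confidence=None):
    """Build a decision entry shaped like the context.decisions example above."""
    return {
        "decision": name,          # from the name parameter
        "explanation": reasoning,  # reasoning -> explanation (require_explanation)
        "alternatives": options,   # options -> alternatives (min_alternatives)
        "confidence": confidence,  # checked against min_decision_confidence
        "chosen": chosen,          # recorded in the trace span, not checked by these rules
    }
```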

Rule Reference

| Check | Condition | Result |
|-------|-----------|--------|
| explanation "", min_length 50 | 0 < 50 | action_on_violation |
| explanation "Query is factual, high confidence" (35 chars), min_length 50 | 35 < 50 | action_on_violation |
| explanation 80 chars, min_length 50 | 80 ≥ 50 | Pass |
| 1 alternative, min_alternatives 2 | 1 < 2 | action_on_violation |
| 3 alternatives, min_alternatives 2 | 3 ≥ 2 | Pass |
| confidence 0.45, min 0.70 | 0.45 < 0.70 | action_on_violation |
| confidence 0.85, min 0.70 | 0.85 ≥ 0.70 | Pass |
| bias_flag "gender_bias", bias_detection.action "block" | Flag present | BLOCK at after_workflow |
Bias Detection Runs at after_workflow, Not mid_execution

record_bias_flag() records flags during execution, but the bias check runs after the agent completes. This means bias detection never stops the agent mid-stream — it produces a block or warning in the governance audit after the workflow finishes.

action_on_violation Applies to Explanation, Alternatives, and Confidence

The single action_on_violation rule controls all three mid_execution checks. You cannot set different actions for each violation type. Use bias_detection.action separately for bias control.
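For example, a hypothetical policy (rule names taken from the table above) can block on any decision-quality violation while only warning on bias findings:

```json
{
  "require_explanation": true,
  "confidence_required": true,
  "min_decision_confidence": 0.7,
  "action_on_violation": "block",
  "bias_detection": {
    "enabled": true,
    "protected_attributes": ["gender", "race"],
    "action": "warn"
  }
}
```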

Example Policies

Minimal Explainability

Require explanations for all decisions without blocking:

{
  "require_explanation": true,
  "explanation_min_length": 30,
  "action_on_violation": "warn"
}

Full Decision Audit

Enforce explanation, alternatives, and confidence for regulated decision-making:

{
  "require_explanation": true,
  "explanation_min_length": 50,
  "require_alternatives_considered": true,
  "min_alternatives": 2,
  "confidence_required": true,
  "min_decision_confidence": 0.75,
  "decision_audit_trail": true,
  "action_on_violation": "block"
}

Bias Detection with Block

Block any workflow where a protected attribute bias was flagged:

{
  "bias_detection": {
    "enabled": true,
    "protected_attributes": ["gender", "race", "age", "religion"],
    "action": "block"
  },
  "require_explanation": true,
  "explanation_min_length": 50,
  "action_on_violation": "warn"
}

Monitoring Mode

Record everything without blocking — useful during initial rollout:

{
  "require_explanation": true,
  "explanation_min_length": 20,
  "require_alternatives_considered": true,
  "min_alternatives": 2,
  "confidence_required": true,
  "min_decision_confidence": 0.6,
  "bias_detection": {
    "enabled": true,
    "protected_attributes": ["gender", "race"],
    "action": "warn"
  },
  "decision_audit_trail": true,
  "action_on_violation": "warn"
}

SDK Integration

Recording Decisions

Call ctx.record_decision() for each decision point in your agent. The reasoning handler reads context.decisions — a list populated by these calls.

import waxell_observe as waxell
from waxell_observe.errors import PolicyViolationError

async with waxell.WaxellContext(
    agent_name="routing-agent",
    enforce_policy=True,
) as ctx:
    # Record a routing decision
    ctx.record_decision(
        name="route_query",
        options=["search", "generate", "clarify"],
        chosen="search",
        reasoning="Query is factual and requires retrieval from knowledge base. Direct generation would risk hallucination.",
        confidence=0.9,
    )

    # Governance evaluates the decision at the next auto-instrumented
    # event (e.g., the LLM call below) or at workflow completion
    result = await execute_search(query)
    ctx.set_result(result)

Recording Reasoning Steps

Use ctx.record_reasoning() to capture chain-of-thought steps. These appear in the trace but do not trigger mid_execution governance — they are supporting evidence for the decisions.

async with waxell.WaxellContext(
    agent_name="routing-agent",
    enforce_policy=True,
) as ctx:
    # Record chain-of-thought first
    ctx.record_reasoning(
        step="evaluate",
        thought="The evidence supports a factual retrieval approach over direct generation.",
        evidence=["query_type: factual", "knowledge_base: available", "confidence: high"],
        conclusion="Proceed with search",
    )

    # Then record the decision
    ctx.record_decision(
        name="route_query",
        options=["search", "generate", "clarify"],
        chosen="search",
        reasoning="Query is factual and requires retrieval from knowledge base.",
        confidence=0.9,
    )

    # Governance evaluates the decision at the next auto-instrumented event
    # or at workflow completion

Recording Bias Flags

Call ctx.record_bias_flag() when your application detects a bias signal. The reasoning handler checks context.bias_flags at after_workflow.

async with waxell.WaxellContext(
    agent_name="hiring-agent",
    enforce_policy=True,
) as ctx:
    result = await evaluate_candidate(application)

    # Your bias detection logic
    if detected_gender_reference(result):
        ctx.record_bias_flag(flag="gender_bias")

    # Record the decision separately
    ctx.record_decision(
        name="candidate_evaluation",
        options=["advance", "reject", "hold"],
        chosen=result.recommendation,
        reasoning=result.explanation,
        confidence=result.confidence,
    )

    ctx.set_result(result)
    # after_workflow governance fires on __aexit__
    # If bias_flags is non-empty and bias_detection.action is "block" → PolicyViolationError

Catching Policy Violations

try:
    async with waxell.WaxellContext(
        agent_name="routing-agent",
        enforce_policy=True,
    ) as ctx:
        ctx.record_decision(
            name="route_query",
            options=["search"],  # Only 1 option — violates min_alternatives: 2
            chosen="search",
            reasoning="",  # Empty — violates require_explanation
            confidence=0.45,  # Below min_decision_confidence: 0.7
        )
        result = await do_work()
        ctx.set_result(result)

except PolicyViolationError as e:
    # e.g. "Decision explanation too short (0/50 chars)"
    # or "Alternatives considered (1) below minimum (2)"
    # or "Decision confidence (0.45) below threshold (0.70)"
    # or "Bias detected: gender_bias"
    return fallback_response()

Enforcement Flow

Agent starts (WaxellContext.__aenter__)
│
└── before_workflow governance runs
    └── Reasoning rules stored in context._reasoning_rules

Agent makes a decision
│
├── ctx.record_decision(name, options, chosen, reasoning, confidence)
│   └── Data buffered for governance evaluation
│
└── Next auto-instrumented event (LLM call, step) or workflow completion
    │
    └── mid_execution governance fires
        ├── reasoning_depth > max_reasoning_depth? → action_on_violation
        ├── require_explanation?
        │   └── explanation length < explanation_min_length? → action_on_violation
        ├── require_alternatives_considered?
        │   └── len(options) < min_alternatives? → action_on_violation
        └── confidence_required?
            └── confidence < min_decision_confidence? → action_on_violation

Agent may call ctx.record_bias_flag(flag=...) if bias detected

Agent finishes (WaxellContext.__aexit__)
│
└── after_workflow governance fires
    ├── bias_detection.enabled?
    │   └── bias_flags non-empty? → bias_detection.action (warn or block)
    └── decision_audit_trail?
        └── no decisions recorded? → WARN

Creating via Dashboard

  1. Navigate to Governance > Policies
  2. Click New Policy
  3. Select category Reasoning
  4. Enable require_explanation and set explanation_min_length
  5. Enable require_alternatives_considered and set min_alternatives
  6. Enable confidence_required and set min_decision_confidence
  7. Configure bias_detection if needed
  8. Set scope to target specific agents (e.g., routing-agent)
  9. Enable

Creating via API

curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://acme.waxell.dev/waxell/v1/policies/ \
  -d '{
    "name": "Decision Explainability",
    "category": "reasoning",
    "rules": {
      "require_explanation": true,
      "explanation_min_length": 50,
      "require_alternatives_considered": true,
      "min_alternatives": 2,
      "confidence_required": true,
      "min_decision_confidence": 0.75,
      "bias_detection": {
        "enabled": true,
        "protected_attributes": ["gender", "race", "age"],
        "action": "warn"
      },
      "decision_audit_trail": true,
      "action_on_violation": "warn"
    },
    "scope": {
      "agents": ["routing-agent"]
    },
    "enabled": true
  }'

Observability

Governance Tab

Reasoning evaluations appear with:

| Field | Example |
|-------|---------|
| Policy name | Decision Explainability |
| Phase | mid_execution or after_workflow |
| Action | allow, warn, or block |
| Category | reasoning |
| Reason | "Reasoning within policy (2 decisions)" |
| Metadata | {"decision_count": 2} |

For violations:

| Violation | Reason | Metadata |
|-----------|--------|----------|
| Short explanation | "Decision explanation too short (0/50 chars)" | {"explanation_length": 0, "min_length": 50} |
| Too few alternatives | "Alternatives considered (1) below minimum (2)" | {"alternatives_count": 1, "min_required": 2} |
| Low confidence | "Decision confidence (0.45) below threshold (0.70)" | {"confidence": 0.45, "threshold": 0.70} |
| Bias detected | "Bias detected: gender_bias" | {"bias_flags": ["gender_bias"], "protected_attributes": [...]} |
| No audit trail | "Decision audit trail enabled but no decisions recorded" | {"warnings": [...]} |

Trace Tab

Each ctx.record_decision() call produces a decision span with:

  • input_data: {"options": [...], "reasoning": "..."}
  • output_data: {"chosen": "...", "confidence": 0.9}

Each ctx.record_reasoning() call produces a reasoning span with the thought, evidence, and conclusion.

Combining with Other Policies

The reasoning policy is commonly paired with:

  • Audit policy — log all decisions and reasoning steps for compliance records
  • Content policy — scan the final response for harmful content after decisions are made
  • Retrieval policy — ensure retrieved evidence meets quality standards before it informs decisions
  • Compliance policy — require reasoning + audit + content policies together for regulated workflows

Common Gotchas

  1. record_decision(options=...) maps to alternatives internally. The handler checks decision["alternatives"], which is populated from the options parameter you pass to record_decision(). Passing a single-item options list will fail the min_alternatives check even if your agent considered other paths.

  2. record_decision(reasoning=...) maps to explanation internally. The handler checks decision["explanation"], which comes from the reasoning parameter. An empty reasoning="" is an empty explanation.

  3. Bias detection is after_workflow only. record_bias_flag() records the flag immediately, but the governance check runs when the agent finishes. A bias block raises PolicyViolationError from WaxellContext.__aexit__, not from the record_bias_flag() call itself.

  4. action_on_violation applies to all three mid_execution checks. You cannot block on missing explanation but warn on low confidence — it is a single setting for all three. Use bias_detection.action separately for bias control.

  5. decision_audit_trail is a soft check. When enabled and no decisions were recorded, it produces a WARN rather than a block. It is designed to catch instrumentation gaps, not to enforce decision recording.

  6. max_reasoning_depth checks context.reasoning_depth, not the number of record_reasoning() calls. The depth counter is incremented by the SDK's internal delegation depth tracking, not by manual reasoning steps.

  7. Confidence is optional per decision. If confidence_required: true but a specific decision has confidence=None, the confidence check is skipped for that decision. Only non-None confidence values below the threshold trigger a violation.

  8. record_decision() buffers data for later evaluation. Decision governance fires at the next auto-instrumented event (LLM call, tool call, step recording) or at workflow completion. If your agent makes a decision and then immediately makes an LLM call, governance evaluates the decision data at that point.

  9. record_bias_flag() flushes immediately. Unlike record_decision(), calling record_bias_flag() sends data to the controlplane right away. However, bias detection is checked at after_workflow, not mid_execution, so the timing rarely matters in practice.
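The confidence-skip behavior in gotcha 7 is easy to get wrong, so here it is as a standalone sketch (an illustrative function, not the SDK's check):

```python
# Sketch of the per-decision confidence check from gotcha 7 (illustrative).
# With confidence_required enabled, None is skipped entirely; only a present
# value below the threshold counts as a violation.

def confidence_violation(decision, threshold=0.7):
    """Return True only when a present confidence falls below the threshold."""
    conf = decision.get("confidence")
    if conf is None:
        return False  # check skipped for this decision
    return conf < threshold
```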

Next Steps