Identity Policy
The identity policy category governs how agents represent themselves. It enforces two distinct rules:
- AI disclosure -- every agent response must include a configurable disclosure text identifying it as AI-generated
- Impersonation prevention -- agent outputs must not match patterns that claim human identity or professional roles
Use this policy when agents interact with end-users and transparency is required by your product standards, terms of service, or applicable regulations.
Rules
| Rule | Type | Default | Description |
|---|---|---|---|
| `require_ai_disclosure` | boolean | `true` | Require the disclosure text to appear in the agent's output |
| `disclosure_text` | string | `"This response was generated by an AI assistant."` | Exact text that must appear in the output (substring match, case-insensitive) |
| `disclosure_position` | string | `"prepend"` | Hint for placement: `"prepend"`, `"append"`, or `"footer"`. Used in warning metadata only -- the handler does not inject text |
| `prevent_impersonation` | boolean | `true` | Block outputs that match configured impersonation patterns |
| `impersonation_patterns` | string[] | see defaults | Substring patterns that indicate the agent is claiming to be human |
| `agent_identity_header` | boolean | `true` | Tag governance metadata with an identity header marker for trace inspection |
Default Impersonation Patterns
"I am a human"
"I am not an AI"
"I am not a bot"
"I am a real person"
Any string in `impersonation_patterns` is checked as a case-insensitive substring against the output text.
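This matching rule can be sketched in a few lines of Python (an illustrative helper, not the handler's actual implementation):

```python
DEFAULT_IMPERSONATION_PATTERNS = [
    "I am a human",
    "I am not an AI",
    "I am not a bot",
    "I am a real person",
]

def find_impersonation(text, patterns=DEFAULT_IMPERSONATION_PATTERNS):
    """Return the first matching pattern, or None.

    Matching is a case-insensitive substring test, mirroring the
    documented behavior.
    """
    lowered = text.lower()
    for pattern in patterns:
        if pattern.lower() in lowered:
            return pattern
    return None

print(find_impersonation("Hello! I am a HUMAN customer rep."))  # → "I am a human"
print(find_impersonation("I am an AI assistant."))              # → None
```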
How It Works
The identity handler runs at all three enforcement phases, with different checks at each:
| Phase | What It Checks | Violation Action |
|---|---|---|
| before_workflow | Stores identity rules into context | Always ALLOW (setup only) |
| mid_execution | Scans intermediate_outputs for impersonation patterns | BLOCK on match |
| after_workflow | Checks output_text for impersonation and AI disclosure | BLOCK on impersonation; WARN on missing disclosure |
Enforcement Detail
mid_execution: Each time `ctx.record_intermediate_output(output=...)` is called, the handler scans the accumulated list of intermediate outputs. If any output matches an impersonation pattern, the agent is blocked immediately.
after_workflow: Runs two checks in sequence:
- Impersonation check on `output_text` -- BLOCK if matched (same patterns, highest priority)
- Disclosure check on `output_text` -- WARN if `disclosure_text` is not found as a substring
Note: disclosure violations produce WARN, not BLOCK. The agent completes but a governance incident is recorded. Impersonation violations always produce BLOCK.
There is no configurable `action_on_violation` field -- impersonation is always blocked, and disclosure is always warned.
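The ordering above can be sketched as follows (an illustrative function based on the documented behavior, not the handler's source; the rule names mirror the config keys):

```python
def evaluate_after_workflow(output_text, rules):
    """Sketch of the documented after_workflow ordering: impersonation
    is checked first and blocks; a missing disclosure only warns."""
    lowered = output_text.lower()

    # 1. Impersonation check -- BLOCK takes priority over everything else.
    if rules.get("prevent_impersonation", True):
        for pattern in rules.get("impersonation_patterns", []):
            if pattern.lower() in lowered:
                return ("BLOCK", f"matched pattern {pattern!r}")

    # 2. Disclosure check -- absence only produces a WARN.
    if rules.get("require_ai_disclosure", True):
        disclosure = rules.get("disclosure_text", "")
        if disclosure and disclosure.lower() not in lowered:
            return ("WARN", "AI disclosure text not found in output")

    return ("ALLOW", "Identity audit passed")
```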
Pattern Matching
| Pattern in Config | Output Text | Result |
|---|---|---|
"I am a human" | "Hello! I am a human customer rep with 5 years experience." | BLOCK -- substring match (case-insensitive) |
"I am not an AI" | "I am not an ai, I promise." | BLOCK -- case-insensitive match |
"I am a human" | "I am an AI assistant here to help." | ALLOW -- pattern not found |
"I am a doctor" | "As your doctor, I recommend..." | BLOCK -- matches custom pattern |
Disclosure matching works the same way:
| Configured disclosure_text | Output | Result |
|---|---|---|
| "This response was generated by an AI assistant." | "...answer text...\n\nThis response was generated by an AI assistant." | ALLOW |
| "This response was generated by an AI assistant." | "...answer text only..." | WARN |
| "AI-generated content" | "Note: AI-generated content below." | ALLOW |
The identity handler checks for disclosure text in the output -- it does not add it automatically. Your agent code (or your LLM prompt) must include the configured `disclosure_text` in every response. Use `ctx.set_output_text(text=...)` to tell the handler what text to check.
SDK Integration
Using the Context Manager
```python
import waxell_observe as waxell
from waxell_observe.errors import PolicyViolationError

waxell.init()

async def answer(query: str):
    try:
        async with waxell.WaxellContext(
            agent_name="support-agent",
            enforce_policy=True,
        ) as ctx:
            # Generate intermediate reasoning
            reasoning = await think_about(query)

            # Record it -- mid_execution checks this for impersonation
            ctx.record_intermediate_output(output=reasoning)

            # Generate final response (must include disclosure text)
            response = await generate_response(query)
            response_with_disclosure = (
                response + "\n\nThis response was generated by an AI assistant."
            )

            # Set output text -- after_workflow checks this
            ctx.set_output_text(text=response_with_disclosure)
            ctx.set_result({"response": response_with_disclosure})
    except PolicyViolationError as e:
        print(f"Identity block: {e}")
        # e.g. "Impersonation detected in intermediate output: matched pattern 'I am a human'"
```
What Gets Checked
| SDK Call | What the Handler Reads | Phase |
|---|---|---|
| `ctx.record_intermediate_output(output=...)` | `context.intermediate_outputs` | mid_execution |
| `ctx.set_output_text(text=...)` | `context.output_text` | after_workflow |
If `set_output_text` is never called, the handler falls back to `str(result)` from `ctx.set_result()`.
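The fallback can be pictured like this (a sketch; the attribute names are assumptions about the context object, not the SDK's internals):

```python
from types import SimpleNamespace

def resolve_output_text(context):
    """Prefer the explicitly set output_text; otherwise fall back to
    str(result). Attribute names are illustrative, not the SDK's."""
    output_text = getattr(context, "output_text", None)
    if output_text:
        return output_text
    return str(getattr(context, "result", None))

ctx = SimpleNamespace(output_text=None, result={"response": "hi"})
print(resolve_output_text(ctx))  # falls back to str(result)
```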
Example Policies
Strict AI Disclosure with Human Impersonation Block
```json
{
  "require_ai_disclosure": true,
  "disclosure_text": "This response was generated by an AI assistant.",
  "disclosure_position": "append",
  "prevent_impersonation": true,
  "impersonation_patterns": [
    "I am a human",
    "I am not an AI",
    "I am not a bot",
    "I am a real person"
  ],
  "agent_identity_header": true
}
```
Medical / Legal Role Impersonation Prevention
```json
{
  "require_ai_disclosure": true,
  "disclosure_text": "This information is AI-generated and not professional advice.",
  "disclosure_position": "footer",
  "prevent_impersonation": true,
  "impersonation_patterns": [
    "I am a doctor",
    "I am a lawyer",
    "I am a licensed",
    "As your physician",
    "As your attorney",
    "I am a human",
    "I am not an AI"
  ],
  "agent_identity_header": true
}
```
Disclosure Only (No Impersonation Check)
```json
{
  "require_ai_disclosure": true,
  "disclosure_text": "Generated by AI.",
  "disclosure_position": "footer",
  "prevent_impersonation": false,
  "impersonation_patterns": [],
  "agent_identity_header": false
}
```
Enforcement Flow
```
before_workflow
│
└── Store rules into context._identity_rules → ALLOW

mid_execution (runs each time record_intermediate_output() is called)
│
├── prevent_impersonation disabled? → ALLOW
├── impersonation_patterns empty? → ALLOW
└── Scan each intermediate output for each pattern
    ├── Pattern found? → BLOCK
    └── No match → ALLOW

after_workflow
│
├── Check output_text for impersonation (prevent_impersonation=true)
│   └── Pattern found? → BLOCK (highest priority)
│
├── Check output_text for disclosure text (require_ai_disclosure=true)
│   ├── disclosure_text not found in output? → WARN
│   └── Found → continue
│
├── Tag metadata with agent_identity_header (if enabled)
│
└── Any warnings? → WARN result
    No warnings → ALLOW result
```
Creating via Dashboard
- Navigate to Governance > Policies
- Click New Policy
- Select category Identity
- Configure `disclosure_text` to match what your prompts produce
- Add impersonation patterns appropriate for your use case
- Set scope to target specific agents (e.g., `support-agent`)
- Enable the policy
Creating via API
```shell
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  https://acme.waxell.dev/waxell/v1/policies/ \
  -d '{
    "name": "AI Disclosure Policy",
    "category": "identity",
    "rules": {
      "require_ai_disclosure": true,
      "disclosure_text": "This response was generated by an AI assistant.",
      "disclosure_position": "append",
      "prevent_impersonation": true,
      "impersonation_patterns": [
        "I am a human",
        "I am not an AI",
        "I am not a bot",
        "I am a real person"
      ],
      "agent_identity_header": true
    },
    "scope": {
      "agents": ["support-agent"]
    },
    "enabled": true
  }'
```
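If you prefer Python over curl, the same request can be sketched with only the standard library (this assumes the same endpoint and a `TOKEN` environment variable, as in the curl example above):

```python
import json
import os
import urllib.request

# Same policy body as the curl example.
payload = {
    "name": "AI Disclosure Policy",
    "category": "identity",
    "rules": {
        "require_ai_disclosure": True,
        "disclosure_text": "This response was generated by an AI assistant.",
        "disclosure_position": "append",
        "prevent_impersonation": True,
        "impersonation_patterns": [
            "I am a human",
            "I am not an AI",
            "I am not a bot",
            "I am a real person",
        ],
        "agent_identity_header": True,
    },
    "scope": {"agents": ["support-agent"]},
    "enabled": True,
}

request = urllib.request.Request(
    "https://acme.waxell.dev/waxell/v1/policies/",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('TOKEN', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(request)  # uncomment to actually send the request
```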
Observability
Governance Tab
Identity evaluations appear with:
| Field | Example |
|---|---|
| Phase | mid_execution or after_workflow |
| Action | allow, warn, or block |
| Category | identity |
| Reason | "Identity audit passed" |
For impersonation blocks:
| Field | Example |
|---|---|
| Reason | "Impersonation detected in final output: matched pattern 'I am a human'" |
| Metadata | {"matched_pattern": "I am a human", "output_preview": "...first 200 chars..."} |
For disclosure warnings:
| Field | Example |
|---|---|
| Reason | "AI disclosure text not found in output" |
| Metadata | {"expected_disclosure": "This response was generated by an AI assistant.", "disclosure_position": "append"} |
Common Gotchas
- The handler does not inject disclosure text. It only checks whether the text is present. Your LLM prompts must instruct the model to include the configured `disclosure_text` verbatim.
- Disclosure violations produce WARN, not BLOCK. The agent is not stopped. Use impersonation prevention to block harmful outputs; use disclosure to audit compliance.
- `disclosure_position` is metadata only. Setting it to `"append"` does not move text around -- it appears in the governance incident metadata as a hint for your team.
- `set_output_text` takes precedence over `set_result`. If you call `ctx.set_output_text(text=...)`, the handler uses that. If you only call `ctx.set_result(...)`, the handler falls back to `str(result)`. Always call `set_output_text` explicitly.
- Pattern matching is case-insensitive substring. "I am a human" will match "i am a human assistant", "Hello, I am a human!", and any variation of case.
- Empty `impersonation_patterns` disables impersonation checks entirely. Setting `prevent_impersonation: true` with no patterns produces no blocks.
- mid_execution fires on every intermediate output call. If your agent produces many intermediate outputs, the first one that matches a pattern will block immediately; subsequent outputs are not checked.
Next Steps
- Policy & Governance -- How policy enforcement works
- Memory Policy -- Govern what agents store between turns
- Content Policy -- Scan inputs and outputs for sensitive patterns
- Communication Policy -- Govern output delivery channels
- Policy Categories & Templates -- All 26 categories