Output Egress Format Policy
The output-egress-format policy enforces OWASP LLM05 (improper output handling, exfiltration sub-control). It scans the agent's final output for shapes that suggest the agent is smuggling data out of the system in disguise:
- base64 blobs: large encoded chunks in plain output (most legitimate output is plain text).
- External URLs: references to hosts outside the tenant's allowlist (catches the "embed a webhook in the response" pattern).
- Data URIs: inline `data:image/...` or `data:application/...` payloads that could carry encoded sensitive data.
- Unicode obfuscation: zero-width spaces, RTL overrides, and Cyrillic homoglyphs above a density threshold.
Detection is pure regex and set lookup and runs in under a millisecond per output, which places the policy in Tier 1 (cheap).
Rules
| Rule | Type | Default | Description |
|---|---|---|---|
| `block_base64` | boolean | `true` | Flag base64-shaped blobs |
| `min_base64_length` | integer | `200` | Minimum chars before a base64 match counts |
| `block_external_urls` | boolean | `false` | Enforce the URL allowlist (opt-in) |
| `allowed_url_domains` | string[] | `[]` | Hostnames or suffix matches (e.g., `example.com` matches `api.example.com`) |
| `block_data_uri` | boolean | `true` | Flag inline `data:...;base64,` URIs |
| `block_unicode_obfuscation` | boolean | `false` | Flag zero-width chars and homoglyph density |
| `max_homoglyph_pct` | number | `0.05` | Cyrillic-homoglyph density threshold (0–1) |
| `action_on_violation` | string | `"block"` | `"block"` or `"warn"` |
Optional: `scan_mid_execution` (boolean, default `false`) enables a mid-run check on `response_preview` for streaming surfaces.
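The suffix matching described for `allowed_url_domains` can be sketched as follows. This is a minimal illustration, not the handler's actual code: the function name `host_is_allowed` is hypothetical, and matching on a label boundary (so that `notexample.com` does not match `example.com`) is an assumption this document does not confirm.

```python
from urllib.parse import urlsplit


def host_is_allowed(url: str, allowed_url_domains: list[str]) -> bool:
    """Return True if the URL's host equals an allowlist entry or is a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False
    for entry in allowed_url_domains:
        entry = entry.lower().strip(".")
        # Exact match, or suffix match on a label boundary (assumption).
        if host == entry or host.endswith("." + entry):
            return True
    return False


allowlist = ["example.com"]
print(host_is_allowed("https://api.example.com/v1", allowlist))    # True
print(host_is_allowed("https://example.com", allowlist))           # True
print(host_is_allowed("https://notexample.com", allowlist))        # False
print(host_is_allowed("https://evil-webhook.io/hook", allowlist))  # False
print(host_is_allowed("https://example.com", []))                  # False
```

The last call shows why an empty allowlist rejects every URL once `block_external_urls` is enabled.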
How It Works
The handler runs primarily at `after_workflow`, because it needs the complete output. `before_workflow` is a no-op, and `mid_execution` runs only when `scan_mid_execution=true`.
| Phase | What it checks | Source |
|---|---|---|
| `before_workflow` | (no-op) | n/a |
| `mid_execution` (opt-in) | Latest `response_preview` against all 4 detectors | `context.response_preview` |
| `after_workflow` | Final output against all 4 detectors | `context.output_text` → `response_preview` → `result` (if `str`) |
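The `after_workflow` fallback chain in the table can be pictured with a short sketch. The helper name `resolve_scan_text` is hypothetical, the context is modeled as a plain object with optional attributes, and treating an empty string as "missing" is an assumption rather than documented behavior.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_scan_text(context: Any, result: Any = None) -> Optional[str]:
    """Pick the after_workflow scan target: output_text, then response_preview, then result."""
    for attr in ("output_text", "response_preview"):
        value = getattr(context, attr, None)
        if isinstance(value, str) and value:
            return value
    # Last fallback: the workflow result, only when it is a plain string.
    if isinstance(result, str) and result:
        return result
    return None


ctx = SimpleNamespace(output_text=None, response_preview="partial answer")
print(resolve_scan_text(ctx, result="final answer"))                 # partial answer
print(resolve_scan_text(SimpleNamespace(), result="final answer"))   # final answer
print(resolve_scan_text(SimpleNamespace(), result={"answer": "x"}))  # None
```

When nothing resolves to a string, there is nothing to scan, which is one reason to populate `context.output_text` explicitly.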
Scan Order (short-circuits on first hit)
- Data URI: fastest regex.
- Base64 blob ≥ `min_base64_length`.
- External URL not on the allowlist.
- Zero-width chars, then homoglyph density.
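The short-circuit order above could look roughly like the sketch below. Everything here is illustrative: the regexes, the partial Cyrillic lookalike set, the signal names other than `base64_blob` and `external_url`, and the choice to compute homoglyph density over alphabetic characters are all assumptions, since the handler's real patterns are not shown in this document.

```python
import re
from typing import Optional

# Illustrative patterns only; the real handler's regexes may differ.
DATA_URI_RE = re.compile(r"data:[\w.+-]+/[\w.+-]+;base64,", re.IGNORECASE)
HOST_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)
ZERO_WIDTH_RE = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")
# Partial, hypothetical set of Cyrillic letters that resemble Latin a, e, o, p, c, y, x.
CYRILLIC_LOOKALIKES = set("\u0430\u0435\u043e\u0440\u0441\u0443\u0445")


def scan(text: str, *, block_data_uri: bool = True, block_base64: bool = True,
         min_base64_length: int = 200, block_external_urls: bool = False,
         allowed_url_domains=(), block_unicode_obfuscation: bool = False,
         max_homoglyph_pct: float = 0.05) -> Optional[str]:
    """Return the first signal that fires, or None if the text looks clean."""
    # 1. Data URI: cheapest check, so it runs first.
    if block_data_uri and DATA_URI_RE.search(text):
        return "data_uri"
    # 2. Base64-shaped run of at least min_base64_length characters.
    if block_base64 and re.search(r"[A-Za-z0-9+/]{%d,}" % min_base64_length, text):
        return "base64_blob"
    # 3. Any URL whose host is not on the allowlist (suffix match).
    if block_external_urls:
        for raw_host in HOST_RE.findall(text):
            host = raw_host.lower().rstrip(".")
            if not any(host == d or host.endswith("." + d) for d in allowed_url_domains):
                return "external_url"
    # 4. Zero-width characters, then Cyrillic homoglyph density.
    if block_unicode_obfuscation:
        if ZERO_WIDTH_RE.search(text):
            return "zero_width"
        letters = [c for c in text if c.isalpha()]
        if letters:
            pct = sum(c in CYRILLIC_LOOKALIKES for c in letters) / len(letters)
            if pct > max_homoglyph_pct:
                return "homoglyph_density"
    return None


print(scan("data:image/png;base64," + "A" * 300))  # data_uri
print(scan("A" * 250))                             # base64_blob
print(scan("Thanks, your ticket is resolved."))    # None
```

The first call contains both a data URI and a long base64 run, and only the data URI is reported because the scan stops at the first hit.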
Context Attributes Read
| Attribute | Phase | Purpose |
|---|---|---|
| `context.output_text` | after | Primary scan target |
| `context.response_preview` | mid, after (fallback) | Streaming or per-step preview |
| `result` (parameter) | after (last fallback) | When the result is a plain string |
Example Policy
Customer-support agent — outputs are plain prose; aggressively block exfil shapes and limit URLs to the company domain and a known partner:
```json
{
  "block_base64": true,
  "min_base64_length": 200,
  "block_external_urls": true,
  "allowed_url_domains": ["acme.com", "trusted-partner.com"],
  "block_data_uri": true,
  "block_unicode_obfuscation": true,
  "max_homoglyph_pct": 0.02,
  "action_on_violation": "block"
}
```
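To see how a partial policy relates to the documented defaults, the sketch below merges overrides onto the values from the Rules table and sanity-checks them. The `DEFAULTS` dict and `load_policy` helper are hypothetical; they are not part of the SDK, and the validation rules are assumptions drawn from the table's types and ranges.

```python
# Defaults copied from the Rules table; the helper itself is illustrative only.
DEFAULTS = {
    "block_base64": True,
    "min_base64_length": 200,
    "block_external_urls": False,
    "allowed_url_domains": [],
    "block_data_uri": True,
    "block_unicode_obfuscation": False,
    "max_homoglyph_pct": 0.05,
    "action_on_violation": "block",
    "scan_mid_execution": False,
}


def load_policy(overrides: dict) -> dict:
    """Merge overrides onto the documented defaults and sanity-check the values."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown rule(s): {sorted(unknown)}")
    policy = {**DEFAULTS, **overrides}
    # Copy the list so callers never share the default list object.
    policy["allowed_url_domains"] = list(policy["allowed_url_domains"])
    if policy["action_on_violation"] not in ("block", "warn"):
        raise ValueError("action_on_violation must be 'block' or 'warn'")
    if not 0 <= policy["max_homoglyph_pct"] <= 1:
        raise ValueError("max_homoglyph_pct must be between 0 and 1")
    if policy["min_base64_length"] < 1:
        raise ValueError("min_base64_length must be positive")
    return policy


policy = load_policy({
    "block_external_urls": True,
    "allowed_url_domains": ["acme.com", "trusted-partner.com"],
    "max_homoglyph_pct": 0.02,
})
print(policy["min_base64_length"])    # 200
print(policy["action_on_violation"])  # block
```

Rules left out of the overrides keep their table defaults, so the customer-support policy above only needs to state what it changes.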
SDK Integration
```python
import waxell_observe as waxell

waxell.init()


@waxell.observe(agent_name="support-agent", enforce_policy=True)
async def answer(query: str) -> str:
    # after_workflow: scans the returned string for egress shapes.
    # If a base64 blob or non-allowlisted URL appears, the policy blocks.
    return await generate_response(query)
```
Observability
| Field | Example |
|---|---|
| Category | output-egress-format |
| Action | block |
| Reason | "Output contains a base64-shaped blob (412 chars). Possible exfiltration." |
| Metadata | {"phase": "after", "signal": "base64_blob", "length": 412, "owasp": "LLM05"} |
External URL violation:
| Field | Example |
|---|---|
| Reason | "Output references external URL host 'evil-webhook.io' not on the allowlist." |
| Metadata | {"signal": "external_url", "host": "evil-webhook.io", "owasp": "LLM05"} |
Common Gotchas
- The base64 regex matches any run of `[A-Za-z0-9+/]` that is at least `min_base64_length` characters long (200 by default). This will false-positive on long hex hashes, JWT tokens, and some content IDs. Raise `min_base64_length` or set `block_base64=false` for agents that legitimately emit tokens.
- The URL allowlist uses suffix match by default. The entry `example.com` matches `example.com` AND `api.example.com`. To match only the apex, list it explicitly without subdomains and use exact comparison externally.
- `block_external_urls=false` is the default. Until you populate `allowed_url_domains` and flip the flag, URL scanning is disabled.
- Empty `allowed_url_domains` with `block_external_urls=true` blocks ALL URLs. That includes `https://acme.com` if you forgot to add it.
- The homoglyph detector only catches Cyrillic→Latin lookalikes. Greek (e.g., `omicron`) and other scripts are not scanned.
- `scan_mid_execution` is off by default. Streaming surfaces should enable it; non-streaming agents save CPU by leaving it off.
- The handler reads `output_text` first, then falls back. Make sure your runtime populates `context.output_text` for the most accurate scan.
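The base64 false-positive gotcha is easy to reproduce. The sketch below assumes the detector is a plain character-class regex as described above; `has_base64_shape` is a hypothetical stand-in, not the handler's function.

```python
import base64
import hashlib
import re


def has_base64_shape(text: str, min_base64_length: int = 200) -> bool:
    """True if text contains a long enough run of base64-alphabet characters."""
    return re.search(r"[A-Za-z0-9+/]{%d,}" % min_base64_length, text) is not None


digest = hashlib.sha512(b"report-1").hexdigest()             # 128 hex chars
long_hex = digest + hashlib.sha512(b"report-2").hexdigest()  # 256 hex chars, not base64
real_blob = base64.b64encode(bytes(range(256))).decode()     # genuine base64

print(has_base64_shape(digest))                           # False: shorter than 200
print(has_base64_shape(long_hex))                         # True: a false positive
print(has_base64_shape(long_hex, min_base64_length=300))  # False after raising the threshold
print(has_base64_shape(real_blob))                        # True
```

Hex digits are a subset of the base64 alphabet, so the regex cannot tell a long hex string from an encoded payload; raising the threshold trades false positives against missing shorter blobs.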
Next Steps
- Policy Categories — All 49 categories
- Content Policy — Configurable input/output content scanning (block/warn/redact)
- Network Policy — Outbound HTTP allowlists at the tool layer
- Prompt Injection Guard — Companion control on the input side (LLM01)