
Output Egress Format Policy

The output-egress-format policy enforces OWASP LLM05 (improper output handling, exfiltration sub-control). It scans the agent's final output for shapes that suggest the agent is smuggling data out of the system in disguise:

  • base64 blobs — large encoded chunks in plain output (most legitimate output is plain text).
  • External URLs — references to hosts outside the tenant's allowlist (catches the "embed a webhook in the response" pattern).
  • Data URIs — inline data:image/... or data:application/... payloads that could carry encoded sensitive data.
  • Unicode obfuscation — zero-width spaces, RTL overrides, and Cyrillic homoglyphs above a density threshold.

Detection is pure regex and set lookup, so a scan typically takes under a millisecond per output. This makes it a Tier 1 (cheap) policy.

Rules

| Rule | Type | Default | Description |
|---|---|---|---|
| block_base64 | boolean | true | Flag base64-shaped blobs |
| min_base64_length | integer | 200 | Minimum chars before a base64 match counts |
| block_external_urls | boolean | false | Enforce the URL allowlist (opt-in) |
| allowed_url_domains | string[] | [] | Hostnames or suffix matches (e.g., example.com matches api.example.com) |
| block_data_uri | boolean | true | Flag inline data:...;base64, URIs |
| block_unicode_obfuscation | boolean | false | Flag zero-width chars and homoglyph density |
| max_homoglyph_pct | number | 0.05 | Cyrillic-homoglyph density threshold (0–1) |
| action_on_violation | string | "block" | "block" or "warn" |

Optional: scan_mid_execution (boolean, default false) — enables a mid-run check on response_preview for streaming surfaces.
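
As a rough illustration of the rule set, the following sketch mirrors the documented names and defaults in a Python dataclass and validates a policy dict against them. The EgressRules class and load_rules function are hypothetical and not part of the SDK; rejecting unknown keys and lowercasing domains are assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class EgressRules:
    """Hypothetical container mirroring the documented rule names and defaults."""
    block_base64: bool = True
    min_base64_length: int = 200
    block_external_urls: bool = False
    allowed_url_domains: tuple[str, ...] = ()
    block_data_uri: bool = True
    block_unicode_obfuscation: bool = False
    max_homoglyph_pct: float = 0.05
    action_on_violation: str = "block"
    scan_mid_execution: bool = False


def load_rules(raw: dict[str, Any]) -> EgressRules:
    """Build EgressRules from a policy dict; omitted keys keep their defaults."""
    known = {f.name for f in fields(EgressRules)}
    unknown = set(raw) - known
    if unknown:
        # Assumption: a typo in a rule name should fail loudly, not be ignored.
        raise ValueError(f"unknown rule(s): {sorted(unknown)}")
    values = dict(raw)
    if "allowed_url_domains" in values:
        # Assumption: hostnames are compared case-insensitively.
        values["allowed_url_domains"] = tuple(d.lower() for d in values["allowed_url_domains"])
    rules = EgressRules(**values)
    if rules.min_base64_length < 1:
        raise ValueError("min_base64_length must be positive")
    if not 0.0 <= rules.max_homoglyph_pct <= 1.0:
        raise ValueError("max_homoglyph_pct must be within 0-1")
    if rules.action_on_violation not in ("block", "warn"):
        raise ValueError('action_on_violation must be "block" or "warn"')
    return rules
```

Calling load_rules({"block_external_urls": True, "allowed_url_domains": ["acme.com"]}) leaves every other rule at its documented default.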

How It Works

The handler runs at after_workflow (primary — needs the complete output). before_workflow is a no-op. mid_execution runs only when scan_mid_execution=true.

| Phase | What it checks | Source |
|---|---|---|
| before_workflow | (no-op) | |
| mid_execution (opt-in) | Latest response_preview against all 4 detectors | context.response_preview |
| after_workflow | Final output against all 4 detectors | context.output_text, then response_preview, then result (if str) |
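
The after_workflow fallback chain can be sketched as a small resolver. This is a minimal illustration, not the handler's actual code: resolve_scan_text is a hypothetical name, and treating an empty string as "missing" is an assumption.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_scan_text(context: Any, result: Any = None) -> Optional[str]:
    """Pick the after_workflow scan target: output_text, then response_preview, then result."""
    output_text = getattr(context, "output_text", None)
    if isinstance(output_text, str) and output_text:
        return output_text
    preview = getattr(context, "response_preview", None)
    if isinstance(preview, str) and preview:
        return preview
    # Last fallback: only a plain string result is scannable.
    if isinstance(result, str):
        return result
    return None


# Stand-in context object for demonstration only.
ctx = SimpleNamespace(output_text=None, response_preview="partial answer")
print(resolve_scan_text(ctx, result="raw"))  # partial answer
```

When neither attribute is populated and the result is not a string, the sketch returns None, meaning there is nothing to scan.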

Scan Order (short-circuits on first hit)

  1. Data URI — fastest regex.
  2. Base64 blob ≥ min_base64_length.
  3. External URL not on the allowlist.
  4. Zero-width chars, then homoglyph density.
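
The scan order above can be sketched as a single function that returns on the first hit. Everything here is an assumption made for illustration: the regexes, the character sets, the density formula (homoglyphs divided by alphabetic characters), and the signal names other than base64_blob and external_url, which appear in the Observability examples. For brevity the sketch always runs the first two detectors instead of honoring block_data_uri and block_base64.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

DATA_URI_RE = re.compile(r"data:[\w.+-]+/[\w.+-]+;base64,", re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)
# Zero-width characters plus the right-to-left override (illustrative subset).
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff", "\u202e"}
# Cyrillic a, e, o, p, c, x, y lookalikes (illustrative subset).
CYRILLIC_HOMOGLYPHS = set("\u0430\u0435\u043e\u0440\u0441\u0445\u0443")


def url_host(url: str) -> str:
    """Lowercased hostname, or '' when the URL cannot be parsed."""
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""


def host_allowed(host: str, allowed: list[str]) -> bool:
    """Suffix match: example.com allows example.com and api.example.com."""
    return any(host == d or host.endswith("." + d) for d in allowed)


def scan(text: str, *, min_base64_length: int = 200,
         check_urls: bool = False, allowed: Optional[list[str]] = None,
         check_unicode: bool = False, max_homoglyph_pct: float = 0.05) -> Optional[str]:
    """Return the first matching signal name, or None when nothing matches."""
    # 1. Data URI.
    if DATA_URI_RE.search(text):
        return "data_uri"
    # 2. Base64-shaped run of at least min_base64_length characters.
    if re.search(r"[A-Za-z0-9+/]{%d,}" % min_base64_length, text):
        return "base64_blob"
    # 3. URL whose host is not covered by the allowlist.
    if check_urls:
        for url in URL_RE.findall(text):
            if not host_allowed(url_host(url), allowed or []):
                return "external_url"
    # 4. Zero-width characters first, then homoglyph density.
    if check_unicode:
        if any(ch in INVISIBLE for ch in text):
            return "zero_width"
        letters = [ch for ch in text if ch.isalpha()]
        if letters:
            pct = sum(ch in CYRILLIC_HOMOGLYPHS for ch in letters) / len(letters)
            if pct > max_homoglyph_pct:
                return "homoglyph_density"
    return None
```

Because the function returns at the first match, an output containing both a data URI and a non-allowlisted URL reports only the data URI.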

Context Attributes Read

| Attribute | Phase | Purpose |
|---|---|---|
| context.output_text | after | Primary scan target |
| context.response_preview | mid, after (fallback) | Streaming or per-step preview |
| result (parameter) | after (last fallback) | When the result is a plain string |

Example Policy

Customer-support agent — outputs are plain prose; aggressively block exfil shapes and limit URLs to the company domain and a known partner:

{
  "block_base64": true,
  "min_base64_length": 200,
  "block_external_urls": true,
  "allowed_url_domains": ["acme.com", "trusted-partner.com"],
  "block_data_uri": true,
  "block_unicode_obfuscation": true,
  "max_homoglyph_pct": 0.02,
  "action_on_violation": "block"
}

SDK Integration

import waxell_observe as waxell

waxell.init()

@waxell.observe(agent_name="support-agent", enforce_policy=True)
async def answer(query: str) -> str:
    # after_workflow: scans the returned string for egress shapes.
    # If a base64 blob or non-allowlisted URL appears, the policy blocks.
    # generate_response is your application's own async function.
    return await generate_response(query)

Observability

| Field | Example |
|---|---|
| Category | output-egress-format |
| Action | block |
| Reason | "Output contains a base64-shaped blob (412 chars). Possible exfiltration." |
| Metadata | {"phase": "after", "signal": "base64_blob", "length": 412, "owasp": "LLM05"} |

External URL violation:

| Field | Example |
|---|---|
| Reason | "Output references external URL host 'evil-webhook.io' not on the allowlist." |
| Metadata | {"signal": "external_url", "host": "evil-webhook.io", "owasp": "LLM05"} |

Common Gotchas

  • The base64 regex matches any run of [A-Za-z0-9+/] that is at least min_base64_length characters long (200 by default). It false-positives on long hex strings, JWTs, and some content IDs. Raise min_base64_length or set block_base64=false for agents that legitimately emit tokens.
  • The URL allowlist uses suffix matching by default: the entry example.com matches both example.com and api.example.com. To allow only the apex, list the apex and enforce an exact host comparison outside the policy.
  • block_external_urls defaults to false, so URL scanning is disabled until you set it to true. Populate allowed_url_domains before flipping the flag.
  • An empty allowed_url_domains with block_external_urls=true blocks ALL URLs, including your own domain (e.g., https://acme.com) if you forgot to add it.
  • Homoglyph detector only catches Cyrillic→Latin lookalikes. Greek (e.g., omicron) and other scripts are not scanned.
  • scan_mid_execution is off by default. Streaming surfaces should enable it; non-streaming agents save CPU by leaving it off.
  • The handler reads context.output_text first, then falls back to response_preview, and finally to result when it is a plain string. Make sure your runtime populates context.output_text for the most accurate scan.
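
The base64 false-positive gotcha is easy to reproduce. The sketch below assumes the detector is equivalent to a plain character-class run, as described in the first bullet; base64_shaped is a hypothetical helper, not the policy's real function.

```python
import hashlib
import re


def base64_shaped(text: str, min_length: int = 200) -> bool:
    """True when text contains a run of base64-alphabet characters at least min_length long."""
    return re.search(r"[A-Za-z0-9+/]{%d,}" % min_length, text) is not None


# Two concatenated SHA-512 hex digests: 256 hex characters, no encoded payload.
hex_run = hashlib.sha512(b"a").hexdigest() + hashlib.sha512(b"b").hexdigest()

print(len(hex_run))                            # 256
print(base64_shaped(hex_run))                  # True: hex digits are a subset of the base64 alphabet
print(base64_shaped(hex_run, min_length=300))  # False: the threshold now exceeds the run length
```

Raising the threshold trades false positives for missed short blobs, so pick a value above the longest token your agent legitimately emits.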
