PII and Secret Scanning for MCP Tools

When your agent calls MCP tools, the arguments it sends and the results it receives may contain sensitive data -- social security numbers, API keys, credit card numbers, or other personally identifiable information. PII scanning intercepts these tool calls automatically, detecting sensitive patterns in both directions and letting you block, warn, or redact before data reaches an external server.

This guide walks you through enabling PII scanning, configuring per-type actions, building a custom scanner, and understanding what appears in your traces.

How PII Scanning Works

PII scanning runs at two points during every MCP tool call:

  1. Input scanning (before execution): Inspects the tool's arguments before the call is made. If a blocking-level finding is detected, the tool call is stopped entirely -- the MCP server never sees the data.
  2. Output scanning (after execution): Inspects the tool's result text after the call returns. Because the tool has already executed, output scanning warns but does not block -- you cannot un-send data that was already processed by the server.

Input vs output asymmetry

Input scanning can block a tool call because the data has not left your process yet. Output scanning can only warn because blocking would discard a result that was already computed. This asymmetry is intentional -- it gives you defense-in-depth without throwing away valid results.

Both scans truncate text to 4KB before pattern matching for performance, so sensitive data beyond the first 4KB of the scanned text is not detected. The scan results are recorded on the span as attributes so you can query for PII events across all your traces.
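
The two scan points can be pictured with a small sketch. This is not the library's implementation: `scan_text`, `guarded_call`, `ToolBlocked`, and the single SSN pattern are hypothetical stand-ins, and the 4KB limit is assumed here to mean 4096 characters.

```python
import re

MAX_SCAN_CHARS = 4096  # assumption: "4KB" treated as 4096 characters
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")  # stand-in for the full pattern set


class ToolBlocked(Exception):
    """Hypothetical stand-in for the error raised when an input scan blocks."""


def scan_text(text, action):
    """Scan only the first MAX_SCAN_CHARS characters; return a list of findings."""
    truncated = text[:MAX_SCAN_CHARS]
    return [{"type": "ssn", "action": action} for _ in SSN.finditer(truncated)]


def guarded_call(tool, arguments, ssn_action="block"):
    """Run tool(arguments) with an input scan before and an output scan after."""
    warnings = []
    # Input scan: the data has not left the process yet, so blocking is possible.
    for finding in scan_text(str(arguments), ssn_action):
        if finding["action"] == "block":
            raise ToolBlocked("input contains " + finding["type"])
        warnings.append("input: " + finding["type"])
    result = tool(arguments)
    # Output scan: the tool already ran, so findings only produce warnings.
    for finding in scan_text(str(result), ssn_action):
        warnings.append("output: " + finding["type"])
    return result, warnings
```

In this sketch an SSN in the arguments raises before `tool` is ever called, while the same SSN in the result only adds a warning, which mirrors the asymmetry described above.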

Detectable PII Types

The built-in RegexPIIScanner detects 11 types of sensitive data across two categories:

Personal Information (PII)

| Type | Pattern | Example Match |
| --- | --- | --- |
| ssn | \d{3}-\d{2}-\d{4} | 123-45-6789 |
| email | Standard email format | user@example.com |
| phone | US phone numbers with optional +1 | (555) 123-4567 |
| credit_card | 16-digit card numbers with optional separators | 4111-1111-1111-1111 |

Credentials and Secrets

| Type | Pattern | Example Match |
| --- | --- | --- |
| password | password=, pwd:, passwd= followed by value | password=s3cret |
| api_key | api_key=, api-secret: followed by value | api_key=abc123xyz |
| secret | secret_key=, access_key=, client_secret= | client_secret=xyz789 |
| aws_key | AWS access key IDs (AKIA prefix + 16 chars) | AKIAIOSFODNN7EXAMPLE |
| generic_token | Tokens with known prefixes (sk-, pk_live_, etc.) | sk-abc123def456ghi789jkl |
| github_pat | GitHub personal access tokens (ghp_ prefix) | ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
| waxell_key | Waxell API keys (wax_sk_ prefix) | wax_sk_abc123 |
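
To make the pattern descriptions concrete, here is a rough sketch of how two of the rows could be expressed as regular expressions. The expressions are approximations written from the descriptions in the tables; the patterns RegexPIIScanner actually uses may differ, and `detect` is a hypothetical helper.

```python
import re

# Approximations of two table rows; the library's real expressions may differ.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
CATEGORIES = {"ssn": "pii", "aws_key": "credential"}


def detect(text):
    """Return (type, category) pairs for every pattern match in text."""
    return [
        (pii_type, CATEGORIES[pii_type])
        for pii_type, pattern in PATTERNS.items()
        for _ in pattern.finditer(text)
    ]
```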

PII Actions

Each PII type can be assigned one of three actions. When multiple types are detected, the highest-severity action among the findings is recorded on the span:

| Action | Severity | Behavior |
| --- | --- | --- |
| warn | 0 (lowest) | Log a warning, record finding on span, continue with tool call |
| redact | 1 | Replace detected pattern with ##TYPE## placeholder (e.g., ##SSN##) |
| block | 2 (highest) | Stop the tool call entirely, raise PolicyViolationError |

The default action for all types is warn. You override actions per type using the pii_actions dictionary.
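
The severity ordering and the redaction placeholder can be illustrated with a short sketch. `highest_action` and `redact` are hypothetical helpers written for this guide, not library functions; only the severity values and the ##TYPE## format come from the table above.

```python
import re

SEVERITY = {"warn": 0, "redact": 1, "block": 2}  # values from the table above


def highest_action(actions):
    """Return the most severe action in a non-empty list of action names."""
    return max(actions, key=SEVERITY.__getitem__)


def redact(text, pii_type, pattern):
    """Replace each match of pattern with a ##TYPE## placeholder."""
    return re.sub(pattern, "##" + pii_type.upper() + "##", text)
```

For example, findings with actions warn and redact resolve to redact, and adding a single block finding raises the recorded action to block.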

Step 1: Enable PII Scanning

PII scanning is part of MCP governance. Enable it by calling configure_session() with a governance_config that includes pii_actions:

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session

waxell.init()

# After session.initialize()
configure_session(
    session,
    server_name="my-server",
    governance_config={
        "agent_name": "my-agent",
        "scan_inputs": True,   # Scan tool arguments (default: True)
        "scan_outputs": True,  # Scan tool results (default: True)
        "pii_actions": {
            "ssn": "block",          # Block calls containing SSNs
            "credit_card": "block",  # Block calls containing credit cards
            "email": "warn",         # Warn on email addresses
            "api_key": "block",      # Block calls containing API keys
            "aws_key": "block",      # Block AWS access keys
        },
    },
)

With this configuration:

  • A tool call with {"query": "SSN 123-45-6789"} in the arguments is blocked before it reaches the server.
  • A tool call with {"query": "contact user@example.com"} proceeds but logs a warning and records the finding on the span.
  • A tool result containing an AWS key triggers a warning in the output scan (output scans never block).

Minimal setup

If you want all-warn scanning (detect but never block), pass an empty pii_actions:

governance_config={
    "agent_name": "my-agent",
    "pii_actions": {},  # All types default to "warn"
}
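
The fallback to warn can be thought of as a dictionary lookup with a default. The sketch below is an illustration only: `effective_actions` and `BUILT_IN_TYPES` are hypothetical names, and the type list is copied from the tables earlier in this guide.

```python
BUILT_IN_TYPES = [
    "ssn", "email", "phone", "credit_card",
    "password", "api_key", "secret", "aws_key",
    "generic_token", "github_pat", "waxell_key",
]
DEFAULT_ACTION = "warn"  # per this guide, unlisted types default to warn


def effective_actions(pii_actions):
    """Return the action for every built-in type, falling back to the default."""
    return {t: pii_actions.get(t, DEFAULT_ACTION) for t in BUILT_IN_TYPES}
```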

Step 2: Use a Custom PII Scanner

The built-in RegexPIIScanner covers common patterns. For more advanced detection -- ML-based NER, domain-specific patterns, or integration with tools like Microsoft Presidio -- you can provide your own scanner.

A custom scanner must implement the PIIScanner protocol:

from waxell_observe.scanning import PIIScanner

class MyScanner:
    """Custom PII scanner using Presidio."""

    def scan(self, text: str) -> dict:
        """Scan text and return findings.

        Must return:
            {
                "detected": bool,
                "count": int,
                "findings": [
                    {"type": "ssn", "category": "pii", "action": "block"},
                    {"type": "email", "category": "pii", "action": "warn"},
                ]
            }
        """
        # Your detection logic here. _run_presidio is a placeholder for a
        # method you implement; it must return a list of finding dicts.
        findings = self._run_presidio(text)
        return {
            "detected": bool(findings),
            "count": len(findings),
            "findings": findings,
        }

Each finding dict must have:

  • type (str): The PII type name (e.g., "ssn", "email", "custom_id")
  • category (str): "pii" or "credential"
  • action (str): "warn", "block", or "redact"
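
While developing a custom scanner, it can help to check its return value against the shape described above. The following validator is a hypothetical test helper, not part of the library; the rule that detected and count must agree with the findings list is an assumption drawn from the examples in this guide.

```python
VALID_CATEGORIES = {"pii", "credential"}
VALID_ACTIONS = {"warn", "block", "redact"}


def validate_scan_result(result):
    """Return a list of problems with a scan() result; empty means well-formed."""
    findings = result.get("findings")
    if not isinstance(findings, list):
        return ["findings must be a list"]
    problems = []
    # Assumption: detected and count are expected to agree with findings.
    if result.get("detected") is not bool(findings):
        problems.append("detected must equal bool(findings)")
    if result.get("count") != len(findings):
        problems.append("count must equal len(findings)")
    for i, finding in enumerate(findings):
        if not isinstance(finding.get("type"), str):
            problems.append(f"findings[{i}].type must be a str")
        if finding.get("category") not in VALID_CATEGORIES:
            problems.append(f"findings[{i}].category is invalid")
        if finding.get("action") not in VALID_ACTIONS:
            problems.append(f"findings[{i}].action is invalid")
    return problems
```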

Pass your custom scanner in the governance config:

configure_session(
    session,
    server_name="my-server",
    governance_config={
        "agent_name": "my-agent",
        "pii_scanner": MyScanner(),  # Your custom scanner
    },
)

When pii_scanner is provided, it replaces the default RegexPIIScanner entirely. If you want to combine them, call both inside your custom scanner's scan() method:

from waxell_observe.scanning import RegexPIIScanner

class CombinedScanner:
    def __init__(self):
        self._regex = RegexPIIScanner(actions={"ssn": "block"})

    def scan(self, text: str) -> dict:
        # Run built-in regex scanner
        regex_result = self._regex.scan(text)
        # Run your own custom logic. _detect_custom_patterns is a placeholder
        # for a method you implement; it must return a list of finding dicts.
        custom_findings = self._detect_custom_patterns(text)
        # Merge findings
        all_findings = regex_result["findings"] + custom_findings
        return {
            "detected": bool(all_findings),
            "count": len(all_findings),
            "findings": all_findings,
        }

Runtime checkable protocol

PIIScanner is a runtime_checkable Protocol. You do not need to inherit from it -- any class with a scan(text: str) -> dict method works.
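
Structural typing with a runtime-checkable protocol can be seen in isolation with the standard library. `ScannerLike` below is a local stand-in with the same shape as the protocol described here, defined only so the example runs without the library installed.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ScannerLike(Protocol):
    """Local stand-in shaped like the PIIScanner protocol described above."""

    def scan(self, text: str) -> dict: ...


class NoopScanner:
    """No inheritance needed: having a scan() method is enough."""

    def scan(self, text: str) -> dict:
        return {"detected": False, "count": 0, "findings": []}


class NotAScanner:
    """Lacks scan(), so it does not satisfy the protocol."""

    def check(self, text: str) -> dict:
        return {}


print(isinstance(NoopScanner(), ScannerLike))
print(isinstance(NotAScanner(), ScannerLike))
```

Note that isinstance() against a runtime-checkable protocol only verifies that the method exists. It does not check the signature or the shape of the returned dict, so a scanner can pass this check and still return malformed results.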

What Appears in Traces

When PII scanning runs, the following attributes are recorded on the MCP tool span:

| Span Attribute | Type | Description |
| --- | --- | --- |
| waxell.mcp.pii_detected | bool | True if any PII or credentials were found |
| waxell.mcp.pii_count | int | Total number of findings across input and output scans |
| waxell.mcp.pii_scan_direction | string | Which scan found PII: "inputs", "outputs", or "both" |
| waxell.mcp.pii_action_taken | string | Highest-severity action across all findings: "warn", "block", or "redact" |
| waxell.mcp.governance_checked | bool | True when governance ran (always true when PII scanning is active) |
| waxell.mcp.governance_timestamp | string | ISO 8601 timestamp of the governance check |

PII values are never on spans

The actual PII values (SSN digits, email addresses, etc.) are intentionally not recorded on spans. Only summary attributes (detected/count/action) are stored. This prevents your tracing backend from becoming a PII liability.
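
One way to picture the summary-only recording is a function that reduces findings to the four PII attributes without ever touching the matched text. `summarize_for_span` is a hypothetical sketch, not the instrumentor's code; how the library derives the recorded action for output-only findings is not specified here, so this version simply takes the most severe action present.

```python
SEVERITY = {"warn": 0, "redact": 1, "block": 2}


def summarize_for_span(input_findings, output_findings):
    """Build summary attributes from findings; matched values are never copied."""
    all_findings = input_findings + output_findings
    if not all_findings:
        return {}  # no PII found, so no PII attributes are set
    if input_findings and output_findings:
        direction = "both"
    elif input_findings:
        direction = "inputs"
    else:
        direction = "outputs"
    # Assumption: the recorded action is the most severe one across findings.
    worst = max((f["action"] for f in all_findings), key=SEVERITY.__getitem__)
    return {
        "waxell.mcp.pii_detected": True,
        "waxell.mcp.pii_count": len(all_findings),
        "waxell.mcp.pii_scan_direction": direction,
        "waxell.mcp.pii_action_taken": worst,
    }
```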

You can query for PII events in Grafana using TraceQL:

{waxell.mcp.pii_detected = true}

Or filter by action:

{waxell.mcp.pii_action_taken = "block"}

Full Example

Complete runnable example with PII scanning

"""MCP tool call with PII scanning enabled.

Demonstrates:
- Blocking SSNs and credit cards in tool inputs
- Warning on email addresses
- Output scanning (warn-only)
"""
import asyncio

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session
from waxell_observe.scanning import RegexPIIScanner

waxell.init()

# Import MCP after init() so the instrumentor patches it
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main():
    server_params = StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Configure PII scanning
            configure_session(
                session,
                server_name="filesystem",
                governance_config={
                    "agent_name": "file-assistant",
                    "scan_inputs": True,
                    "scan_outputs": True,
                    "pii_actions": {
                        "ssn": "block",
                        "credit_card": "block",
                        "email": "warn",
                        "api_key": "block",
                        "aws_key": "block",
                        "github_pat": "block",
                    },
                },
            )

            # This call succeeds -- no PII in arguments
            result = await session.call_tool(
                name="read_file",
                arguments={"path": "/tmp/readme.txt"},
            )
            print("Safe call succeeded:", result.content[0].text[:100])

            # This call is BLOCKED -- SSN in arguments
            try:
                result = await session.call_tool(
                    name="write_file",
                    arguments={
                        "path": "/tmp/output.txt",
                        "content": "User SSN: 123-45-6789",
                    },
                )
            except Exception as e:
                print(f"Blocked as expected: {e}")


if __name__ == "__main__":
    asyncio.run(main())

Troubleshooting

PII scanning is not blocking anything

  1. Check that governance_config is set. PII scanning only runs when governance_config is passed to configure_session(). Without it, no scanning occurs.
  2. Check the action level. The default action is "warn", not "block". You must explicitly set "block" for types you want to stop:
    "pii_actions": {"ssn": "block", "credit_card": "block"}
  3. Check that scan_inputs is not disabled. It defaults to True, but if you set "scan_inputs": False, input scanning is skipped.

Output scanning detected PII but didn't block

This is expected behavior. Output scanning runs after the tool has already executed, so it cannot block. It records a warning on the span and logs the finding. To prevent sensitive data from being returned, block the input patterns that would cause the server to fetch sensitive data.

Custom scanner is not being used

Make sure you pass the scanner instance in governance_config["pii_scanner"], not pii_actions:

# Correct
governance_config={"pii_scanner": MyScanner()}

# Wrong -- this creates a RegexPIIScanner with these actions
governance_config={"pii_actions": {"ssn": "block"}}

When pii_scanner is provided, pii_actions is ignored. The custom scanner is responsible for its own action mapping.
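
Because the custom scanner owns its action mapping, one approach is to accept the mapping in the constructor. The class below is a sketch under stated assumptions: the employee_id type, the EMP-123456 format, and the constructor parameter are all invented for illustration.

```python
import re


class EmployeeIdScanner:
    """Hypothetical scanner for IDs like EMP-123456 with its own action map."""

    def __init__(self, actions=None):
        # pii_actions is ignored for custom scanners, so the mapping lives here.
        self._actions = {"employee_id": "warn", **(actions or {})}
        self._pattern = re.compile(r"\bEMP-\d{6}\b")

    def scan(self, text: str) -> dict:
        findings = [
            {
                "type": "employee_id",
                "category": "pii",
                "action": self._actions["employee_id"],
            }
            for _ in self._pattern.finditer(text)
        ]
        return {
            "detected": bool(findings),
            "count": len(findings),
            "findings": findings,
        }
```

An instance such as EmployeeIdScanner({"employee_id": "block"}) would then be passed as governance_config["pii_scanner"].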

Scanner errors are silently ignored

PII scanning is fail-open by design. If your custom scanner raises an exception, the tool call proceeds as if no PII was detected, and a warning is logged:

MCP PII scan failed: <error> -- treating as no PII (fail-open)

Check your application logs for these warnings. Fix the scanner bug, then redeploy.
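
Fail-open behavior amounts to wrapping the scanner call in a broad exception handler. The wrapper below is a sketch of that idea rather than the library's code; `scan_fail_open` and the logger name are hypothetical, and the log message is modeled on the warning shown above.

```python
import logging

logger = logging.getLogger("pii_scan_sketch")  # hypothetical logger name


def scan_fail_open(scanner, text):
    """Call scanner.scan(text); on any exception, log and report no PII."""
    try:
        return scanner.scan(text)
    except Exception as exc:
        logger.warning(
            "MCP PII scan failed: %s -- treating as no PII (fail-open)", exc
        )
        return {"detected": False, "count": 0, "findings": []}
```

The trade-off is that a buggy scanner silently lets data through, which is why the log warning is the only signal that scanning was skipped.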

TraceQL queries return no PII results

PII attributes are only set when PII is actually detected. If waxell.mcp.pii_detected is absent from a span, it means no PII was found (not that scanning was skipped). To verify scanning is active, check for waxell.mcp.governance_checked = true on the span.