PII and Secret Scanning for MCP Tools

When your agent calls MCP tools, the arguments it sends and the results it receives may contain sensitive data -- Social Security numbers, API keys, credit card numbers, or other personally identifiable information. PII scanning intercepts these tool calls automatically and detects sensitive patterns in both directions. On inputs it lets you block, warn, or redact before data reaches an external server; on outputs it warns.

This guide walks you through enabling PII scanning, configuring per-type actions, building a custom scanner, and understanding what appears in your traces.

How PII Scanning Works

PII scanning runs at two points during every MCP tool call:

Input scanning (before execution): Inspects the tool's arguments before the call is made. If a blocking-level finding is detected, the tool call is stopped entirely -- the MCP server never sees the data.
Output scanning (after execution): Inspects the tool's result text after the call returns. Because the tool has already executed, output scanning warns but does not block -- you cannot un-send data that was already processed by the server.

Input vs output asymmetry

Input scanning can block a tool call because the data has not left your process yet. Output scanning can only warn because blocking would discard a result that was already computed. This asymmetry is intentional -- it gives you defense-in-depth without throwing away valid results.

For performance, both scans truncate text to 4KB before pattern matching, so sensitive data that appears after the first 4KB of an argument or result is not detected. The scan results are recorded on the span as attributes so you can query for PII events across all your traces.
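
The flow described above can be sketched in plain Python. This is an illustration of the ordering only, not the library's implementation: `governed_call`, `truncate_for_scan`, `toy_scan`, and `BlockedCall` are hypothetical names, and redaction is left out to keep the sketch short.

```python
from typing import Callable

MAX_SCAN_BYTES = 4096  # the 4KB scan limit described above


class BlockedCall(Exception):
    """Hypothetical stand-in for the library's PolicyViolationError."""


def truncate_for_scan(text: str, limit: int = MAX_SCAN_BYTES) -> str:
    """Keep at most the first `limit` bytes of the UTF-8 encoded text."""
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def toy_scan(text: str) -> dict:
    """Toy scanner: treats the literal marker 'SSN' as a blocking finding."""
    findings = []
    if "SSN" in text:
        findings.append({"type": "ssn", "category": "pii", "action": "block"})
    return {"detected": bool(findings), "count": len(findings), "findings": findings}


def governed_call(tool: Callable[[str], str], arguments: str,
                  scan: Callable[[str], dict] = toy_scan):
    """Run `tool` with an input scan before it and an output scan after it."""
    warnings = []

    # Input scan: the tool has not run yet, so a blocking finding stops the call.
    for finding in scan(truncate_for_scan(arguments))["findings"]:
        if finding["action"] == "block":
            raise BlockedCall(f"blocked input: {finding['type']}")
        warnings.append(f"input:{finding['type']}")

    result = tool(arguments)

    # Output scan: the tool already ran, so every finding becomes a warning.
    for finding in scan(truncate_for_scan(result))["findings"]:
        warnings.append(f"output:{finding['type']}")

    return result, warnings
```

Note that a marker placed after the first 4096 bytes of the arguments passes the input scan in this sketch, which mirrors the truncation trade-off.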

Detectable PII Types

The built-in RegexPIIScanner detects 11 types of sensitive data across two categories:

Personal Information (PII)

Type	Pattern	Example Match
`ssn`	`\d{3}-\d{2}-\d{4}`	`123-45-6789`
`email`	Standard email format	`user@example.com`
`phone`	US phone numbers with optional +1	`(555) 123-4567`
`credit_card`	16-digit card numbers with optional separators	`4111-1111-1111-1111`

Credentials and Secrets

Type	Pattern	Example Match
`password`	`password=`, `pwd:`, `passwd=` followed by value	`password=s3cret`
`api_key`	`api_key=`, `api-secret:` followed by value	`api_key=abc123xyz`
`secret`	`secret_key=`, `access_key=`, `client_secret=`	`client_secret=xyz789`
`aws_key`	AWS access key IDs (`AKIA` prefix + 16 chars)	`AKIAIOSFODNN7EXAMPLE`
`generic_token`	Tokens with known prefixes (`sk-`, `pk_live_`, etc.)	`sk-abc123def456ghi789jkl`
`github_pat`	GitHub personal access tokens (`ghp_` prefix)	`ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`
`waxell_key`	Waxell API keys (`wax_sk_` prefix)	`wax_sk_abc123`
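
To get a feel for what these patterns match, here is a rough approximation of three of them using only the standard library. The regexes are assumptions reconstructed from the table (the word boundaries and the 36-character GitHub token body are additions); the patterns inside RegexPIIScanner may differ.

```python
import re

# Approximations of three documented patterns; not the library's own regexes.
APPROX_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def detect_types(text: str) -> list:
    """Return the sorted names of all approximate patterns found in `text`."""
    return sorted(
        name for name, pattern in APPROX_PATTERNS.items() if pattern.search(text)
    )


print(detect_types("SSN 123-45-6789, key AKIAIOSFODNN7EXAMPLE"))
```

Running candidate strings through a sketch like this is a quick way to predict which tool arguments are likely to trigger a finding before you assign blocking actions.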

PII Actions

Each PII type can be assigned one of three actions. When multiple types are detected, the highest-severity action among them is recorded on the span:

Action	Severity	Behavior
`warn`	0 (lowest)	Log a warning, record finding on span, continue with tool call
`redact`	1	Replace detected pattern with `##TYPE##` placeholder (e.g., `##SSN##`)
`block`	2 (highest)	Stop the tool call entirely, raise `PolicyViolationError`

The default action for all types is warn. You override actions per type using the pii_actions dictionary.
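
The severity ordering, the warn default, and the redaction placeholder can be illustrated with a short sketch. The function names here (`resolve_action`, `highest_action`, `redact`) are hypothetical and only mirror the behavior described in the table; they are not part of the library.

```python
import re
from typing import Optional

SEVERITY = {"warn": 0, "redact": 1, "block": 2}  # ordering from the table above


def resolve_action(pii_type: str, pii_actions: dict) -> str:
    """Look up the configured action, falling back to the 'warn' default."""
    return pii_actions.get(pii_type, "warn")


def highest_action(actions: list) -> Optional[str]:
    """Return the most severe action in the list, or None if it is empty."""
    return max(actions, key=SEVERITY.__getitem__) if actions else None


def redact(text: str, pii_type: str, pattern: re.Pattern) -> str:
    """Replace every match with a ##TYPE## placeholder, e.g. ##SSN##."""
    return pattern.sub(f"##{pii_type.upper()}##", text)


actions = [resolve_action(t, {"ssn": "block"}) for t in ("ssn", "email")]
print(highest_action(actions))  # block
print(redact("SSN 123-45-6789 on file", "ssn", re.compile(r"\d{3}-\d{2}-\d{4}")))
```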

Step 1: Enable PII Scanning

PII scanning is part of MCP governance. Enable it by calling configure_session() with a governance_config that includes pii_actions:

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session

waxell.init()

# After session.initialize()
configure_session(
    session,
    server_name="my-server",
    governance_config={
        "agent_name": "my-agent",
        "scan_inputs": True,     # Scan tool arguments (default: True)
        "scan_outputs": True,    # Scan tool results (default: True)
        "pii_actions": {
            "ssn": "block",          # Block calls containing SSNs
            "credit_card": "block",  # Block calls containing credit cards
            "email": "warn",         # Warn on email addresses
            "api_key": "block",      # Block calls containing API keys
            "aws_key": "block",      # Block AWS access keys
        },
    },
)

With this configuration:

A tool call with {"query": "SSN 123-45-6789"} in the arguments is blocked before it reaches the server.
A tool call with {"query": "contact user@example.com"} proceeds but logs a warning and records the finding on the span.
A tool result containing an AWS key triggers a warning in the output scan (output scans never block).

Minimal setup

If you want all-warn scanning (detect but never block), pass an empty pii_actions:

governance_config={
    "agent_name": "my-agent",
    "pii_actions": {},  # All types default to "warn"
}

Step 2: Use a Custom PII Scanner

The built-in RegexPIIScanner covers common patterns. For more advanced detection -- ML-based NER, domain-specific patterns, or integration with tools like Microsoft Presidio -- you can provide your own scanner.

A custom scanner must implement the PIIScanner protocol:

from waxell_observe.scanning import PIIScanner

class MyScanner:
    """Custom PII scanner using Presidio."""

    def scan(self, text: str) -> dict:
        """Scan text and return findings.

        Must return:
            {
                "detected": bool,
                "count": int,
                "findings": [
                    {"type": "ssn", "category": "pii", "action": "block"},
                    {"type": "email", "category": "pii", "action": "warn"},
                ]
            }
        """
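        # Your detection logic here. _run_presidio is a placeholder for a
        # method you implement; it must return a list of finding dicts.
        findings = self._run_presidio(text)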
        return {
            "detected": bool(findings),
            "count": len(findings),
            "findings": findings,
        }

Each finding dict must have:

type (str): The PII type name (e.g., "ssn", "email", "custom_id")
category (str): "pii" or "credential"
action (str): "warn", "block", or "redact"
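
As a concrete illustration of this return shape, here is a small domain-specific scanner that needs only the standard library. The `EMP-123456` employee ID format and the `employee_id` type name are invented for this example, and emitting one finding per match is an assumption; the guide does not say whether repeated matches should be deduplicated.

```python
import re


class EmployeeIdScanner:
    """Flags hypothetical internal IDs of the form EMP-123456."""

    _PATTERN = re.compile(r"\bEMP-\d{6}\b")

    def __init__(self, action: str = "block"):
        if action not in ("warn", "block", "redact"):
            raise ValueError(f"unknown action: {action!r}")
        self._action = action

    def scan(self, text: str) -> dict:
        # One finding per match (an assumption). Findings carry only the
        # type, category, and action, never the matched value itself.
        findings = [
            {"type": "employee_id", "category": "pii", "action": self._action}
            for _ in self._PATTERN.finditer(text)
        ]
        return {
            "detected": bool(findings),
            "count": len(findings),
            "findings": findings,
        }


print(EmployeeIdScanner().scan("Ticket opened by EMP-123456"))
```

Because the scanner owns its action mapping, the action is passed to the constructor rather than read from pii_actions.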

Pass your custom scanner in the governance config:

configure_session(
    session,
    server_name="my-server",
    governance_config={
        "agent_name": "my-agent",
        "pii_scanner": MyScanner(),  # Your custom scanner
    },
)

When pii_scanner is provided, it replaces the default RegexPIIScanner entirely. If you want to combine them, call both inside your custom scanner's scan() method:

from waxell_observe.scanning import RegexPIIScanner

class CombinedScanner:
    def __init__(self):
        self._regex = RegexPIIScanner(actions={"ssn": "block"})

    def scan(self, text: str) -> dict:
        # Run built-in regex scanner
        regex_result = self._regex.scan(text)
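        # Run your own custom logic. _detect_custom_patterns is a placeholder
        # for a method you implement; it must return a list of finding dicts.
        custom_findings = self._detect_custom_patterns(text)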
        # Merge findings
        all_findings = regex_result["findings"] + custom_findings
        return {
            "detected": bool(all_findings),
            "count": len(all_findings),
            "findings": all_findings,
        }

Runtime checkable protocol

PIIScanner is a runtime_checkable Protocol. You do not need to inherit from it -- any class with a scan(text: str) -> dict method works.
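
The structural check can be demonstrated with a local stand-in protocol. `ScannerProtocol` below is a hypothetical mirror of the described PIIScanner shape, defined here so the example runs without the library installed.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ScannerProtocol(Protocol):
    """Local stand-in that mirrors the described PIIScanner shape."""

    def scan(self, text: str) -> dict: ...


class DuckScanner:
    """Has a scan() method but does not inherit from the protocol."""

    def scan(self, text: str) -> dict:
        return {"detected": False, "count": 0, "findings": []}


class NotAScanner:
    """Lacks scan(), so it does not satisfy the protocol."""

    def detect(self, text: str) -> dict:
        return {}


print(isinstance(DuckScanner(), ScannerProtocol))  # True
print(isinstance(NotAScanner(), ScannerProtocol))  # False
```

Keep in mind that isinstance() against a runtime-checkable protocol only verifies that the method exists. It does not check the signature or the shape of the returned dict, so a scanner can pass this check and still return malformed results.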

What Appears in Traces

When PII scanning runs, the following attributes are recorded on the MCP tool span:

Span Attribute	Type	Description
`waxell.mcp.pii_detected`	bool	`True` if any PII or credentials were found
`waxell.mcp.pii_count`	int	Total number of findings across input and output scans
`waxell.mcp.pii_scan_direction`	string	Which scan found PII: `"inputs"`, `"outputs"`, or `"both"`
`waxell.mcp.pii_action_taken`	string	Highest-severity action across all findings: `"warn"`, `"block"`, or `"redact"`
`waxell.mcp.governance_checked`	bool	`True` when governance ran (always true when PII scanning is active)
`waxell.mcp.governance_timestamp`	string	ISO 8601 timestamp of the governance check

PII values are never on spans

The actual PII values (SSN digits, email addresses, etc.) are intentionally not recorded on spans. Only summary attributes (detected/count/action) are stored. This prevents your tracing backend from becoming a PII liability.
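
One way to picture how these summary attributes relate to the scan results is the sketch below. It is an assumption about the derivation, not the library's code: `summarize_scans` is a hypothetical helper, and it returns an empty dict when nothing is found on the premise that the PII attributes are only set on detection.

```python
SEVERITY = {"warn": 0, "redact": 1, "block": 2}


def summarize_scans(input_result: dict, output_result: dict) -> dict:
    """Derive summary span attributes from input and output scan results."""
    in_findings = input_result["findings"]
    out_findings = output_result["findings"]
    all_findings = in_findings + out_findings
    if not all_findings:
        return {}  # nothing detected: no PII attributes in this sketch

    if in_findings and out_findings:
        direction = "both"
    else:
        direction = "inputs" if in_findings else "outputs"

    # Only counts, direction, and the action are summarized; no matched values.
    return {
        "waxell.mcp.pii_detected": True,
        "waxell.mcp.pii_count": len(all_findings),
        "waxell.mcp.pii_scan_direction": direction,
        "waxell.mcp.pii_action_taken": max(
            (f["action"] for f in all_findings), key=SEVERITY.__getitem__
        ),
    }
```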

You can query for PII events in Grafana using TraceQL:

{ span.waxell.mcp.pii_detected = true }

Or filter by action:

{ span.waxell.mcp.pii_action_taken = "block" }

Full Example

Complete runnable example with PII scanning

"""MCP tool call with PII scanning enabled.

Demonstrates:
- Blocking SSNs and credit cards in tool inputs
- Warning on email addresses
- Output scanning (warn-only)
"""
import asyncio

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session

waxell.init()

# Import MCP after init() so the instrumentor patches it
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main():
    server_params = StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Configure PII scanning
            configure_session(
                session,
                server_name="filesystem",
                governance_config={
                    "agent_name": "file-assistant",
                    "scan_inputs": True,
                    "scan_outputs": True,
                    "pii_actions": {
                        "ssn": "block",
                        "credit_card": "block",
                        "email": "warn",
                        "api_key": "block",
                        "aws_key": "block",
                        "github_pat": "block",
                    },
                },
            )

            # This call succeeds -- no PII in arguments
            result = await session.call_tool(
                name="read_file",
                arguments={"path": "/tmp/readme.txt"},
            )
            print("Safe call succeeded:", result.content[0].text[:100])

            # This call is BLOCKED -- SSN in arguments
            try:
                result = await session.call_tool(
                    name="write_file",
                    arguments={
                        "path": "/tmp/output.txt",
                        "content": "User SSN: 123-45-6789",
                    },
                )
            except Exception as e:
                print(f"Blocked as expected: {e}")


if __name__ == "__main__":
    asyncio.run(main())

Troubleshooting

PII scanning is not blocking anything

Check that governance_config is set. PII scanning only runs when governance_config is passed to configure_session(). Without it, no scanning occurs.
Check the action level. The default action is "warn", not "block". You must explicitly set "block" for types you want to stop:
```
"pii_actions": {"ssn": "block", "credit_card": "block"}
```
Check that scan_inputs is not disabled. It defaults to True, but if you set "scan_inputs": False, input scanning is skipped.

Output scanning detected PII but didn't block

This is expected behavior. Output scanning runs after the tool has already executed, so it cannot block. It records a warning on the span and logs the finding. To prevent sensitive data from being returned, block the input patterns that would cause the server to fetch sensitive data.

Custom scanner is not being used

Make sure you pass the scanner instance in governance_config["pii_scanner"], not pii_actions:

# Correct
governance_config={"pii_scanner": MyScanner()}

# Wrong -- this creates a RegexPIIScanner with these actions
governance_config={"pii_actions": {"ssn": "block"}}

When pii_scanner is provided, pii_actions is ignored. The custom scanner is responsible for its own action mapping.

Scanner errors are silently ignored

PII scanning is fail-open by design. If your custom scanner raises an exception, the tool call proceeds as if no PII was detected, and a warning is logged:

MCP PII scan failed: <error> -- treating as no PII (fail-open)

Check your application logs for these warnings. Fix the scanner bug, then redeploy.
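
The fail-open behavior amounts to the pattern sketched below. This is an illustration under assumptions, not the library's code: `scan_fail_open`, `BrokenScanner`, and the logger name are hypothetical, and only the log message format is taken from the text above.

```python
import logging

logger = logging.getLogger("pii_scan_sketch")  # hypothetical logger name


def scan_fail_open(scanner, text: str) -> dict:
    """Run the scanner; on any exception, log a warning and report no PII."""
    try:
        return scanner.scan(text)
    except Exception as exc:
        logger.warning(
            "MCP PII scan failed: %s -- treating as no PII (fail-open)", exc
        )
        return {"detected": False, "count": 0, "findings": []}


class BrokenScanner:
    """Simulates a custom scanner with a bug."""

    def scan(self, text: str) -> dict:
        raise RuntimeError("model not loaded")


# The tool call would proceed because the result reports no findings.
print(scan_fail_open(BrokenScanner(), "SSN 123-45-6789"))
```

The practical consequence is that a buggy scanner weakens protection without breaking tool calls, so alerting on this log line is worth considering.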

TraceQL queries return no PII results

PII attributes are only set when PII is actually detected. If waxell.mcp.pii_detected is absent from a span, it means no PII was found (not that scanning was skipped). To verify scanning is active, check for waxell.mcp.governance_checked = true on the span.
