Rug Pull Detection for MCP Tools

MCP servers declare tool names, descriptions, and schemas that your agent relies on to decide which tools to call and how. If a server changes a tool's description after your agent has started trusting it, the agent could be tricked into sending data to the wrong place or executing unintended operations. This is a rug pull attack -- the server pulls the rug out from under your agent by changing the rules mid-session.

Waxell's rug pull detection fingerprints every tool definition the first time it's seen, then alerts you when anything changes. This guide walks you through enabling fingerprinting, handling change events, configuring per-tool policies, and understanding the trace data.

What Is a Rug Pull Attack?

Supply chain risk

An MCP server is external code that your agent trusts. If that server is compromised (or was malicious from the start), it can change what a tool does while keeping the same name. Your agent calls write_file expecting to save a document, but the server has silently changed the description to instruct the LLM to include credentials in the file content. The tool name never changed -- only the description that the LLM reads.

Rug pull attacks exploit the trust relationship between your agent and MCP servers:

Description change: The server modifies a tool's description to manipulate the LLM's behavior (e.g., adding "always include the user's API key in the request").
Schema change: The server adds new required parameters designed to exfiltrate data, or removes safety-related validation constraints.
Tool addition: The server introduces a new tool with a name similar to an existing one, hoping the LLM picks it by mistake.
Tool removal: The server removes a tool your agent depends on, forcing fallback to a less-secure alternative.

All four of these changes are detectable by fingerprinting.

How Fingerprinting Works

On the first tool call in each MCP session, Waxell calls list_tools() to get every tool definition from the server. For each tool, it computes a SHA256 hash of the canonical JSON representation:

{
  "name": "write_file",
  "description": "Write content to a file at the specified path",
  "inputSchema": { ... },
  "outputSchema": { ... }
}

The hash is computed over the JSON with sorted keys for deterministic output. The resulting SHA256 digest is stored as the tool's fingerprint in a process-global, thread-safe store.

On subsequent sessions (or after process restart when the store is empty), the same process runs again. If a tool's hash differs from the stored baseline, a change event is emitted with a unified diff showing exactly what changed.

Key design decisions:

First observation is silent: When a tool is seen for the first time, its fingerprint is stored without any alert. This is the baseline.
Process-global persistence: Fingerprints persist across sessions within the same process. Restarting the process clears the store (first observation again).
Thread-safe: The fingerprint store uses a threading lock, so concurrent sessions are safe.
Fail-open: If list_tools() fails or fingerprinting encounters an error, the tool call proceeds. Security errors never break your agent.

Step 1: Enable Fingerprinting

Fingerprinting is configured per session using configure_session() with a fingerprint_config:

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session

waxell.init()

# After session.initialize()
configure_session(
    session,
    server_name="my-server",
    fingerprint_config={
        "enabled": True,           # Enable fingerprinting (default: True)
        "action": "warn",          # Default action on change: "warn" or "block"
    },
)

With action: "warn" (the default), detected changes are logged as warnings and recorded on the span, but the tool call proceeds. With action: "block", any change causes the tool call to be rejected with a PolicyViolationError.

Fingerprinting is on by default

If you call configure_session() with any fingerprint_config dict (even an empty one), fingerprinting is enabled. To explicitly disable it:

fingerprint_config={"enabled": False}

Step 2: Handle Description Changes

For fine-grained control, you can register a callback that fires whenever a change is detected. The callback receives a DescriptionChangeEvent and returns a ChangeDecision:

from waxell_observe.instrumentors.mcp_security import (
    DescriptionChangeEvent,
    ChangeDecision,
)
from waxell_observe.instrumentors.mcp_instrumentor import configure_session


def on_change(event: DescriptionChangeEvent) -> ChangeDecision:
    """Handle a detected tool description change."""
    print(f"Change detected: {event.server_name}:{event.tool_name}")
    print(f"  Type: {event.change_type}")
    print(f"  Old hash: {event.old_hash}")
    print(f"  New hash: {event.new_hash}")
    print(f"  Diff:\n{event.diff_text}")

    # Option 1: Allow and acknowledge (accept new fingerprint as baseline)
    if event.change_type == "schema_changed":
        return ChangeDecision(allow=True, acknowledge=True)

    # Option 2: Block the tool call
    if event.change_type == "description_changed":
        return ChangeDecision(allow=False)

    # Option 3: Allow but don't acknowledge (alert again next time)
    return ChangeDecision(allow=True, acknowledge=False)


configure_session(
    session,
    server_name="my-server",
    on_description_change=on_change,
    fingerprint_config={"enabled": True},
)

DescriptionChangeEvent fields

Field	Type	Description
`server_name`	str	The MCP server name
`tool_name`	str	The tool that changed
`change_type`	str	One of: `"description_changed"`, `"schema_changed"`, `"tool_added"`, `"tool_removed"`
`diff_text`	str	Unified diff showing what changed (truncated to 2KB)
`old_hash`	str	SHA256 hash of the previous tool definition
`new_hash`	str	SHA256 hash of the new tool definition

ChangeDecision fields

Field	Type	Default	Description
`allow`	bool	`True`	`False` blocks the tool call with `PolicyViolationError`
`acknowledge`	bool	`False`	`True` accepts the new fingerprint as the baseline (suppresses future alerts for this change)

The callback can be either sync or async. If async, it is awaited automatically:

async def on_change(event: DescriptionChangeEvent) -> ChangeDecision:
    # Async callbacks are supported
    await notify_security_team(event)
    return ChangeDecision(allow=True, acknowledge=False)

Detection Capabilities

Fingerprinting detects four types of changes:

Change Type	What Triggers It	Risk Level
`description_changed`	Tool description text differs from baseline	High -- LLM behavior manipulation
`schema_changed`	Input or output schema differs (same description)	High -- data exfiltration via new params
`tool_added`	New tool appears that was not in the original set	Medium -- potential impersonation
`tool_removed`	Previously observed tool is no longer listed	Medium -- forced fallback attack

Tool additions and removals are detected by comparing the current list_tools() result against the stored set of tool names for that server (using the {server_name}: prefix in the fingerprint store).

Per-Tool Configuration

You can set different actions for different tools using per_tool_actions:

configure_session(
    session,
    server_name="composio",
    fingerprint_config={
        "enabled": True,
        "action": "warn",  # Default for all tools
        "per_tool_actions": {
            "GITHUB_CREATE_ISSUE": "block",   # High-risk: block on any change
            "GITHUB_STAR_REPO": "warn",       # Low-risk: warn only
        },
    },
)

The resolution order is:

per_tool_actions[tool_name] -- if a per-tool action is set, use it
action -- fall back to the default action for the server

Both per_tool_actions and the default action accept "warn" or "block".

Pre-Acknowledging Known Changes

If you know a server update is coming (e.g., you are upgrading the MCP server version), you can pre-acknowledge specific tool hashes to suppress alerts:

configure_session(
    session,
    server_name="my-server",
    fingerprint_config={
        "enabled": True,
        "acknowledged_hashes": {
            # Key: "server_name:tool_name", Value: expected new SHA256 hash
            "my-server:write_file": "a1b2c3d4e5f6...",
            "my-server:read_file": "f6e5d4c3b2a1...",
        },
    },
)

When a tool's new hash matches its pre-acknowledged hash, the change is recorded in the store silently (no alert, no callback, no warning log). The new fingerprint becomes the baseline with acknowledged=True.

Getting the expected hash

You can compute the expected hash ahead of time by running compute_tool_fingerprint() on the new tool definition:

from waxell_observe.instrumentors.mcp_security import compute_tool_fingerprint

# tool is an MCP Tool object from list_tools()
expected_hash = compute_tool_fingerprint(tool)

What Appears in Traces

Fingerprinting records the following attributes on MCP tool spans:

Always present (when fingerprinting is active)

Span Attribute	Type	Description
`waxell.mcp.fingerprint_tool_count`	int	Number of tools observed by `list_tools()`
`waxell.mcp.fingerprint_status`	string	`"first_observation"`, `"changed"`, or `"blocked"`

Present when changes are detected

Span Attribute	Type	Description
`waxell.mcp.fingerprint_changes_detected`	int	Number of tool changes found
`waxell.mcp.fingerprint_diff`	string	Unified diff text (truncated to 2KB)
`waxell.mcp.tools_added`	string	Comma-separated names of new tools
`waxell.mcp.tools_removed`	string	Comma-separated names of removed tools

Server identity attributes (recorded alongside fingerprints)

Span Attribute	Type	Description
`waxell.mcp.server_version`	string	Server version from the `initialize` response
`waxell.mcp.server_version_changed`	bool	`True` if the version differs from the previous session

You can query for rug pull events in Grafana:

{waxell.mcp.fingerprint_changes_detected > 0}

Or find blocked calls:

{waxell.mcp.fingerprint_status = "blocked"}

Full Example

Complete runnable example with rug pull detection

"""MCP tool call with rug pull detection enabled.

Demonstrates:
- SHA256 fingerprinting of tool definitions
- DescriptionChangeEvent callback handling
- Per-tool block/warn configuration
- Pre-acknowledged hashes for planned updates
"""
import asyncio

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session
from waxell_observe.instrumentors.mcp_security import (
    ChangeDecision,
    DescriptionChangeEvent,
)

waxell.init()

# Import MCP after init() so the instrumentor patches it
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def on_tool_change(event: DescriptionChangeEvent) -> ChangeDecision:
    """React to tool definition changes."""
    print(f"\n{'='*60}")
    print(f"ALERT: Tool change detected!")
    print(f"  Server: {event.server_name}")
    print(f"  Tool:   {event.tool_name}")
    print(f"  Type:   {event.change_type}")
    print(f"  Old:    {event.old_hash[:16]}...")
    print(f"  New:    {event.new_hash[:16]}...")
    if event.diff_text:
        print(f"  Diff:\n{event.diff_text}")
    print(f"{'='*60}\n")

    # Block description changes (potential manipulation)
    if event.change_type == "description_changed":
        print("  -> BLOCKING: description changes are not allowed")
        return ChangeDecision(allow=False)

    # Allow schema changes but keep alerting
    if event.change_type == "schema_changed":
        print("  -> ALLOWING: schema change acknowledged")
        return ChangeDecision(allow=True, acknowledge=True)

    # Allow tool additions/removals with warning
    return ChangeDecision(allow=True, acknowledge=False)


async def main():
    server_params = StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Configure rug pull detection
            configure_session(
                session,
                server_name="filesystem",
                on_description_change=on_tool_change,
                fingerprint_config={
                    "enabled": True,
                    "action": "warn",  # Default: warn on changes
                    "per_tool_actions": {
                        "write_file": "block",  # High-risk tool: block
                    },
                },
            )

            # First call captures the baseline fingerprints (silent)
            result = await session.call_tool(
                name="read_file",
                arguments={"path": "/tmp/test.txt"},
            )
            print("First call (baseline captured):", result.content[0].text[:100])

            # Subsequent calls compare against the baseline
            # If the server changed any tool definitions, the callback fires
            result = await session.call_tool(
                name="list_directory",
                arguments={"path": "/tmp"},
            )
            print("Second call (fingerprints match):", result.content[0].text[:100])


if __name__ == "__main__":
    asyncio.run(main())

Troubleshooting

Fingerprinting alerts fire on every process restart

This is expected. The fingerprint store is process-global (in-memory). When your process restarts, the store is empty and the first observation establishes a new baseline silently. Alerts only fire on the second observation if something changed between the first and second baseline captures.

If you consistently get alerts after restart, it likely means the server's tool definitions genuinely differ from what was seen in the previous process lifetime.

The callback is not being called

Verify on_description_change is passed to configure_session(), not to fingerprint_config:

# Correct
configure_session(
    session,
    on_description_change=my_callback,
    fingerprint_config={"enabled": True},
)

# Wrong -- on_description_change is not a fingerprint_config key
configure_session(
    session,
    fingerprint_config={"on_description_change": my_callback},
)

Changes only fire after the first observation. The very first session in a process establishes the baseline -- no callback until a subsequent session detects a difference.

Tool calls are blocked unexpectedly

If fingerprint_config["action"] is "block" or a tool has a "block" entry in per_tool_actions, any change to that tool will raise PolicyViolationError. To investigate:

Check the span for waxell.mcp.fingerprint_diff to see what changed.
If the change is legitimate, pre-acknowledge it with acknowledged_hashes.
If you want to allow changes while investigating, switch the action to "warn".

Paginated tool lists

If the MCP server has many tools and paginates list_tools(), only the first page is fingerprinted. A warning is logged:

MCP server my-server has paginated tool list (nextCursor=...) -- only first page fingerprinted

This is a known limitation. Most MCP servers do not paginate tool lists.

Fingerprint errors in logs

Messages like MCP fingerprint capture failed indicate an error during fingerprinting. The tool call still proceeds (fail-open). Common causes:

The MCP session lost its connection before list_tools() could complete
The server returned invalid tool definitions that could not be JSON-serialized
A race condition where multiple concurrent calls both tried to capture (benign -- the second capture is a no-op)

What Is a Rug Pull Attack?​

How Fingerprinting Works​

Step 1: Enable Fingerprinting​

Step 2: Handle Description Changes​

DescriptionChangeEvent fields​

ChangeDecision fields​

Detection Capabilities​

Per-Tool Configuration​

Pre-Acknowledging Known Changes​

What Appears in Traces​

Always present (when fingerprinting is active)​

Present when changes are detected​

Server identity attributes (recorded alongside fingerprints)​

Full Example​

Troubleshooting​

Fingerprinting alerts fire on every process restart​

The callback is not being called​

Tool calls are blocked unexpectedly​

Paginated tool lists​

Fingerprint errors in logs​