Skip to main content

Rug Pull Detection for MCP Tools

MCP servers declare tool names, descriptions, and schemas that your agent relies on to decide which tools to call and how. If a server changes a tool's description after your agent has started trusting it, the agent could be tricked into sending data to the wrong place or executing unintended operations. This is a rug pull attack -- the server pulls the rug out from under your agent by changing the rules mid-session.

Waxell's rug pull detection fingerprints every tool definition the first time it's seen, then alerts you when anything changes. This guide walks you through enabling fingerprinting, handling change events, configuring per-tool policies, and understanding the trace data.

What Is a Rug Pull Attack?

Supply chain risk

An MCP server is external code that your agent trusts. If that server is compromised (or was malicious from the start), it can change what a tool does while keeping the same name. Your agent calls write_file expecting to save a document, but the server has silently changed the description to instruct the LLM to include credentials in the file content. The tool name never changed -- only the description that the LLM reads.

Rug pull attacks exploit the trust relationship between your agent and MCP servers:

  1. Description change: The server modifies a tool's description to manipulate the LLM's behavior (e.g., adding "always include the user's API key in the request").
  2. Schema change: The server adds new required parameters designed to exfiltrate data, or removes safety-related validation constraints.
  3. Tool addition: The server introduces a new tool with a name similar to an existing one, hoping the LLM picks it by mistake.
  4. Tool removal: The server removes a tool your agent depends on, forcing fallback to a less-secure alternative.

All four of these changes are detectable by fingerprinting.

How Fingerprinting Works

On the first tool call in each MCP session, Waxell calls list_tools() to get every tool definition from the server. For each tool, it computes a SHA256 hash of the canonical JSON representation:

{
"name": "write_file",
"description": "Write content to a file at the specified path",
"inputSchema": { ... },
"outputSchema": { ... }
}

The hash is computed over the JSON with sorted keys for deterministic output. The resulting SHA256 digest is stored as the tool's fingerprint in a process-global, thread-safe store.

On subsequent sessions (or after process restart when the store is empty), the same process runs again. If a tool's hash differs from the stored baseline, a change event is emitted with a unified diff showing exactly what changed.

Key design decisions:

  • First observation is silent: When a tool is seen for the first time, its fingerprint is stored without any alert. This is the baseline.
  • Process-global persistence: Fingerprints persist across sessions within the same process. Restarting the process clears the store (first observation again).
  • Thread-safe: The fingerprint store uses a threading lock, so concurrent sessions are safe.
  • Fail-open: If list_tools() fails or fingerprinting encounters an error, the tool call proceeds. Security errors never break your agent.

Step 1: Enable Fingerprinting

Fingerprinting is configured per session using configure_session() with a fingerprint_config:

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session

waxell.init()

# After session.initialize()
configure_session(
session,
server_name="my-server",
fingerprint_config={
"enabled": True, # Enable fingerprinting (default: True)
"action": "warn", # Default action on change: "warn" or "block"
},
)

With action: "warn" (the default), detected changes are logged as warnings and recorded on the span, but the tool call proceeds. With action: "block", any change causes the tool call to be rejected with a PolicyViolationError.

Fingerprinting is on by default

If you call configure_session() with any fingerprint_config dict (even an empty one), fingerprinting is enabled. To explicitly disable it:

fingerprint_config={"enabled": False}

Step 2: Handle Description Changes

For fine-grained control, you can register a callback that fires whenever a change is detected. The callback receives a DescriptionChangeEvent and returns a ChangeDecision:

from waxell_observe.instrumentors.mcp_security import (
DescriptionChangeEvent,
ChangeDecision,
)
from waxell_observe.instrumentors.mcp_instrumentor import configure_session


def on_change(event: DescriptionChangeEvent) -> ChangeDecision:
"""Handle a detected tool description change."""
print(f"Change detected: {event.server_name}:{event.tool_name}")
print(f" Type: {event.change_type}")
print(f" Old hash: {event.old_hash}")
print(f" New hash: {event.new_hash}")
print(f" Diff:\n{event.diff_text}")

# Option 1: Allow and acknowledge (accept new fingerprint as baseline)
if event.change_type == "schema_changed":
return ChangeDecision(allow=True, acknowledge=True)

# Option 2: Block the tool call
if event.change_type == "description_changed":
return ChangeDecision(allow=False)

# Option 3: Allow but don't acknowledge (alert again next time)
return ChangeDecision(allow=True, acknowledge=False)


configure_session(
session,
server_name="my-server",
on_description_change=on_change,
fingerprint_config={"enabled": True},
)

DescriptionChangeEvent fields

FieldTypeDescription
server_namestrThe MCP server name
tool_namestrThe tool that changed
change_typestrOne of: "description_changed", "schema_changed", "tool_added", "tool_removed"
diff_textstrUnified diff showing what changed (truncated to 2KB)
old_hashstrSHA256 hash of the previous tool definition
new_hashstrSHA256 hash of the new tool definition

ChangeDecision fields

FieldTypeDefaultDescription
allowboolTrueFalse blocks the tool call with PolicyViolationError
acknowledgeboolFalseTrue accepts the new fingerprint as the baseline (suppresses future alerts for this change)

The callback can be either sync or async. If async, it is awaited automatically:

async def on_change(event: DescriptionChangeEvent) -> ChangeDecision:
# Async callbacks are supported
await notify_security_team(event)
return ChangeDecision(allow=True, acknowledge=False)

Detection Capabilities

Fingerprinting detects four types of changes:

Change TypeWhat Triggers ItRisk Level
description_changedTool description text differs from baselineHigh -- LLM behavior manipulation
schema_changedInput or output schema differs (same description)High -- data exfiltration via new params
tool_addedNew tool appears that was not in the original setMedium -- potential impersonation
tool_removedPreviously observed tool is no longer listedMedium -- forced fallback attack

Tool additions and removals are detected by comparing the current list_tools() result against the stored set of tool names for that server (using the {server_name}: prefix in the fingerprint store).

Per-Tool Configuration

You can set different actions for different tools using per_tool_actions:

configure_session(
session,
server_name="composio",
fingerprint_config={
"enabled": True,
"action": "warn", # Default for all tools
"per_tool_actions": {
"GITHUB_CREATE_ISSUE": "block", # High-risk: block on any change
"GITHUB_STAR_REPO": "warn", # Low-risk: warn only
},
},
)

The resolution order is:

  1. per_tool_actions[tool_name] -- if a per-tool action is set, use it
  2. action -- fall back to the default action for the server

Both per_tool_actions and the default action accept "warn" or "block".

Pre-Acknowledging Known Changes

If you know a server update is coming (e.g., you are upgrading the MCP server version), you can pre-acknowledge specific tool hashes to suppress alerts:

configure_session(
session,
server_name="my-server",
fingerprint_config={
"enabled": True,
"acknowledged_hashes": {
# Key: "server_name:tool_name", Value: expected new SHA256 hash
"my-server:write_file": "a1b2c3d4e5f6...",
"my-server:read_file": "f6e5d4c3b2a1...",
},
},
)

When a tool's new hash matches its pre-acknowledged hash, the change is recorded in the store silently (no alert, no callback, no warning log). The new fingerprint becomes the baseline with acknowledged=True.

Getting the expected hash

You can compute the expected hash ahead of time by running compute_tool_fingerprint() on the new tool definition:

from waxell_observe.instrumentors.mcp_security import compute_tool_fingerprint

# tool is an MCP Tool object from list_tools()
expected_hash = compute_tool_fingerprint(tool)

What Appears in Traces

Fingerprinting records the following attributes on MCP tool spans:

Always present (when fingerprinting is active)

Span AttributeTypeDescription
waxell.mcp.fingerprint_tool_countintNumber of tools observed by list_tools()
waxell.mcp.fingerprint_statusstring"first_observation", "changed", or "blocked"

Present when changes are detected

Span AttributeTypeDescription
waxell.mcp.fingerprint_changes_detectedintNumber of tool changes found
waxell.mcp.fingerprint_diffstringUnified diff text (truncated to 2KB)
waxell.mcp.tools_addedstringComma-separated names of new tools
waxell.mcp.tools_removedstringComma-separated names of removed tools

Server identity attributes (recorded alongside fingerprints)

Span AttributeTypeDescription
waxell.mcp.server_versionstringServer version from the initialize response
waxell.mcp.server_version_changedboolTrue if the version differs from the previous session

You can query for rug pull events in Grafana:

{waxell.mcp.fingerprint_changes_detected > 0}

Or find blocked calls:

{waxell.mcp.fingerprint_status = "blocked"}

Full Example

Complete runnable example with rug pull detection
"""MCP tool call with rug pull detection enabled.

Demonstrates:
- SHA256 fingerprinting of tool definitions
- DescriptionChangeEvent callback handling
- Per-tool block/warn configuration
- Pre-acknowledged hashes for planned updates
"""
import asyncio

import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session
from waxell_observe.instrumentors.mcp_security import (
ChangeDecision,
DescriptionChangeEvent,
)

waxell.init()

# Import MCP after init() so the instrumentor patches it
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def on_tool_change(event: DescriptionChangeEvent) -> ChangeDecision:
"""React to tool definition changes."""
print(f"\n{'='*60}")
print(f"ALERT: Tool change detected!")
print(f" Server: {event.server_name}")
print(f" Tool: {event.tool_name}")
print(f" Type: {event.change_type}")
print(f" Old: {event.old_hash[:16]}...")
print(f" New: {event.new_hash[:16]}...")
if event.diff_text:
print(f" Diff:\n{event.diff_text}")
print(f"{'='*60}\n")

# Block description changes (potential manipulation)
if event.change_type == "description_changed":
print(" -> BLOCKING: description changes are not allowed")
return ChangeDecision(allow=False)

# Allow schema changes but keep alerting
if event.change_type == "schema_changed":
print(" -> ALLOWING: schema change acknowledged")
return ChangeDecision(allow=True, acknowledge=True)

# Allow tool additions/removals with warning
return ChangeDecision(allow=True, acknowledge=False)


async def main():
server_params = StdioServerParameters(
command="npx",
args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
)

async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()

# Configure rug pull detection
configure_session(
session,
server_name="filesystem",
on_description_change=on_tool_change,
fingerprint_config={
"enabled": True,
"action": "warn", # Default: warn on changes
"per_tool_actions": {
"write_file": "block", # High-risk tool: block
},
},
)

# First call captures the baseline fingerprints (silent)
result = await session.call_tool(
name="read_file",
arguments={"path": "/tmp/test.txt"},
)
print("First call (baseline captured):", result.content[0].text[:100])

# Subsequent calls compare against the baseline
# If the server changed any tool definitions, the callback fires
result = await session.call_tool(
name="list_directory",
arguments={"path": "/tmp"},
)
print("Second call (fingerprints match):", result.content[0].text[:100])


if __name__ == "__main__":
asyncio.run(main())

Troubleshooting

Fingerprinting alerts fire on every process restart

This is expected. The fingerprint store is process-global (in-memory). When your process restarts, the store is empty and the first observation establishes a new baseline silently. Alerts only fire on the second observation if something changed between the first and second baseline captures.

If you consistently get alerts after restart, it likely means the server's tool definitions genuinely differ from what was seen in the previous process lifetime.

The callback is not being called

  1. Verify on_description_change is passed to configure_session(), not to fingerprint_config:

    # Correct
    configure_session(
    session,
    on_description_change=my_callback,
    fingerprint_config={"enabled": True},
    )

    # Wrong -- on_description_change is not a fingerprint_config key
    configure_session(
    session,
    fingerprint_config={"on_description_change": my_callback},
    )
  2. Changes only fire after the first observation. The very first session in a process establishes the baseline -- no callback until a subsequent session detects a difference.

Tool calls are blocked unexpectedly

If fingerprint_config["action"] is "block" or a tool has a "block" entry in per_tool_actions, any change to that tool will raise PolicyViolationError. To investigate:

  1. Check the span for waxell.mcp.fingerprint_diff to see what changed.
  2. If the change is legitimate, pre-acknowledge it with acknowledged_hashes.
  3. If you want to allow changes while investigating, switch the action to "warn".

Paginated tool lists

If the MCP server has many tools and paginates list_tools(), only the first page is fingerprinted. A warning is logged:

MCP server my-server has paginated tool list (nextCursor=...) -- only first page fingerprinted

This is a known limitation. Most MCP servers do not paginate tool lists.

Fingerprint errors in logs

Messages like MCP fingerprint capture failed indicate an error during fingerprinting. The tool call still proceeds (fail-open). Common causes:

  • The MCP session lost its connection before list_tools() could complete
  • The server returned invalid tool definitions that could not be JSON-serialized
  • A race condition where multiple concurrent calls both tried to capture (benign -- the second capture is a no-op)