Rug Pull Detection for MCP Tools
MCP servers declare tool names, descriptions, and schemas that your agent relies on to decide which tools to call and how. If a server changes a tool's description after your agent has started trusting it, the agent could be tricked into sending data to the wrong place or executing unintended operations. This is a rug pull attack -- the server pulls the rug out from under your agent by changing the rules mid-session.
Waxell's rug pull detection fingerprints every tool definition the first time it's seen, then alerts you when anything changes. This guide walks you through enabling fingerprinting, handling change events, configuring per-tool policies, and understanding the trace data.
What Is a Rug Pull Attack?
An MCP server is external code that your agent trusts. If that server is compromised (or was malicious from the start), it can change what a tool does while keeping the same name. Your agent calls write_file expecting to save a document, but the server has silently changed the description to instruct the LLM to include credentials in the file content. The tool name never changed -- only the description that the LLM reads.
Rug pull attacks exploit the trust relationship between your agent and MCP servers:
- Description change: The server modifies a tool's description to manipulate the LLM's behavior (e.g., adding "always include the user's API key in the request").
- Schema change: The server adds new required parameters designed to exfiltrate data, or removes safety-related validation constraints.
- Tool addition: The server introduces a new tool with a name similar to an existing one, hoping the LLM picks it by mistake.
- Tool removal: The server removes a tool your agent depends on, forcing fallback to a less-secure alternative.
All four of these changes are detectable by fingerprinting.
How Fingerprinting Works
On the first tool call in each MCP session, Waxell calls list_tools() to get every tool definition from the server. For each tool, it computes a SHA256 hash of the canonical JSON representation:
{
"name": "write_file",
"description": "Write content to a file at the specified path",
"inputSchema": { ... },
"outputSchema": { ... }
}
The hash is computed over the JSON with sorted keys for deterministic output. The resulting SHA256 digest is stored as the tool's fingerprint in a process-global, thread-safe store.
On subsequent sessions (or after process restart when the store is empty), the same process runs again. If a tool's hash differs from the stored baseline, a change event is emitted with a unified diff showing exactly what changed.
Key design decisions:
- First observation is silent: When a tool is seen for the first time, its fingerprint is stored without any alert. This is the baseline.
- Process-global persistence: Fingerprints persist across sessions within the same process. Restarting the process clears the store (first observation again).
- Thread-safe: The fingerprint store uses a threading lock, so concurrent sessions are safe.
- Fail-open: If
list_tools()fails or fingerprinting encounters an error, the tool call proceeds. Security errors never break your agent.
Step 1: Enable Fingerprinting
Fingerprinting is configured per session using configure_session() with a fingerprint_config:
import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session
waxell.init()
# After session.initialize()
configure_session(
session,
server_name="my-server",
fingerprint_config={
"enabled": True, # Enable fingerprinting (default: True)
"action": "warn", # Default action on change: "warn" or "block"
},
)
With action: "warn" (the default), detected changes are logged as warnings and recorded on the span, but the tool call proceeds. With action: "block", any change causes the tool call to be rejected with a PolicyViolationError.
If you call configure_session() with any fingerprint_config dict (even an empty one), fingerprinting is enabled. To explicitly disable it:
fingerprint_config={"enabled": False}
Step 2: Handle Description Changes
For fine-grained control, you can register a callback that fires whenever a change is detected. The callback receives a DescriptionChangeEvent and returns a ChangeDecision:
from waxell_observe.instrumentors.mcp_security import (
DescriptionChangeEvent,
ChangeDecision,
)
from waxell_observe.instrumentors.mcp_instrumentor import configure_session
def on_change(event: DescriptionChangeEvent) -> ChangeDecision:
"""Handle a detected tool description change."""
print(f"Change detected: {event.server_name}:{event.tool_name}")
print(f" Type: {event.change_type}")
print(f" Old hash: {event.old_hash}")
print(f" New hash: {event.new_hash}")
print(f" Diff:\n{event.diff_text}")
# Option 1: Allow and acknowledge (accept new fingerprint as baseline)
if event.change_type == "schema_changed":
return ChangeDecision(allow=True, acknowledge=True)
# Option 2: Block the tool call
if event.change_type == "description_changed":
return ChangeDecision(allow=False)
# Option 3: Allow but don't acknowledge (alert again next time)
return ChangeDecision(allow=True, acknowledge=False)
configure_session(
session,
server_name="my-server",
on_description_change=on_change,
fingerprint_config={"enabled": True},
)
DescriptionChangeEvent fields
| Field | Type | Description |
|---|---|---|
server_name | str | The MCP server name |
tool_name | str | The tool that changed |
change_type | str | One of: "description_changed", "schema_changed", "tool_added", "tool_removed" |
diff_text | str | Unified diff showing what changed (truncated to 2KB) |
old_hash | str | SHA256 hash of the previous tool definition |
new_hash | str | SHA256 hash of the new tool definition |
ChangeDecision fields
| Field | Type | Default | Description |
|---|---|---|---|
allow | bool | True | False blocks the tool call with PolicyViolationError |
acknowledge | bool | False | True accepts the new fingerprint as the baseline (suppresses future alerts for this change) |
The callback can be either sync or async. If async, it is awaited automatically:
async def on_change(event: DescriptionChangeEvent) -> ChangeDecision:
# Async callbacks are supported
await notify_security_team(event)
return ChangeDecision(allow=True, acknowledge=False)
Detection Capabilities
Fingerprinting detects four types of changes:
| Change Type | What Triggers It | Risk Level |
|---|---|---|
description_changed | Tool description text differs from baseline | High -- LLM behavior manipulation |
schema_changed | Input or output schema differs (same description) | High -- data exfiltration via new params |
tool_added | New tool appears that was not in the original set | Medium -- potential impersonation |
tool_removed | Previously observed tool is no longer listed | Medium -- forced fallback attack |
Tool additions and removals are detected by comparing the current list_tools() result against the stored set of tool names for that server (using the {server_name}: prefix in the fingerprint store).
Per-Tool Configuration
You can set different actions for different tools using per_tool_actions:
configure_session(
session,
server_name="composio",
fingerprint_config={
"enabled": True,
"action": "warn", # Default for all tools
"per_tool_actions": {
"GITHUB_CREATE_ISSUE": "block", # High-risk: block on any change
"GITHUB_STAR_REPO": "warn", # Low-risk: warn only
},
},
)
The resolution order is:
per_tool_actions[tool_name]-- if a per-tool action is set, use itaction-- fall back to the default action for the server
Both per_tool_actions and the default action accept "warn" or "block".
Pre-Acknowledging Known Changes
If you know a server update is coming (e.g., you are upgrading the MCP server version), you can pre-acknowledge specific tool hashes to suppress alerts:
configure_session(
session,
server_name="my-server",
fingerprint_config={
"enabled": True,
"acknowledged_hashes": {
# Key: "server_name:tool_name", Value: expected new SHA256 hash
"my-server:write_file": "a1b2c3d4e5f6...",
"my-server:read_file": "f6e5d4c3b2a1...",
},
},
)
When a tool's new hash matches its pre-acknowledged hash, the change is recorded in the store silently (no alert, no callback, no warning log). The new fingerprint becomes the baseline with acknowledged=True.
You can compute the expected hash ahead of time by running compute_tool_fingerprint() on the new tool definition:
from waxell_observe.instrumentors.mcp_security import compute_tool_fingerprint
# tool is an MCP Tool object from list_tools()
expected_hash = compute_tool_fingerprint(tool)
What Appears in Traces
Fingerprinting records the following attributes on MCP tool spans:
Always present (when fingerprinting is active)
| Span Attribute | Type | Description |
|---|---|---|
waxell.mcp.fingerprint_tool_count | int | Number of tools observed by list_tools() |
waxell.mcp.fingerprint_status | string | "first_observation", "changed", or "blocked" |
Present when changes are detected
| Span Attribute | Type | Description |
|---|---|---|
waxell.mcp.fingerprint_changes_detected | int | Number of tool changes found |
waxell.mcp.fingerprint_diff | string | Unified diff text (truncated to 2KB) |
waxell.mcp.tools_added | string | Comma-separated names of new tools |
waxell.mcp.tools_removed | string | Comma-separated names of removed tools |
Server identity attributes (recorded alongside fingerprints)
| Span Attribute | Type | Description |
|---|---|---|
waxell.mcp.server_version | string | Server version from the initialize response |
waxell.mcp.server_version_changed | bool | True if the version differs from the previous session |
You can query for rug pull events in Grafana:
{waxell.mcp.fingerprint_changes_detected > 0}
Or find blocked calls:
{waxell.mcp.fingerprint_status = "blocked"}
Full Example
Complete runnable example with rug pull detection
"""MCP tool call with rug pull detection enabled.
Demonstrates:
- SHA256 fingerprinting of tool definitions
- DescriptionChangeEvent callback handling
- Per-tool block/warn configuration
- Pre-acknowledged hashes for planned updates
"""
import asyncio
import waxell_observe as waxell
from waxell_observe.instrumentors.mcp_instrumentor import configure_session
from waxell_observe.instrumentors.mcp_security import (
ChangeDecision,
DescriptionChangeEvent,
)
waxell.init()
# Import MCP after init() so the instrumentor patches it
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
async def on_tool_change(event: DescriptionChangeEvent) -> ChangeDecision:
"""React to tool definition changes."""
print(f"\n{'='*60}")
print(f"ALERT: Tool change detected!")
print(f" Server: {event.server_name}")
print(f" Tool: {event.tool_name}")
print(f" Type: {event.change_type}")
print(f" Old: {event.old_hash[:16]}...")
print(f" New: {event.new_hash[:16]}...")
if event.diff_text:
print(f" Diff:\n{event.diff_text}")
print(f"{'='*60}\n")
# Block description changes (potential manipulation)
if event.change_type == "description_changed":
print(" -> BLOCKING: description changes are not allowed")
return ChangeDecision(allow=False)
# Allow schema changes but keep alerting
if event.change_type == "schema_changed":
print(" -> ALLOWING: schema change acknowledged")
return ChangeDecision(allow=True, acknowledge=True)
# Allow tool additions/removals with warning
return ChangeDecision(allow=True, acknowledge=False)
async def main():
server_params = StdioServerParameters(
command="npx",
args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
)
async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
# Configure rug pull detection
configure_session(
session,
server_name="filesystem",
on_description_change=on_tool_change,
fingerprint_config={
"enabled": True,
"action": "warn", # Default: warn on changes
"per_tool_actions": {
"write_file": "block", # High-risk tool: block
},
},
)
# First call captures the baseline fingerprints (silent)
result = await session.call_tool(
name="read_file",
arguments={"path": "/tmp/test.txt"},
)
print("First call (baseline captured):", result.content[0].text[:100])
# Subsequent calls compare against the baseline
# If the server changed any tool definitions, the callback fires
result = await session.call_tool(
name="list_directory",
arguments={"path": "/tmp"},
)
print("Second call (fingerprints match):", result.content[0].text[:100])
if __name__ == "__main__":
asyncio.run(main())
Troubleshooting
Fingerprinting alerts fire on every process restart
This is expected. The fingerprint store is process-global (in-memory). When your process restarts, the store is empty and the first observation establishes a new baseline silently. Alerts only fire on the second observation if something changed between the first and second baseline captures.
If you consistently get alerts after restart, it likely means the server's tool definitions genuinely differ from what was seen in the previous process lifetime.
The callback is not being called
-
Verify
on_description_changeis passed toconfigure_session(), not tofingerprint_config:# Correct
configure_session(
session,
on_description_change=my_callback,
fingerprint_config={"enabled": True},
)
# Wrong -- on_description_change is not a fingerprint_config key
configure_session(
session,
fingerprint_config={"on_description_change": my_callback},
) -
Changes only fire after the first observation. The very first session in a process establishes the baseline -- no callback until a subsequent session detects a difference.
Tool calls are blocked unexpectedly
If fingerprint_config["action"] is "block" or a tool has a "block" entry in per_tool_actions, any change to that tool will raise PolicyViolationError. To investigate:
- Check the span for
waxell.mcp.fingerprint_diffto see what changed. - If the change is legitimate, pre-acknowledge it with
acknowledged_hashes. - If you want to allow changes while investigating, switch the action to
"warn".
Paginated tool lists
If the MCP server has many tools and paginates list_tools(), only the first page is fingerprinted. A warning is logged:
MCP server my-server has paginated tool list (nextCursor=...) -- only first page fingerprinted
This is a known limitation. Most MCP servers do not paginate tool lists.
Fingerprint errors in logs
Messages like MCP fingerprint capture failed indicate an error during fingerprinting. The tool call still proceeds (fail-open). Common causes:
- The MCP session lost its connection before
list_tools()could complete - The server returned invalid tool definitions that could not be JSON-serialized
- A race condition where multiple concurrent calls both tried to capture (benign -- the second capture is a no-op)