Researchers Reveal How MCP Prompt Injection Can Be Weaponized for Both Cyberattacks and Defense

jinia
By -


 As artificial intelligence (AI) technologies advance rapidly, cybersecurity researchers are uncovering new vulnerabilities that could both threaten and fortify AI-driven systems. A new report from Tenable sheds light on how Model Context Protocol (MCP) prompt injection attacks can be leveraged not only for malicious purposes but also to enhance cybersecurity tools and defenses.


What Is MCP and Why It Matters

Introduced by Anthropic in November 2024, MCP is an open framework designed to link Large Language Models (LLMs) like Claude with external tools and services. By following a client-server architecture, MCP empowers LLMs to query, access, and interact with data sources via model-controlled tools—boosting their performance, relevance, and real-world utility.


Tools built on MCP can be accessed through clients like Claude Desktop or Cursor, communicating with multiple MCP servers that expose different functionalities. This powerful interconnectivity, however, also opens up new threat vectors.


The Dark Side: Prompt Injection, Tool Poisoning, and Rug Pull Attacks

While MCP offers a streamlined method for integrating tools and switching between LLM providers, it introduces several AI-specific vulnerabilities, including:


Prompt Injection Attacks: Malicious actors can craft hidden instructions within user prompts. For instance, if an MCP tool is integrated with Gmail, attackers could send covert messages that instruct the LLM to forward emails without user consent.

Tool Poisoning: Malicious code can be embedded within tool descriptions, influencing how LLMs execute tasks.
Rug Pull Attacks: A tool may behave normally during initial use but later changes behavior via a time-delayed update, enabling stealthy data theft or system compromise.


Additionally, cross-tool contamination and server shadowing allow one MCP server to override or interfere with another, silently redirecting tasks and exposing systems to unauthorized access.


MCP Prompt Injection: A Double-Edged Sword

Despite these risks, researchers also demonstrated how MCP prompt injection techniques can be harnessed for good. A proof-of-concept tool logs all MCP tool function calls by embedding a specially crafted description that instructs the LLM to insert it before executing any other tool.


This technique logs critical information such as:

1. MCP server and tool names
2. Tool descriptions
3. User prompts that triggered tool execution


In another use case, prompt injection can transform a tool into an AI firewall that blocks unauthorized tools from running—effectively acting as a gatekeeper inside the MCP framework.

 

"Tools should require explicit approval before running in most MCP host applications," said Ben Smith, a cybersecurity researcher.

 

"But because LLMs are non-deterministic, these techniques rely on clever manipulation of tool descriptions and return values to enforce rules or deceive models."


A2A Protocol Vulnerability: A New Attack Surface Emerges

Alongside MCP flaws, researchers at Trustwave SpiderLabs disclosed weaknesses in the Agent2Agent (A2A) Protocol, introduced by Google in April 2025. A2A enables secure and seamless communication between AI agents—across vendors and platforms.


However, attackers could exploit this feature by masquerading as a high-capability agent. By exaggerating attributes on its Agent Card, a rogue AI agent could:

1. Receive all task assignments
2. Access and parse sensitive user data
3. Return manipulated results, tricking downstream LLMs and users

 

“If you control one node and manipulate its credentials, you can hijack the communication and turn the agent into a data siphon,” warned Tom Neaves, a security expert.


Conclusion: The Urgent Need for Secure AI Protocols

The latest findings emphasize that AI protocol vulnerabilities like MCP and A2A are not just technical oversights—they're critical attack vectors that can be abused or harnessed, depending on the intent. As AI integration becomes increasingly widespread in productivity tools, security platforms, and communication apps, it’s essential to:


1. Enforce explicit tool permissions
2. Monitor tool behaviors over time
3. Implement robust LLM sandboxing techniques
4.Develop zero-trust mechanisms for inter-agent communication


These revelations serve as a wake-up call for developers, researchers, and businesses leveraging AI at scale: security must evolve as fast as the intelligence it protects.