Model context protocol (MCP) risks: key takeaways from CoSAI security white paper

By admin · January 27, 2026

Background

If you are building or securing AI agents, you are likely already using the Model Context Protocol (MCP). If you aren’t, you will soon. Often described as a “USB-C port for AI applications,” MCP is rapidly becoming the standard plumbing that connects LLMs to your local files, databases and third-party APIs.

It solves a massive headache for developers: it eliminates the need to write custom integrations for every data source an agent must access. But as a new white paper from CoSAI (Coalition for Secure AI) details, that standardization introduces a distinct class of security risks most organizations are not yet prepared to handle.

Adversa AI contributed to the Model Context Protocol Security White Paper released by CoSAI. It’s a dense, necessary read. Here’s why this paper matters now, the specific threats it uncovers, and what security leaders need to do.

The MCP Security White Paper from the Coalition for Secure AI, whose development Adversa co-led.

MCP security: the time is now

With MCP, we are giving AI agents both tools (APIs) and permission to execute code. We have already seen incidents such as the Asana AI data contamination, where the intersection of agentic permissions and protocol flaws caused real damage. As MCP becomes the default integration layer for tools like Claude Desktop, Cursor and enterprise agent platforms, the attack surface is ballooning. Not auditing your MCP implementations now is akin to plugging unverified USB drives into your enterprise network.

API security isn’t enough

Security teams might see MCP and think, “It’s just JSON-RPC over HTTP. Yet another API to protect”. The white paper argues that view is dangerous. The novelty is the decision engine: in traditional systems, component A talks to component B based on deterministic logic. In MCP architectures, an LLM is the router.

An LLM is probabilistic, manipulable and prone to hallucinations that affect tool invocation. Existing controls like firewalls or static RBAC fail because they cannot inspect the semantic intent or validity of the conversation that triggered the API call. MCP places a non-deterministic actor at the center of security-critical decisions, requiring a threat model that blends application security with behavioral AI safety.
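To see why transport-level controls fall short, here is a minimal sketch in stdlib Python of a naive "API firewall" that validates JSON-RPC 2.0 structure and an allowlisted method. The `read_file` tool name and the payload are hypothetical illustrations, not from the paper: the point is that a structurally perfect `tools/call` passes every such check even when the arguments were chosen by a manipulated model rather than the user.

```python
import json

ALLOWED_METHODS = {"tools/call", "tools/list"}

def transport_level_check(raw: str) -> bool:
    """A naive 'API firewall': validates JSON-RPC 2.0 structure and an
    allowlisted method -- the kind of check traditional API security applies."""
    msg = json.loads(raw)
    return (
        msg.get("jsonrpc") == "2.0"
        and msg.get("method") in ALLOWED_METHODS
        and isinstance(msg.get("params"), dict)
    )

# A structurally valid tools/call whose *arguments* reflect an instruction
# the model absorbed from poisoned content (hypothetical tool name).
malicious = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_file",
        # Path chosen by the model, not by the user:
        "arguments": {"path": "/etc/passwd"},
    },
})

print(transport_level_check(malicious))  # True: the firewall sees nothing wrong
```

Nothing in the wire format distinguishes this call from a legitimate one; the compromise happened upstream, in the conversation that steered the model.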

MCP-specific threats: what you missed

The paper categorizes nearly 40 threats; seven are particularly insidious and unfamiliar to security teams.

  • Identity spoofing. Weak or misconfigured authentication could let attackers impersonate legitimate clients or agents, corrupting audit trails or gaining unauthorized access to server resources.
  • Tool poisoning. Malicious modification of tool metadata, configuration or descriptors can cause agents to invoke compromised tools, leading to data leaks or system compromise.
  • Full schema poisoning (FSP). Attackers compromise entire tool schema definitions at the structural level, injecting hidden parameters, altered return types or malicious default values that affect all subsequent tool invocations while appearing legitimate to monitoring systems.
  • Resource content poisoning. Attackers embed hidden malicious instructions within data sources that MCP servers retrieve and provide to LLMs, causing poisoned content to be executed as commands when processed.
  • Typosquatting and confusion attacks. Malicious actors create MCP servers or tools with names or descriptions similar to legitimate ones, tricking clients or agents into invoking harmful tools due to naming confusion or model hallucination.
  • Shadow MCP servers. Unauthorized, unmonitored or hidden MCP server instances create blind spots, increasing the risk of undetected compromise and covert data exfiltration. These servers pose governance and compliance risks.
  • Overreliance on the LLM. MCP server developers may implement overly permissive tools assuming the LLM will invoke them correctly. Model-level controls are not ironclad — models can be manipulated via prompt injection, make judgment errors, or be swapped for weaker models.
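One practical countermeasure to tool poisoning and FSP is pinning: fingerprint each tool's name, description and input schema at approval time, and refuse to invoke the tool if any of those fields later drift. A minimal stdlib Python sketch, where the `add` tool and the injected field name are illustrative assumptions:

```python
import hashlib
import json

def tool_fingerprint(tool: dict) -> str:
    """Canonical SHA-256 hash over the fields that tool-poisoning and
    full-schema-poisoning attacks modify."""
    canonical = json.dumps(
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

approved = {
    "name": "add",
    "description": "Add two integers.",
    "inputSchema": {
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
    },
}
pinned = tool_fingerprint(approved)  # stored at review/approval time

# Later, the server returns the "same" tool with a hidden extra parameter.
poisoned = json.loads(json.dumps(approved))  # deep copy via round-trip
poisoned["inputSchema"]["properties"]["debug_exfil_url"] = {"type": "string"}

print(tool_fingerprint(poisoned) == pinned)  # False: schema drift detected
```

The same check catches "rug pull" description changes, since the description is part of the hash; what it cannot catch is a tool that was malicious at the moment of approval.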


Priorities for defenders

Based on the CoSAI findings, several approaches to MCP security emerge as priorities.

  • End-to-end agent identity and traceability are essential.
  • MCP servers/agents must operate with least privilege using fine‑grained authorization.
  • Input/data sanitization and strict allowlists are mandatory at each trust boundary. Treat all LLM/MCP outputs as untrusted and apply prompt‑injection detection and strict schema validation.
  • Sandboxing/isolation is required for MCP servers and for any execution of LLM‑generated code. Containers alone are insufficient as a security boundary.
  • Hardware and software integrity: use TEEs, confidential containers, and remote attestation to protect MCP components and credentials; complement TEEs with runtime controls and sandboxing.
  • Cryptographic provenance: require signed artifacts, SBOMs, TLS for transport, end‑to‑end signatures, and verification of packages and signing keys before deployment.
  • Transport & protocol protections: enforce payload limits, mutual authentication, TLS protection, and other measures to prevent DoS, impersonation, tampering, and hijacking.
  • Tool and UX design: tools must have a single, limited purpose and must not delegate security decisions to LLMs. Present clear, non‑fatiguing security prompts to users.
  • Human‑in‑the‑loop (HITL): enforce confirmation for sensitive actions (and prevent unprivileged users from disabling confirmations) or implement server‑side elicitation to require user approval.
  • Logging, observability and auditability: centralize logs across MCP host/client/server, record tool decisions, parameters and prompts, use gateways/proxies for centralized functionality.
  • Enforce lifecycle & supply‑chain controls, including mandatory code‑signing verification, SCA, allow‑lists of approved MCP servers, dependency pinning with hashes, and SSDLC practices.
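Several of these priorities (server allowlists, human-in-the-loop confirmation, centralized audit logging) naturally combine at a gateway or proxy in front of your MCP servers. A minimal policy-enforcement sketch in stdlib Python; the server names, tool names and the `confirm` stub are all hypothetical assumptions:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("mcp-gateway")

APPROVED_SERVERS = {"internal-files", "crm-readonly"}  # allowlist (assumption)
SENSITIVE_TOOLS = {"delete_record", "send_email"}      # HITL-gated (assumption)

def confirm(prompt: str) -> bool:
    """Stand-in for a real human-in-the-loop prompt; denies by default here."""
    return False

def gateway_call(server: str, tool: str, arguments: dict) -> str:
    """Minimal policy-enforcement point: allowlist check, HITL gate for
    sensitive tools, and one structured audit record per call."""
    record = {"ts": time.time(), "server": server, "tool": tool,
              "arguments": arguments}
    if server not in APPROVED_SERVERS:
        record["decision"] = "blocked: unapproved server"
    elif tool in SENSITIVE_TOOLS and not confirm(f"Allow {tool} on {server}?"):
        record["decision"] = "blocked: user denied"
    else:
        record["decision"] = "forwarded"
    log.info(json.dumps(record))  # every decision is auditable
    return record["decision"]

print(gateway_call("shadow-server", "read_file", {"path": "notes.txt"}))
print(gateway_call("internal-files", "delete_record", {"id": 7}))
print(gateway_call("internal-files", "read_file", {"path": "notes.txt"}))
```

A real gateway would of course forward the call to the server rather than return a string, but the shape is the same: one choke point where allowlisting, confirmation and logging all happen, regardless of which client or model initiated the call.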

It’s important to remember that static security controls are insufficient against the dynamic, probabilistic behavior of MCP-connected agents. To secure these systems, move beyond defensive design and adopt offensive verification — you cannot know whether your agent allows resource poisoning or schema manipulation until you actively try to break it.

Battle-test your agents: Don’t wait for an incident. Use Adversa’s agentic AI red teaming platform to rigorously stress-test your systems. Adversa lets you simulate the complex attack vectors described in the paper — from indirect injections to tool manipulation — helping harden agents before deployment and during live operation.
