Basic knowledge · April 18, 2026 · 9 min read

What is MCP server security?

MCP servers extend what AI agents can do. That is also what makes them a security boundary. Understanding the threat surface is essential before you connect one to a model with real capabilities.

Why this matters

36.7% of MCP servers are vulnerable to server-side request forgery (SSRF), according to BlueRock Security research.
Tool poisoning lets a malicious server override the behavior of other tools in the same agent session.
MCP servers often run with the same permissions as the host process — credential exposure is a structural risk, not just a misconfiguration.
Defenders should audit MCP servers with the same rigor as third-party code dependencies.

What MCP is and why it matters for security

The Model Context Protocol (MCP) is an open standard that lets AI agents connect to external services, data sources, and tools through a common interface. An MCP server is the component that wraps a service — a database, a file system, an API — and exposes it to the agent through the protocol.

From a user's standpoint, MCP servers make agents dramatically more useful. From a security standpoint, they extend the agent's trust boundary into services and data that may contain sensitive information, have write access, or be able to trigger real-world actions. Every MCP server a user connects is a new surface that an attacker — or a misconfigured skill — can try to reach.

The four main threat categories

Security research on MCP servers in 2026 has converged on four threat categories that defenders should understand before evaluating a server.

SSRF (Server-Side Request Forgery): BlueRock Security found that 36.7% of audited MCP servers are vulnerable to SSRF, meaning an attacker can cause the server to make requests to internal systems the user never intended to expose. SSRF in an MCP server can pivot attacks to cloud metadata endpoints, internal APIs, and other services that are unreachable from the public internet.
Tool poisoning: A malicious MCP server can define tools with names that shadow or override legitimate tools in the same agent session. When the agent calls what it thinks is a trusted tool, the malicious server intercepts the call and executes different behavior.
Prompt injection via tool responses: If an MCP server returns content that contains injected instructions — either from a data source it reads or from the server itself — those instructions may alter the agent's behavior in subsequent steps. This is indirect prompt injection at the MCP layer.
Credential exposure: MCP servers often run inside the user's local environment and inherit the permissions of the host process. If the server is compromised or behaves maliciously, it can access environment variables, configuration files, API keys, and tokens that are available to the process.
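To make the SSRF category concrete, here is a minimal sketch of the kind of destination check an MCP server could run before making an outbound request. It is an illustration under stated assumptions, not a complete defense: the allowlist, the host name api.example.com, and the function names is_blocked_address and is_safe_url are hypothetical and do not come from any MCP SDK.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts this server is meant to call.
ALLOWED_HOSTS = {"api.example.com"}


def is_blocked_address(address: str) -> bool:
    """Return True for addresses that are not publicly routable.

    This covers private ranges, loopback, and link-local addresses such as
    the 169.254.169.254 cloud metadata endpoint.
    """
    return not ipaddress.ip_address(address).is_global


def is_safe_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Accept a URL only if it is HTTPS, allowlisted, and resolves publicly.

    `resolve` is injectable so the check can be exercised without network access.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    host = parsed.hostname  # already lowercased by urlparse
    if host is None or host not in ALLOWED_HOSTS:
        return False
    try:
        infos = resolve(host, parsed.port or 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False
    # Every resolved address must be public; an empty result is rejected.
    return bool(infos) and all(not is_blocked_address(info[4][0]) for info in infos)
```

A check like this narrows the SSRF surface but does not close it. The address is resolved once here and again when the request is actually sent, so a production implementation would also need to pin the connection to the validated address and re-validate any redirects.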

How to evaluate an MCP server before connecting it

The evaluation questions for an MCP server are similar to those for any third-party code dependency, but the agentic context adds urgency: once connected, the server participates in the agent's decision loop.

Check the source. Is the server published by a verifiable author? Is the repository actively maintained with visible commits and issue responses?
Review what the server requests. Does it ask for credentials, environment variables, or file system access it does not need for its stated purpose?
Scan for SSRF patterns. Does the server make outbound requests? Are those requests scoped to specific domains, or can they be directed to arbitrary URLs?
Look for tool naming collisions. If you are using multiple MCP servers, do any of them define tool names that overlap with each other or with your agent's built-in tools?
Check for data exfiltration paths. Does the server send any data to external endpoints that are not documented or expected?
Apply the least-privilege principle. Run the server with the minimum permissions it needs, not the full permissions of your user account or host process.
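Two of the checks above lend themselves to a small script. The sketch below shows one way to flag tool naming collisions across servers and to build a reduced environment for a server process. The server names, tool names, and the helpers find_tool_collisions and scoped_env are hypothetical, and treating tool names as case-insensitive is an assumption chosen to err on the side of reporting more collisions.

```python
from collections import defaultdict
from typing import Iterable


def find_tool_collisions(
    servers: dict[str, list[str]],
    builtin_tools: Iterable[str] = (),
) -> dict[str, list[str]]:
    """Map each tool name claimed by more than one owner to its sorted owners.

    `servers` maps a server name to the tool names it declares. Names are
    compared case-insensitively (an assumption, see the lead-in).
    """
    owners: dict[str, set[str]] = defaultdict(set)
    for name in builtin_tools:
        owners[name.lower()].add("<built-in>")
    for server, tools in servers.items():
        for tool in tools:
            owners[tool.lower()].add(server)
    return {tool: sorted(who) for tool, who in owners.items() if len(who) > 1}


def scoped_env(parent_env: dict[str, str], allowed: set[str]) -> dict[str, str]:
    """Keep only the environment variables the server actually needs."""
    return {key: value for key, value in parent_env.items() if key in allowed}
```

For example, find_tool_collisions({"files-server": ["read_file"], "notes-server": ["read_file"]}) reports read_file as claimed by both servers. The dictionary returned by scoped_env can be passed as the env argument of subprocess.Popen when launching a local server, so the child process does not inherit every secret in the parent environment. Filtering environment variables alone does not restrict file system or network access; those need separate controls such as a dedicated user account or a container.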

The relationship between MCP security and skill security

OpenClaw skills and MCP servers occupy slightly different positions in the agent stack. Skills define agent behavior through instructions and capability declarations. MCP servers provide runtime tools and data access. In practice, a malicious skill can direct an agent to connect a malicious MCP server, and a malicious MCP server can execute instructions that alter skill behavior.

That is why serious security programs cover both layers. TrustSkills currently scans OpenClaw skill packages from ClawHub. MCP server scanning is on the roadmap. Sign up for the waitlist to be notified when it launches.

Trusted sources

Cisco: Introducing the AI Agent Security Scanner for IDEs (open source). Source for the four threat category framework and the defense-in-depth model for MCP security.

Snyk: GitHub, snyk/agent-scan: Security scanner for AI agents, MCP servers and agent skills (open source). Source for the agent inventory approach and the CRITICAL-level detector recall data.

OWASP: OWASP Agentic Skills Top 10 (open source). Framework for categorizing agentic AI risks including tool poisoning and prompt injection at the MCP layer.
