MCP Server Security: What You Need to Know

The worst case: prompt injection tricks your agent into handing over its own credentials. Attackers bypass the AI entirely and access your systems with the agent's full authority.

Anthropic's Model Context Protocol (MCP) gives AI agents a standardized way to connect with APIs and external systems. The appeal is clear: faster integration, more capable agents, less custom code per API. But MCP servers introduce three categories of security problems that need your attention now.

First, most available MCP servers weren't built for production environments. Second, these servers often get permissions far beyond what they need. Third, they create new ways for attackers to compromise your systems through the AI agent itself.

The path forward requires three things: strict access controls that limit what agents can do, human verification for risky actions, and careful vetting of where you source MCP servers. This isn't bleeding-edge technology that gets special treatment—it's software that touches your data, and it needs the same security scrutiny as everything else.

How MCP Works

MCP standardizes how AI agents interact with APIs. Instead of developers writing custom code to interpret API documentation and make calls, the protocol presents API descriptions and arguments in a format that AI agents—already trained on programming patterns—can understand and use directly.

As Box CTO Ben puts it, "MCP is a great way to standardize the way that you integrate different systems together." Think of it as giving agents a common language for talking to different tools and systems.
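To make the "API descriptions and arguments" idea concrete, here is a minimal sketch of what a tool description can look like: a name, a plain-language description, and a JSON Schema for the arguments. The tool name `list_urgent_emails`, its `max_results` parameter, and the helper `missing_arguments` are hypothetical and used only for illustration.

```python
import json

# Hypothetical tool description: a name, a human-readable description,
# and a JSON Schema describing the arguments the tool accepts.
urgent_email_tool = {
    "name": "list_urgent_emails",
    "description": "Return unread emails flagged as urgent.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "max_results": {"type": "integer", "minimum": 1, "maximum": 50},
        },
        "required": ["max_results"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return the required argument names absent from a proposed call."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]


# The agent reads the description, then proposes a call with arguments.
print(json.dumps(urgent_email_tool, indent=2))
print(missing_arguments(urgent_email_tool, {}))  # ['max_results']
```

Everything the agent knows about the tool comes from this description, which is also why a malicious or careless server can mislead it.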

Three Security Problems

Most MCP Servers Aren't Production-Ready

The protocol's rapid adoption means MCP servers come from everywhere: individual developers, small startups, research projects. That variety brings innovation, but many implementations skip basics:

Authentication gaps. Some servers use weak authentication. Others skip it entirely.

Poor infrastructure. Servers get deployed on systems configured without basic security hardening, creating openings for injection attacks and standard exploits.

Malicious servers. Attackers create fake MCP servers designed to look legitimate. When an agent connects, these servers can execute arbitrary code and exfiltrate data. As Ben notes, "some attackers would go out of their way to expose fake MCP servers who would actually go and steal your data."

Permission Creep

MCP servers often grant agents far more access than their actual function requires. An agent built to notify you about urgent emails might receive:

  • Access to your entire email archive, calendar, and file storage
  • Ability to send messages to anyone, not just you

Ben compares this to an app demanding credentials to your entire computer just to complete a simple task. That "triggers this sense of unease, which is very appropriate, especially when you're thinking about data security."
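One way to spot this kind of over-granting is to compare what an agent was given against what its task needs. The sketch below assumes permissions can be expressed as simple scope strings. The scope names (`mail.read`, `mail.send`, and so on) are made up for the example and do not come from any particular platform.

```python
def excess_scopes(granted: set[str], needed: set[str]) -> set[str]:
    """Return scopes the agent holds but its task does not require."""
    return granted - needed


# Hypothetical scopes for the urgent-email notifier described above.
granted = {"mail.read", "mail.send", "calendar.read", "files.read"}
needed = {"mail.read"}

print(sorted(excess_scopes(granted, needed)))
# → ['calendar.read', 'files.read', 'mail.send']
```

Anything in the result is access an attacker would inherit if the agent were compromised.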

New Attack Vectors Through Agents

MCP amplifies the damage from attacks targeting the AI agent itself. The vulnerability exists in the agent, but the MCP server's capabilities determine how much damage an attacker can do.

Prompt injection and data poisoning. An attacker embeds malicious instructions in content the agent processes—an email, document, or chat message. The poisoned input tricks the agent into executing the attacker's commands.

Data theft. A compromised agent can query internal systems for sensitive information, such as customer records from your CRM or employee data from your HRIS, and send it directly to the attacker.

Credential exfiltration. In the worst case, the agent can be instructed to hand over its own authentication tokens. The attacker then bypasses the agent completely and accesses your systems directly. As Ben explains, the agent "can actually send the attacker its credentials like a token or some other authorization. So then that person will just use that sort of compromised authentication to go execute it directly."
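As a rough illustration of one defensive check against credential exfiltration, the sketch below scans the arguments of an outgoing tool call for the agent's own secrets before the call is sent. The function names, the token value, and the argument layout are assumptions for the example. It only catches verbatim matches, so an encoded or split token would slip through, and it should be treated as one layer rather than a complete defense.

```python
from typing import Any, Iterator


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found in a nested dict/list structure."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from iter_strings(key)
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def leaks_secret(arguments: dict, secrets: list[str]) -> bool:
    """True if any known secret appears verbatim in the outgoing arguments."""
    return any(
        secret in text
        for text in iter_strings(arguments)
        for secret in secrets
    )


# Hypothetical token value and tool arguments, for illustration only.
AGENT_TOKEN = "tok_example_123"
outgoing = {"to": "attacker@evil.example", "body": f"here you go: {AGENT_TOKEN}"}

print(leaks_secret(outgoing, [AGENT_TOKEN]))  # True
```

A safer design keeps tokens out of the model's context altogether, so there is nothing for an injected instruction to copy.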

What to Do

Limit Access Strictly

Grant each agent only the specific capabilities it needs for its designated task. Nothing more.

Configure MCP servers to expose minimal functionality. If your platform allows granular controls, use them. As Ben says: "don't expose the MCP server capabilities unless you want your MCP client to be able to do those things."
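If your platform does not offer granular controls, a client-side allowlist is one way to approximate them. This is a minimal sketch, assuming the client receives a list of advertised tools as dictionaries with a `name` field. The agent name `notifier` and the tool names are hypothetical.

```python
# Hypothetical per-agent allowlist: anything not listed is dropped.
ALLOWED_TOOLS: dict[str, set[str]] = {
    "notifier": {"list_urgent_emails"},
}


def filter_tools(agent: str, advertised: list[dict]) -> list[dict]:
    """Keep only the tools this agent is explicitly allowed to use."""
    allowed = ALLOWED_TOOLS.get(agent, set())  # unknown agents get nothing
    return [tool for tool in advertised if tool["name"] in allowed]


advertised = [
    {"name": "list_urgent_emails"},
    {"name": "send_email"},
    {"name": "delete_email"},
]

print([tool["name"] for tool in filter_tools("notifier", advertised)])
# → ['list_urgent_emails']
```

The default is deny: a new tool added on the server side stays invisible to the agent until someone adds it to the allowlist.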

Add Human Verification

For sensitive operations—deleting data, sending communications, executing code—require explicit human approval before the action completes. The agent should recognize these situations and prompt the user for confirmation. This prevents a compromised agent from carrying out malicious instructions autonomously.
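The approval step can be sketched as a small gate that sits between the agent and the code that performs the action. The action names and the `approve` and `execute` callables below are placeholders. In a real client, `approve` would show the user the pending action and wait for a decision.

```python
from typing import Any, Callable

# Hypothetical names for actions that must never run without a human decision.
SENSITIVE_ACTIONS = {"delete_data", "send_message", "execute_code"}


def run_action(
    name: str,
    arguments: dict,
    execute: Callable[[str, dict], Any],
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run an action, asking for human approval first if it is sensitive."""
    if name in SENSITIVE_ACTIONS and not approve(name, arguments):
        return {"status": "denied", "action": name}
    return {"status": "done", "action": name, "result": execute(name, arguments)}


def fake_execute(name: str, arguments: dict) -> str:
    """Stand-in for the real tool call."""
    return f"ran {name}"


# A reviewer who rejects everything: the delete never reaches execute().
print(run_action("delete_data", {"id": 7}, fake_execute, approve=lambda n, a: False))
# Non-sensitive actions skip the prompt entirely.
print(run_action("list_urgent_emails", {}, fake_execute, approve=lambda n, a: False))
```

Putting the check in ordinary code, rather than relying only on the agent to ask, means a prompt-injected agent cannot talk its way past it.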

Vet Your Sources

Source MCP servers from organizations with demonstrated security competence. Look for:

  • Track record in securing production systems
  • Active maintenance and vulnerability management
  • Security practices that evolve with new threats

Security isn't static. You need a provider committed to adapting their protections over time. Ben emphasizes choosing an organization that "has the ability to make sure that's secure. Not just now, but continue to evolve the security over time."
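Once you have vetted a provider, it helps to enforce that decision in code so agents cannot connect to a lookalike server. A minimal sketch, assuming servers are reached over HTTPS URLs; the host `mcp.vendor.example` is a placeholder for whichever providers you have approved.

```python
from urllib.parse import urlsplit

# Hypothetical list of hosts that passed your vetting process.
APPROVED_HOSTS = {"mcp.vendor.example"}


def is_approved_server(url: str) -> bool:
    """Allow only HTTPS connections to an exactly matching approved host."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in APPROVED_HOSTS


print(is_approved_server("https://mcp.vendor.example/sse"))           # True
print(is_approved_server("http://mcp.vendor.example/sse"))            # False
print(is_approved_server("https://mcp.vendor.example.evil.test/sse")) # False
```

Matching the full hostname, rather than a prefix or substring, is what rejects the third URL, which is the kind of lookalike address a fake server would use.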

Bottom Line

MCP offers real advantages for AI integration. But as Ben summarizes: "The TLDR is just because it's new and interesting and powerful software doesn't mean it's not software to access your data. So you need to treat it accordingly so that you understand it and use it appropriately."

New technology, standard security requirements.