Claude's Extension Ecosystem Has a Confused Deputy Problem

A Zero-Click Takeover, No Assembly Required

Researchers at Koi Security disclosed a vulnerability this week in Anthropic's Claude Chrome Extension that should make every enterprise security team pause. The attack chain combined two flaws: an overly permissive origin allowlist that accepted messages from any *.claude.ai subdomain, and a DOM-based XSS vulnerability in an Arkose Labs CAPTCHA component hosted on a-cdn.claude.ai.

The result: any website could silently inject prompts into Claude as if the user typed them. Zero clicks. Zero permission prompts. The victim sees nothing.

"No clicks, no permission prompts. Just visit a page, and an attacker completely controls your browser," wrote Oren Yomtov, the Koi Security researcher who discovered the flaw. The potential impact included stealing access tokens, accessing conversation history, and impersonating the victim to send emails or request confidential data.

Anthropic patched the extension to enforce strict origin matching, and Arkose Labs fixed the XSS vulnerability on February 19, 2026. Responsible disclosure started on December 27, 2025. That is a reasonable timeline, and the fix was appropriate.

But this is not a story about one vulnerability. It is about a pattern.

Four Teams, Four Critical Flaws, Three Months

Between December 2025 and March 2026, four independent security research teams published findings on critical vulnerabilities in Claude's extension ecosystem. Each team found a different flaw. Each attack vector was distinct. But the underlying architecture that made them possible was identical.

Koi Security (December 2025): Chrome Extension XSS Chain. The zero-click prompt injection described above. Attacker embeds a vulnerable Arkose Labs component in a hidden iframe, sends an XSS payload via postMessage, and the injected script fires prompts directly to the extension.

Zenity Labs (January 2026): Persistent Auth Exploitation. Researchers demonstrated that the Claude Chrome Extension maintains persistent authentication to enterprise tools like Google Drive, Slack, and internal systems. Through indirect prompt injection via malicious web content, Claude could be turned into what they called "XSS-as-a-service," executing JavaScript and performing actions under the user's identity.

LayerX (February 2026): Calendar Event RCE. This one is the most alarming. Researchers found that a single malicious Google Calendar event could trigger arbitrary code execution through Claude's Desktop Extensions (DXT). The attack exploited how Claude autonomously chains MCP connectors: data flows freely from low-risk sources like Google Calendar into high-privilege local executors without consent checks. CVSS score: 10.0. Affected users: 10,000+. Affected extensions: 50 in Anthropic's marketplace.

Koi Security (March 2026): Desktop Extension Command Injection. The same team that found the Chrome flaw had previously discovered that three official Anthropic Desktop extensions (Chrome, iMessage, and Apple Notes connectors) passed user input directly into AppleScript commands without sanitization. CVSS score: 8.9. Affected downloads: 350,000+. Claude Desktop extensions run fully unsandboxed, with full system permissions.

Four teams. Four critical findings. Three months. This is not a coincidence; it is a signal.

The Confused Deputy Problem

In computer security, a "confused deputy" is a program that is tricked into misusing its authority by a less-privileged entity. The classic example is a compiler service with write access to system directories that gets tricked into overwriting a critical file.

Claude's extension architecture is a confused deputy by design.

The extensions consume untrusted content: web pages, calendar events, email, documents. They also hold privileged access: browser control, file system access, enterprise application credentials, shell command execution. The model sits between these two worlds, interpreting instructions from both the user and the content it reads. When a malicious instruction arrives embedded in a web page or calendar event, the model cannot reliably distinguish it from a legitimate user request. It processes the instruction and executes it with all the privileges the extension provides.

This is not a bug in any specific extension. It is the fundamental architecture of every AI agent that reads untrusted input and takes actions on the user's behalf. The prompt injection problem is well-documented, but the extension ecosystem amplifies it by orders of magnitude. Each extension adds new input surfaces (web pages, calendars, messages) and new action capabilities (file access, code execution, API calls). The attack surface grows multiplicatively.

I wrote about a related pattern when Claude gained desktop control: giving an AI model the ability to control your computer while it processes untrusted content creates an architectural conflict that no amount of prompt engineering can resolve. The extension ecosystem takes that same conflict and distributes it across dozens of third-party components, each with its own trust assumptions and privilege levels.

Anthropic's Inconsistent Response

Here is where the pattern gets interesting. Anthropic's responses to these four findings were not uniform.

The Chrome Extension XSS chain (Koi Security): Fixed. Strict origin matching deployed.

The Desktop Extension command injection (Koi Security, CVSS 8.9): Fixed. Patched in version 0.1.9, August 2025.

The Calendar Event RCE via DXT/MCP (LayerX, CVSS 10.0): Declined to fix. Anthropic stated the flaw "falls outside our current threat model."

The persistent auth exploitation (Zenity Labs): Partial mitigation. Policy-based "Ask before acting" option added.

Anthropic fixed a CVSS 8.9 vulnerability but declined to fix a CVSS 10.0 vulnerability. The distinction appears to be surface-level: the 8.9 was a straightforward code fix (sanitize inputs), while the 10.0 is architectural (MCP connectors chain without trust boundaries). Fixing it would require rethinking how extensions interact, how data flows between connectors, and whether an AI model should be allowed to autonomously route information from low-trust to high-trust contexts.

That is a harder problem. But "harder" is not the same as "outside our threat model."

The MCP protocol is becoming the industry standard for AI tool integration. OpenAI, Google, and others are adopting it. If the protocol itself lacks trust boundary enforcement between connectors, every MCP implementation will inherit these same architectural vulnerabilities. Anthropic is not just declining to fix a bug in their product; they are signaling that the entire agent extension model's most fundamental security flaw is someone else's problem.

The Supply Chain Angle

There is another dimension worth examining. The Chrome extension vulnerability existed because a third-party CAPTCHA component from Arkose Labs was hosted on an Anthropic subdomain (a-cdn.claude.ai). Anthropic's own origin allowlist trusted all *.claude.ai subdomains. So a vulnerability in a vendor's code, hosted on Anthropic's infrastructure, became a direct attack vector against every Claude Chrome Extension user.

This is a textbook supply chain security failure. Anthropic's security perimeter was only as strong as every third-party component running on their subdomains. The fix (restricting the allowlist to exact domain matches) was correct, but it raises the question: how many other third-party components run on claude.ai subdomains? What is the vendor security review process for code that gets hosted inside Anthropic's trust boundary?

What Enterprises Should Do

If your organization has employees using Claude extensions, here is what matters now.

Inventory your exposure. Most security teams have no visibility into which employees are running AI extensions, or which extensions those are. The Claude Chrome Extension maintains persistent authentication to whatever enterprise tools the user has connected. That means a single compromised extension session could cascade into lateral movement across Google Drive, Slack, email, and internal applications. This is exactly the kind of shadow AI risk that traditional security controls miss.

Treat AI agents as privileged identities. An extension that can navigate your browser, read your credentials, and send emails on your behalf is not a productivity tool; it is an autonomous agent that can become an insider threat operating with the user's full identity. Apply the same controls you would to a service account: least privilege, session limits, activity monitoring, and anomaly detection. The identity crisis in AI agents is no longer theoretical.

Monitor the MCP ecosystem. If you are evaluating or deploying MCP-based tool integrations, understand that trust boundary enforcement between connectors is currently the responsibility of each individual implementation. The protocol itself does not prevent a low-trust data source from feeding instructions to a high-privilege executor. Until that changes, every MCP deployment should be assessed as a potential lateral movement vector.

Do not assume vendor patches are sufficient. Anthropic fixed three of the four reported vulnerabilities. The fourth, and most severe, was declined because it requires an architectural change. Your security posture cannot depend on a vendor's willingness to address fundamental design issues. Defense in depth means assuming the extension will be compromised and building detection and containment around that assumption.

The Bigger Picture

The Claude extension ecosystem is a preview of what happens when AI agents become the primary interface between users and their digital environments. Every agent that reads untrusted content and takes privileged actions will face the confused deputy problem. Every extension marketplace will become an attack surface multiplier. Every MCP connector that chains low-trust inputs to high-trust executors will create exploitation opportunities.

Anthropic is not uniquely at fault here. They are just the first to ship a mature enough extension ecosystem for security researchers to thoroughly probe. OpenAI, Google, and every other company building agent platforms will face identical findings as their ecosystems mature.

The question is whether the industry will treat these as individual bugs to patch, or as evidence that the entire agent-extension architecture needs security-first redesign. Based on the first three months of 2026, the industry is choosing patches. History suggests that is not going to be sufficient.

A Zero-Click Takeover, No Assembly Required

The result: any website could silently inject prompts into Claude as if the user typed them. Zero clicks. Zero permission prompts. The victim sees nothing.

But this is not a story about one vulnerability. It is about a pattern.

Four Teams, Four Critical Flaws, Three Months

Four teams. Four critical findings. Three months. This is not a coincidence; it is a signal.

The Confused Deputy Problem

Claude's extension architecture is a confused deputy by design.

Anthropic's Inconsistent Response

Here is where the pattern gets interesting. Anthropic's responses to these four findings were not uniform.

The Chrome Extension XSS chain (Koi Security): Fixed. Strict origin matching deployed.

The Desktop Extension command injection (Koi Security, CVSS 8.9): Fixed. Patched in version 0.1.9, August 2025.

The Calendar Event RCE via DXT/MCP (LayerX, CVSS 10.0): Declined to fix. Anthropic stated the flaw "falls outside our current threat model."

The persistent auth exploitation (Zenity Labs): Partial mitigation. Policy-based "Ask before acting" option added.

That is a harder problem. But "harder" is not the same as "outside our threat model."

The Supply Chain Angle

What Enterprises Should Do

If your organization has employees using Claude extensions, here is what matters now.

Claude's Extension Ecosystem Has a Confused Deputy Problem

A Zero-Click Takeover, No Assembly Required

Four Teams, Four Critical Flaws, Three Months

The Confused Deputy Problem

Anthropic's Inconsistent Response

The Supply Chain Angle

What Enterprises Should Do

The Bigger Picture

Keep Reading

Vercel's Breach Ran Through an AI Tool Nobody Scoped

An AI Agent Breached McKinsey in Two Hours. The 46 Million Messages Aren't the Scary Part.

Instagram's Meta AI Assistant Could Trigger Password Resets. The Verification Lived in the Old UI, Not in the Action It Called.

A Zero-Click Takeover, No Assembly Required

Four Teams, Four Critical Flaws, Three Months

The Confused Deputy Problem

Anthropic's Inconsistent Response

The Supply Chain Angle

What Enterprises Should Do

The Bigger Picture