Severe vulnerability in 7% of OpenClaw skills, over 30,000 instances exposed

Composio has published a security analysis report for OpenClaw. It’s pretty serious stuff. Of the 3,984 skills distributed on SkillHub (Skills Marketplace), 283 (7.1%) contained vulnerabilities, and over 30,000 instances were left exposed to the Internet. This means that behind the scenes of the “convenient AI agent,” users’ emails, chats, API keys, and passwords were exposed to attackers for several weeks.

What is OpenClaw?

OpenClaw is an AI agent framework that connects to services such as WhatsApp, Slack, Gmail, and Telegram and autonomously executes tasks via over 50 integrations. It has a mechanism to install skills (extensions) from a distribution marketplace called SkillHub.

Many people have probably seen it introduced on YouTube or X saying, “Just plug it in and the AI will do everything for you!” The problem is that “it will do everything for you” actually meant “it will do everything for the attacker as well.”

SkillHub Supply Chain Risk

According to Snyk’s analysis, one of the most downloaded Twitter manipulation skills was connected to malicious infrastructure. The attack’s distribution method is gradual and designed so that by the time users notice it, it’s already too late.

graph TD
    A[スキルインストール] -->|スキル概要が指示| B[前提依存関係のインストール]
    B -->|悪意あるページへ誘導| C[任意コマンド実行]
    C -->|難読化ペイロードのデコード・実行| D[第2段階スクリプトのダウンロード]
    D --> E[認証情報窃取・データ送信]

Prompt users to install external dependencies by stating “Prerequisites must be installed” in the skill summary description.
Execute arbitrary commands on the malicious page you are directed to
Decode and execute obfuscated payload (malicious code body)
Download and run the second stage main script (stager). A stager is an intermediate loader that downloads the main body, and is used as a method to keep the initial infection code small and obtain the full-scale attack code later.

The most dangerous thing to think is that “it’s safe because it’s a popular skill.” Skills with a large number of downloads are more attractive targets for attackers, and popular skills were actually contaminated. In the case of npm, it’s the same scenario as a popular package being hijacked, but in the case of OpenClaw, the damage is far greater because skills have direct access to email, chat, and the file system.

This method is similar in structure to the previously discussed AMOS infection chain via OpenClaw SKILL.md. The trick was to embed malicious instructions in the skill repository SKILL.md and force the AI agent to download and execute the malware itself. This attack via SkillHub is more of a classic social engineering attack, but it is still a supply chain attack in that it “contaminates the distribution route for skills.”

Similar supply chain attacks are becoming more active in the npm ecosystem. In the SANDWORM_MODE campaign, crypto keys and CI secrets were stolen targeting users of major AI development tools such as Claude Code, Cursor, and VS Code. Additionally, GlassWorm exploits Open VSX extension dependencies to achieve transitive infection, making the entire toolchain used by AI developers the target of supply chain attacks.

Structural problems of prompt injection

OpenClaw pulls in text from multiple sources, including emails, WhatsApp messages, and web search results. If an attacker embeds commands in these texts, there is a risk that the agent will interpret them as “instructions from the user” and execute them. This is a prompt injection attack.

“Due to the nature of LLM’s auto-regressive architecture, complete protection is not possible,” the report notes. This is the crux: this is not a bug in OpenClaw, but a fundamental limitation of LLM-based agents. It’s not a problem that can be fixed with a patch. The number of paths amplifies risk, with each of the more than 50 integrated services serving as entry points for attacks.

As an actual attack example, Clinejection used a prompt injection embedded in the title of a GitHub issue to hijack the Cline agent and install malicious npm packages on 4,000 development machines. OpenClaw ingests text from over 50 services, so the attack surface is much wider than in Cline’s case.

The latest trends in prompt injection countermeasures include GitHub’s agent execution platform and OpenAI IH-Challenge efforts, which explores approaches based on isolation on the infrastructure side and model training. However, there is currently no universal solution.

Simon Willison’s “Fatal Triad”

There is a concept called the “Lethal Triad” proposed by security researcher Simon Willison. There are three conditions that make AI agents high risk.

Conditions	OpenClaw status
Right of access to personal data	Direct access to file systems, credentials, and systems
Exposure to untrusted content	Constant capture of emails, messages, and web search results
External communication ability	Data transmission possible with integration of over 50 services

All 3 conditions are met. And at the worst level.

If even one of them applies, you need to be careful, but OpenClaw satisfies all three at full throttle. A compromised agent hands over access to all integrated services to the attacker. It reads all your WhatsApp messages and sends you emails from Gmail without your permission. This happens with just one prompt injection.

OAuth token exposure and instance issues

It turns out that the path where OpenClaw stores OAuth credentials is agents/<agentId>/agent/auth-profiles.json. Because this file exists in plain text on disk, an attacker who gains access to the agent can obtain OAuth tokens for various services such as Slack and Gmail. By using stolen tokens, you can impersonate real users and act on various services.

The problem with development was even more serious. In the early versions, the setting was “auto-approve connections from localhost,” and Internet connections via reverse proxies were also treated as approved. This misconfiguration resulted in more than 30,000 OpenClaw instances being exposed to the internet over a 30-day period. In other words, anyone in the world could access it. The instances where OAuth tokens were stored in plain text were directly accessible from the internet for a month.

Similar instance exposure issues occur on other AI platforms. Langflow CVE-2026-33017 was exploited within 20 hours after the unauthenticated RCE vulnerability was published. In n8n multiple RCE vulnerability, 24,700 instances remained unpatched. The pattern of exposing instances of autonomous agents and workflow automation tools to the Internet is repeating.

Memory Poisoning

OpenClaw stores memory as a collection of Markdown files. When a compromised agent modifies its own memory files, it creates a condition where the agent continues to execute certain hidden instructions without the user’s knowledge. While skill infections are acute (fired upon installation), memory contamination is latent and difficult to detect.

The scary thing is that agents with corrupted memory continue to operate ostensibly normally. From the user’s perspective, the system is “running normally,” but behind the scenes it continues to send information to the outside world according to the attacker’s instructions. There is no way to detect it.

The details of this attack method are explained in AI agent memory injection attack. By embedding persistent instructions in memory files, the malicious behavior continues even after the agent is restarted. OpenClaw is particularly dangerous because even though it is stored in a human-readable Markdown file, it can be automatically modified via over 50 integration services.

Applicable to OWASP Top 10 for LLM Agents

OWASP is an organization that provides a standard catalog of security risks and also publishes a Top 10 list for LLM agents. OpenClaw falls under multiple items.

Risk items	OpenClaw status
A01: Prompt injection	Web search/message/skill injects instructions
A03: Excessive autonomy	Access to file system root/credentials, no privilege boundaries
A05: Memory pollution	Centralized memory management without distinguishing between trust levels and expiration dates
A07: Insufficient privilege separation	Untrusted input and elevated execution are processed in the same memory space

It is quite bad if it falls under 4 items. The OWASP Top 10 is originally a list of “minimum things you should follow,” and falling into this category means that you haven’t done the basic security design.

Pile of Mac mini M4s lined up on Mercari

On a different note, a large number of Mac mini M4s have recently been listed on Mercari. The timing coincided with the OpenClaw boom, and I suspect that ordinary users bought it with enthusiasm, thinking, “I’ll automate it with an AI agent,” but either realized there were security issues, weren’t able to use it as much as I expected, or simply got tired of it and sold it.

This is largely the fault of AI surprisers. As a result of a large number of videos and posts saying things like “If you buy an M4 Mac mini and install OpenClaw, your life will change!”, there must have been many people who bought the hardware without understanding the technical background, set it up, connected their personal information, and were exposed to vulnerabilities like this report.

The wondermongers never mention security risks. They only shout “Awesome! Convenient! Revolution!” and then remain silent when a problem arises. How many surpriseers told their followers that 7% of skills contained malicious code? How many wonder-mongers corrected the fact that 30,000 instances were in full view of the internet? It’s probably close to zero.

They instigate it and sell it, and when a problem arises, they pretend not to know. And Mac mini is lined up on Mercari. This cycle is not unique to OpenClaw, but is repeated across AI tools.

Recommended measures

If you still want to use OpenClaw, the following is required at a minimum.

Minimum privileges in local environment

Runs under a dedicated isolated user account, without root privileges
Allocate only specific directories needed for work without mounting the entire home directory
Do not mount Docker socket

Regarding isolation using the OS standard sandbox function, we compared the practical differences between macOS sandbox-exec and Windows sandbox in Local isolation execution of AI agent.

Limiting Network Exposure

Bind gateway to 127.0.0.1 and only allow access via VPN or WireGuard/Tailscale
Added firewall settings by IP address range
Do not leave default settings as is. 30,000 instances are victims of default settings

Approaches such as Cloudflare’s WAF for AI apps that detect and block prompt injections and PII leaks at the network layer are also emerging. It is effective to add a layer of defense not only on the agent code side but also on the front-end network.

Permission management for integrated services

Narrow down the authorization scope by resource (“this calendar” instead of “all calendars”)
Require a human approval step for destructive operations such as deletion, bulk movement, and sharing
Rotate OAuth tokens regularly
Carry out a monthly inventory of access rights

Operation visualization

If it is not possible to track what an agent did and when, it becomes difficult to investigate the cause of a breach. A mechanism is needed to record tool execution logs externally and alert on abnormal access patterns. Design Principles for Putting AI Coding Agents into Production also addresses the importance of operational logging and privilege separation.

The report’s authors advise consumers to “avoid it at this time,” but personally I think it’s fair to say more strongly. The problems pointed out here are not specific to OpenClaw, but are common to all AI agents that have file system access, external service cooperation, and various text input paths. Instead of just adding it because it seems convenient, you should understand what it connects to, what is allowed, and what can happen before using it.

The reason I created Kana Chat was because I wanted to structurally eliminate this kind of risk. There’s no need to trade security for convenience.