What is going on with OpenClaw
OpenClaw has become one of the most popular autonomous AI agents in recent months, racking up millions of users across different countries. Previously known as Clawdbot and Moltbot, the project is open source and runs directly on the user’s local machine, which gives it privileged access to the operating system to carry out tasks autonomously. However, an alert issued by CNCERT, China’s national computer network emergency response team, has brought a concerning scenario to light. The report, published via WeChat, points to critical flaws ranging from prompt injection to data exfiltration mechanisms that work without any user interaction. This means sensitive information belonging to individuals and businesses could be exposed right now.
The default security settings in OpenClaw are described as inherently weak by the research community and by CNCERT itself, which considerably widens the attack surface. In practice, attackers can exploit these gaps to take control of devices, steal confidential data, and in the most severe cases, bring entire systems to a halt. This kind of cyber threat is not exactly new in the artificial intelligence space, but it takes on a different dimension when we are talking about an agent that has permissions to read files, execute terminal commands, and interact with external APIs without constant human oversight. The fact that millions of people are using the tool daily makes the situation even more urgent.
How the Prompt Injection attack works on OpenClaw
Prompt injection is a technique where an attacker inserts malicious instructions into text that the AI agent will process. In OpenClaw’s case, this can happen in surprisingly simple ways. Imagine you ask the agent to analyze a document you received via email or summarize the content of a web page. If that content contains hidden instructions — something like a command disguised in invisible text or special formatting — OpenClaw may interpret that snippet as a legitimate order and execute it. This happens because the language model behind the agent cannot always distinguish between what is a user instruction and what is potentially malicious external content. The flaw is structural and affects how the agent processes any type of input.
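To make the "invisible text" case concrete, here is a minimal sketch of a check that could run on a document before it reaches the agent. It is an illustration only, not part of OpenClaw: the function name `find_invisible_chars` is hypothetical, and the check covers only Unicode format characters (category Cf, which includes zero-width spaces and joiners). Other hiding techniques, such as white-on-white text or HTML comments, would need separate handling.

```python
import unicodedata


def find_invisible_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for every format character (category Cf).

    Category Cf covers zero-width spaces, zero-width joiners, bidirectional
    overrides, and similar characters that normally render as nothing.
    """
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) == "Cf":
            hits.append((index, unicodedata.name(char, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    # Hypothetical sample: two invisible characters hidden after "summary".
    sample = "Quarterly summary\u200b\u200d attached."
    for index, name in find_invisible_chars(sample):
        print(f"position {index}: {name}")
```

A non-empty result does not prove an attack, since some of these characters have legitimate uses in certain scripts and emoji sequences, but it is a reasonable signal to route the document for manual review.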
This variation is technically known as indirect prompt injection (IDPI) or cross-domain prompt injection (XPIA). Unlike direct prompt injection, where the attacker interacts directly with the language model, in IDPI adversaries exploit legitimate and seemingly harmless agent features — like web page summarization or content analysis — to inject manipulated instructions. The range of consequences is broad: researchers have already documented scenarios ranging from evading AI-based ad review systems and manipulating automated hiring decisions, to SEO poisoning and generating biased responses through the suppression of negative reviews.
OpenAI itself acknowledged the severity of this type of attack in a recent blog post. The company emphasized that prompt injection-style attacks are evolving beyond simply inserting instructions into external content, now incorporating elements of social engineering. In the company’s words, AI agents are increasingly capable of browsing the web, retrieving information, and taking actions on behalf of the user — capabilities that are useful but also create new avenues for attackers to try to manipulate the system.
The CNCERT report details how more sophisticated variations of this attack can chain multiple commands together, creating what researchers call an exploit chain. It works like this: the first injected instruction makes OpenClaw disable some internal verification mechanism, the second requests access to a specific system directory, and the third sends the collected data to an external server. All of this can happen in a matter of seconds, without the user noticing any abnormal behavior in the interface. It is a scenario that is especially concerning for companies that have adopted OpenClaw for internal task automation, since documents shared between teams can become unintentional attack vectors.
On top of that, independent researchers have already demonstrated that prompt injection in OpenClaw can be used to modify the agent’s behavior persistently throughout an entire usage session. This means that after processing a single compromised file, the agent starts following the attacker’s instructions in all subsequent interactions, even when the user provides completely different commands. This persistence makes detection much harder, because the agent’s responses continue to look normal to whoever is using it, while behind the scenes it is executing unauthorized actions. The security of the entire environment becomes compromised from a single entry point.
Data exfiltration without a single user click
Perhaps the most alarming aspect of the report is the detailed description of how data exfiltration can occur without any action on the user’s part. Researchers at PromptArmor discovered last month that, through indirect prompt injection, the link preview feature in messaging apps like Telegram and Discord can be turned into a data exfiltration channel when the user communicates with OpenClaw through those apps.
The attack mechanics are clever. The AI agent is manipulated into generating a URL controlled by the attacker. This URL contains dynamically generated query parameters that carry sensitive data the model knows about the user. When that URL is rendered in the messaging app as a link preview, the preview mechanism itself automatically makes the request, transmitting confidential information to the attacker’s domain — all without the user needing to click on anything. PromptArmor explained that in agentic systems with link previews, data exfiltration can happen immediately when the AI agent responds to the user.
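One possible mitigation for the mechanics described above is to neutralize URLs that carry query strings before an agent reply is handed to a chat app. The sketch below is a hedged illustration, not something PromptArmor or the OpenClaw project prescribes: `defang_query_urls` is a hypothetical helper, the URL regex is deliberately simplified, and it assumes that rewriting the scheme to "hxxp" is enough to stop a chat client from treating the text as a link and fetching a preview.

```python
import re
from urllib.parse import urlsplit

# Simplified URL matcher for illustration; real-world URL detection is messier.
URL_PATTERN = re.compile(r"https?://[^\s<>)]+")


def defang_query_urls(reply: str) -> tuple[str, list[str]]:
    """Rewrite URLs that carry a query string so they are no longer live links.

    Returns the rewritten reply and the list of URLs that were flagged.
    """
    flagged: list[str] = []

    def _replace(match: re.Match) -> str:
        url = match.group(0)
        if urlsplit(url).query:
            flagged.append(url)
            # "http..." becomes "hxxp..."; assumed to prevent link previews.
            return "hxxp" + url[4:]
        return url

    return URL_PATTERN.sub(_replace, reply), flagged


if __name__ == "__main__":
    # attacker.example and docs.example are placeholder domains.
    text = "See https://attacker.example/p?k=SECRET and https://docs.example/guide"
    safe_text, flagged_urls = defang_query_urls(text)
    print(safe_text)
    print(flagged_urls)
```

This is intentionally blunt: it also defangs harmless links that happen to have query parameters, and data can be smuggled in a URL path as well, so it is better treated as one layer than as a complete fix. Disabling link previews for the bot's chats, where the messaging app allows it, addresses the same problem more directly.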
OpenClaw, by design, has the ability to access the local file system, run scripts, and make network requests to complete the tasks it is given. The problem is that these same capabilities can be hijacked through the identified vulnerabilities. An attacker who manages to inject commands into the agent can instruct it to locate files with specific extensions — like spreadsheets, text documents, and databases — silently compress them and send them to a remote endpoint. Since the network traffic generated by OpenClaw during normal operation already includes external requests to various APIs and services, this malicious transmission blends right into the legitimate data flow.
CNCERT classified this attack vector as particularly dangerous for sectors that deal with intellectual property, financial data, and sensitive personal information. The Chinese entity was emphatic in stating that for critical sectors like finance and energy, these breaches can lead to the leakage of essential business data, trade secrets, and code repositories, or even result in the complete paralysis of entire business systems, causing incalculable losses. Law firms, tech companies, financial institutions, and healthcare organizations are among the most exposed, precisely because they tend to store large volumes of confidential data on local machines where OpenClaw may be running. Data exfiltration in this context is not just a technical nuisance — it is a significant regulatory risk, especially for organizations that need to comply with legislation like CCPA in the United States or GDPR in Europe. 😬
Other threats identified by CNCERT
Beyond the risks of prompt injection and exfiltration, CNCERT highlighted three other concerning attack vectors related to OpenClaw:
- Accidental and irreversible deletion of critical data: the agent may misinterpret user instructions and end up permanently deleting essential information. Since OpenClaw operates with elevated permissions on the system, a simple misinterpretation of a command can have devastating consequences.
- Malicious skills in repositories like ClawHub: bad actors can upload compromised skills to marketplaces and public repositories. When installed by unsuspecting users, these skills execute arbitrary commands or deploy malware directly onto the system.
- Exploitation of recently disclosed security vulnerabilities: flaws in OpenClaw’s code that have been recently made public can be exploited by attackers to compromise systems and steal sensitive information before patches are applied.
Each of these vectors on its own would already be cause for concern. Combined, they form an ecosystem of risks that demands immediate attention from both developers and end users.
China restricts OpenClaw use in government agencies
The severity of the situation led Chinese authorities to take concrete action. According to a Bloomberg report, the Chinese government has begun restricting the use of OpenClaw in state-owned enterprises and government agencies, banning the execution of applications based on the agent on work computers. The measure aims to contain the security risks identified by CNCERT and other oversight bodies. The ban even extends to family members of military personnel, which gives you an idea of just how concerned the authorities are.
This decision reflects a broader trend of governments around the world reassessing the adoption of autonomous AI tools in sensitive environments. The viral popularity of OpenClaw has brought undeniable productivity gains, but it has also exposed organizations to risks that many IT managers simply did not anticipate. The balance between innovation and security has never been this delicate.
Malware campaigns exploit OpenClaw’s popularity
As if the technical vulnerabilities were not enough, OpenClaw’s fame has also attracted criminals who have nothing to do with the software’s flaws themselves. According to Huntress, bad actors have created malicious repositories on GitHub that impersonate legitimate OpenClaw installers to distribute malware like Atomic Stealer, Vidar Stealer, and a Golang-based proxy malware called GhostSocks. The installation instructions use ClickFix-style techniques to convince users to run commands that download and install the malicious software.
The campaign did not target a specific industry but broadly hit users who were trying to install OpenClaw. The malicious repositories contained download instructions for both Windows and macOS environments. What made this campaign particularly effective was the fact that the malware was hosted on GitHub — a platform generally considered trustworthy — and the malicious repository had become the top suggestion in Bing’s AI-powered search results for installing OpenClaw on Windows. This detail shows how the convergence of AI-powered search engines and social engineering can create sophisticated traps that fool even more experienced users.
What can be done right now to reduce the risks
The good news is that there are practical and accessible measures to minimize exposure to these cyber threats while the OpenClaw development team works on more robust fixes. CNCERT itself listed clear recommendations that are worth following:
- Strengthen network controls: preventing OpenClaw’s default management port from being exposed to the internet is the first step to avoiding unauthorized access.
- Isolate the service in a container: running OpenClaw inside an isolated environment — like a Docker container or a virtual machine — prevents an attacker from accessing sensitive data on the main system, even if the agent is compromised via prompt injection.
- Do not store credentials in plain text: API keys, passwords, and access tokens should never be saved in plain text files accessible by the agent.
- Only download skills from trusted sources: avoiding unknown repositories and verifying the origin of each extension before installation significantly reduces the risk of compromise via ClawHub.
- Disable automatic skill updates: this ensures that no malicious changes are applied without prior manual review.
- Keep the agent up to date at all times: the open-source community has been responding quickly to vulnerabilities with patches and fixes.
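The recommendation about plain-text credentials can be sketched as follows: instead of keeping an API key in a file the agent can read, a script fetches it from the process environment at runtime and fails loudly if it is missing. The names `load_secret`, `MissingSecretError`, and `EXAMPLE_API_KEY` are hypothetical and used only for illustration. Environment variables are still visible to anything running inside the same process environment, so an OS keychain or a dedicated secrets manager is likely a stronger choice where available.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def load_secret(name: str) -> str:
    """Fetch a secret from an environment variable instead of a file on disk."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value


if __name__ == "__main__":
    try:
        api_key = load_secret("EXAMPLE_API_KEY")  # hypothetical variable name
        print("secret loaded, length:", len(api_key))
    except MissingSecretError as error:
        print("refusing to start:", error)
```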
It is also worth carefully reviewing which extensions, plugins, and integrations are enabled in your OpenClaw environment. Each additional integration represents an extra attack surface, and many of them may not have gone through the same level of security auditing as the core tool. Disabling everything that is not essential to your workflow is a straightforward way to reduce risk. For teams and companies, implementing a content review process before feeding the agent with external documents is equally important, since this helps identify potential prompt injection attempts before they reach the model.
Setting up firewall rules that limit OpenClaw’s outbound connections to only the strictly necessary domains and endpoints adds another layer of protection by blocking attempts to send data to unknown servers. Text analysis tools that detect suspicious patterns in documents already exist and can be integrated into the workflow.
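The matching logic behind an outbound allowlist can be illustrated with a short sketch. Real enforcement belongs in the firewall or an egress proxy, not in application code, so this only shows the decision such a rule encodes. The domains in `ALLOWED_DOMAINS` are placeholders and `is_host_allowed` is a hypothetical helper, not an OpenClaw setting.

```python
# Placeholder domains; a real list would name the services the agent needs.
ALLOWED_DOMAINS = frozenset({"api.example.com", "updates.example.org"})


def is_host_allowed(host: str, allowed: frozenset[str] = ALLOWED_DOMAINS) -> bool:
    """Return True if host equals an allowed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    # Matching on "." + domain avoids accepting look-alikes such as
    # "notapi.example.com" or "api.example.com.attacker.test".
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


if __name__ == "__main__":
    for candidate in ("api.example.com", "eu.api.example.com", "attacker.test"):
        verdict = "allow" if is_host_allowed(candidate) else "block"
        print(f"{candidate}: {verdict}")
```

The design choice worth noting is default deny: anything not explicitly listed is blocked, which is what stops an injected instruction from sending data to a server nobody anticipated.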
The reality is that autonomous AI agents are here to stay and offer real productivity gains, but adoption needs to come hand in hand with a conscious approach to security. Following the project’s official channels and reports published by entities like CNCERT ensures you stay on top of new discoveries and can act quickly. Cyber threats evolve at the same speed as technology, and being prepared is the best defense any person or organization can have right now. 🔒
