Project Glasswing: the billion-dollar coalition using AI to fortify the world’s most critical software
Project Glasswing showed up to change the cybersecurity game in a way few people saw coming. And when we say few, we mean almost no one, because what went down over the past few weeks inside this initiative defies what most security professionals thought was possible with artificial intelligence applied to code.
Picture an AI that, in just a few weeks, found thousands of critical flaws across every major operating system and browser in the world, including vulnerabilities that had been hiding for decades, slipping past millions of automated tests and years of human review. This isn’t science fiction, it’s not a lab concept, and it’s not corporate press release hype either. It actually happened, in a project that may very well mark a turning point in how the world handles digital security.
That AI is Claude Mythos Preview, Anthropic’s frontier model that hasn’t even been released to the public yet but is already redefining what AI can do in the digital security space. 🔍 The model operates at such an advanced level of technical reasoning that it can not only spot suspicious patterns in code but also understand context, simulate attack vectors, and propose fix paths with a precision that impressed even the most seasoned engineers at the companies involved in the project.
And the good news is that, at least for now, this power is being used for good.
A heavyweight coalition tackling a massive problem
Anthropic brought together some of the biggest names in global tech, including AWS, Apple, Broadcom, Google, Microsoft, Cisco, NVIDIA, CrowdStrike, Palo Alto Networks, and JPMorganChase, along with the Linux Foundation, around a clear mission: use AI to find and fix critical flaws before someone with bad intentions finds them first. This isn’t some shallow PR partnership. Each of these companies brought real system access, critical infrastructure, and highly specialized security teams to the table, creating a collaborative environment you rarely see in tech, especially when it involves sensitive vulnerabilities.
The reasoning behind it is simple and terrifying at the same time. If an AI can already outperform most human experts at finding and exploiting software vulnerabilities, it’s only a matter of time before similar capabilities end up in the wrong hands. 😬 And when that happens, the world needs to be ready, with patches applied, systems updated, and defenses locked down. The window to act proactively is right now, and Project Glasswing is the most concrete answer that’s emerged so far for this challenge.
Beyond the launch partners, Anthropic also extended access to Mythos Preview to a group of more than 40 additional organizations that build or maintain critical software infrastructure. These companies can use the model to scan and secure both proprietary systems and open-source projects. To make all of this possible, Anthropic is committing up to $100 million in usage credits for Mythos Preview, plus $4 million in direct donations to open-source security organizations, including $2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5 million to the Apache Software Foundation.
Acting now, in a coordinated way, could be the difference between a more secure digital future and a scenario of global-scale cyber chaos. The logic is the same as in any technology race: whoever gets there first sets the rules. In this case, getting there first means identifying the gaps before they get exploited, fixing what can be fixed, and documenting what still poses risk. The result is a cybersecurity knowledge base that goes far beyond what any human team could produce on its own in years of continuous work.
The current state of cybersecurity and why AI changes everything
The software everyone uses every day, from banking systems and medical records to logistics networks and energy infrastructure, has always had bugs. Many are harmless, but some represent serious security flaws that, if discovered by attackers, allow them to hijack systems, disrupt operations, and steal sensitive data on a massive scale.
There’s no shortage of examples. Cyberattacks have already caused serious damage to corporate networks, healthcare systems (the WannaCry incident hit Britain’s NHS), energy infrastructure (the Colonial Pipeline attack in the United States), airports across Europe, and government agencies in multiple countries. On the geopolitical front, state-sponsored attacks represent a constant threat to both civilian and military infrastructure. Even smaller attacks targeting individual hospitals or schools can cause substantial economic damage, expose sensitive data, and put lives at risk. Estimates for the global financial cost of cybercrime are hard to pin down precisely, but they hover around $500 billion per year.
Many of these flaws went unnoticed for years because finding and exploiting them required expertise that only a handful of specialists in the world possessed. With the latest frontier AI models, the cost, effort, and skill level needed to discover and exploit software vulnerabilities have dropped dramatically. Over the past year, AI models have become increasingly effective at reading and reasoning about code, showing an impressive ability to identify flaws and develop ways to exploit them. Claude Mythos Preview represents a qualitative leap in these cyber capabilities, finding vulnerabilities that survived decades of human review and developing increasingly sophisticated exploits.
A decade after the first DARPA Cyber Grand Challenge, frontier AI models are finally becoming competitive with the best humans at discovering and exploiting vulnerabilities. Without the right safeguards, these capabilities could be used to exploit the many existing flaws in the world’s most important software, making cyberattacks of all kinds far more frequent and destructive. 🛡️
How Claude Mythos Preview works in practice
Mythos Preview is not your typical vulnerability scanner. Traditional security analysis tools work based on known signatures, predefined patterns, and lists of already-cataloged CVEs. They’re useful, but they have a clear ceiling: they can’t reason about what hasn’t been discovered yet. Claude Mythos Preview operates in a completely different way. It analyzes code the way a senior engineer would, understanding intent, data flow, interactions between components, and potential failure points that only surface when you combine multiple factors at the same time. This kind of chained reasoning is exactly what allows it to find flaws that went undetected for decades.
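To make that ceiling concrete, here is a minimal Python sketch of a signature-based scanner. Everything in it is an assumption made for illustration: the rule names, the `scan` function, and the two C snippets are hypothetical, and real scanners are far more elaborate. The point is only that pattern matching catches what is already on the list and nothing else.

```python
import re

# Hypothetical signature list: a rule name plus a regex for a known-risky call.
SIGNATURES = {
    "unbounded-strcpy": re.compile(r"\bstrcpy\s*\("),
    "unbounded-gets": re.compile(r"\bgets\s*\("),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line matching a known signature."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in SIGNATURES.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# A cataloged pattern is flagged on line 2.
known = "char buf[8];\nstrcpy(buf, user_input);\n"

# Toy off-by-one: when len == sizeof(buf), buf[len] writes one byte past the end.
# No signature describes it, so the scanner reports nothing.
novel = "if (len <= sizeof(buf))\n    buf[len] = 0;\n"

assert scan(known) == [(2, "unbounded-strcpy")]
assert scan(novel) == []
```

Spotting the second snippet requires knowing what `len` and `buf` mean relative to each other, which is the kind of reasoning about intent and data flow described above.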
During the tests conducted as part of Project Glasswing, the model was deployed on codebases of widely used operating systems and browsers with billions of active users. The result was an extensive list of zero-day vulnerabilities, meaning flaws that were unknown to the software developers themselves and for which no fix existed.
Three concrete examples illustrate the level of what was found:
- OpenBSD, 27 years of a hidden flaw: Mythos Preview found a 27-year-old vulnerability in OpenBSD, an operating system renowned as one of the most secure in the world, widely used to run firewalls and other critical infrastructure. The flaw allowed an attacker to remotely crash any machine running the system simply by connecting to it.
- FFmpeg, 16 years and 5 million blind tests: The model also discovered a 16-year-old vulnerability in FFmpeg, the library used by countless software applications to encode and decode video. The flaw was in a line of code that automated testing tools had executed five million times without ever catching the problem.
- Linux Kernel, autonomous exploit chain: Completely on its own, the model found and chained together multiple vulnerabilities in the Linux kernel, the software that runs the majority of the world’s servers, allowing an attacker to escalate from regular user access to full control of the machine.
All of these vulnerabilities were reported to the maintainers of their respective software projects and have already received fixes. For many other discovered flaws, Anthropic published a cryptographic hash of the details and plans to release the full data after patches are applied. The model was able to identify nearly all of these vulnerabilities, and develop many of the related exploits, in a fully autonomous manner, with zero human guidance. 🤯
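Publishing a hash now and the details later is a standard commitment pattern: it proves the finding existed at the time of publication without revealing anything an attacker could use. A minimal sketch in Python, assuming SHA-256 and a random salt; the article does not say which hash function or format Anthropic used, and the function names and report text below are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(details: bytes) -> tuple[str, bytes]:
    """Return (public_digest, private_salt) for a report that stays secret for now."""
    salt = secrets.token_bytes(16)  # random salt so short reports cannot be guessed
    digest = hashlib.sha256(salt + details).hexdigest()
    return digest, salt


def verify(digest: str, salt: bytes, details: bytes) -> bool:
    """Check that revealed details match the digest published earlier."""
    candidate = hashlib.sha256(salt + details).hexdigest()
    return hmac.compare_digest(candidate, digest)


report = b"hypothetical vulnerability write-up"
published, salt = commit(report)  # only `published` is made public today

# After the patch ships, the salt and report are revealed and anyone can check them.
assert verify(published, salt, report)
# Any change to the report after the fact breaks the match.
assert not verify(published, salt, report + b" (edited)")
```

The salt matters because a short or predictable report could otherwise be guessed by hashing candidate texts and comparing them with the published digest.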
Another standout aspect is speed. Human bug bounty and pentest teams can take weeks or months to audit a complex system with real depth. Mythos Preview managed to do this at scale, covering multiple systems simultaneously, without sacrificing the granularity of the analysis. That doesn’t mean humans are no longer needed. Quite the opposite: security engineers from the partner companies were essential for validating discoveries, contextualizing risks, and prioritizing fixes. What changed is that AI raised the ceiling on what’s achievable in terms of coverage and depth of analysis within a single work cycle. 🚀
What the partners are saying
The reactions from companies involved in Project Glasswing make it clear that the impact of Mythos Preview goes well beyond corporate marketing. Each of them brought practical perspectives that underscore the gravity of the moment.
Cisco highlighted that AI capabilities have crossed a threshold that fundamentally changes the urgency needed to protect critical infrastructure, and that the old methods of system hardening are no longer enough. AWS mentioned it is already using Mythos Preview in its own security operations and applying it to critical codebases, where the model is already helping strengthen the code. Microsoft, through its EVP of Cybersecurity Igor Tsyganskiy, stated that Mythos Preview showed substantial improvements over previous models when tested against CTI-REALM, the company’s open-source security benchmark.
CrowdStrike was blunt in pointing out that the window between a vulnerability being discovered and being exploited by an adversary has collapsed, shrinking from months to minutes with the help of AI. The Linux Foundation emphasized that the project offers a concrete path to democratizing security in the open-source ecosystem, where maintainers have historically had to deal with security issues on their own. JPMorganChase highlighted that the project provides a unique opportunity to evaluate next-generation AI tools for defensive cybersecurity. And Google reaffirmed its commitment to tools like Big Sleep and CodeMender, while expressing satisfaction at making Mythos Preview available to participants through Vertex AI.
Palo Alto Networks and NVIDIA also contributed practical assessments, with Palo Alto noting that the model identified complex vulnerabilities that previous-generation models had completely missed, and warning that organizations need to prepare for AI-assisted attackers delivering more frequent, faster, and more sophisticated attacks.
Technical performance and benchmarks
The cyber capabilities of Mythos Preview are a direct result of its advanced skills in agentic coding and reasoning. In the published benchmarks, the model achieved the highest scores ever recorded across a range of software coding tasks, including SWE-bench Verified, SWE-bench Pro, SWE-bench Multilingual, SWE-bench Multimodal, and Terminal-Bench 2.0. Cybersecurity tests like CyberGym reinforce the substantial gap between Mythos Preview and Anthropic’s next best model, Claude Opus 4.6.
On BrowseComp, Mythos Preview scored higher than Opus 4.6 while using 4.9 times fewer tokens. On Terminal-Bench 2.0, with timeout limits extended to four hours and using Terminal-Bench 2.1 updates, the model hit a 92.1% score. These numbers aren’t just incremental improvements. They represent a generational leap in the capability of language models applied to complex software engineering and security tasks.
More information about the model’s capabilities, safety properties, and general characteristics can be found in the Claude Mythos Preview system card published by Anthropic.
What’s at stake for global cybersecurity
Artificial intelligence applied to cybersecurity isn’t a new concept. Companies in the space have been using machine learning for anomaly detection, threat classification, and automated incident response for several years. But what Project Glasswing represents is something qualitatively different. We’re talking about a model capable of complex reasoning that understands software systems with a depth that goes beyond statistics and historical patterns. This opens a new frontier where AI isn’t just a support tool but an active agent in discovering risks that aren’t on anyone’s radar yet.
The impact on the security ecosystem is enormous. Vulnerabilities that sit exposed for years without being discovered are the primary vector for sophisticated attacks, including those attributed to state-sponsored groups. Industrial espionage campaigns, ransomware targeting critical infrastructure, and large-scale compromises typically rely on exactly this kind of silent flaw, the one nobody knew existed until someone exploited it. Reducing this stockpile of unknown vulnerabilities is therefore one of the most effective ways to raise the cost and difficulty of cyberattacks structurally, rather than only reacting after each incident.
There is, of course, another side to this coin that needs to be taken seriously. If Claude Mythos Preview can find flaws with this kind of efficiency, a model with similar capabilities in the hands of malicious actors would represent an unprecedented risk. Anthropic and its partners are clearly aware of this, and the very structure of Project Glasswing exists, in part, to create a defensive advantage before offensive use of these capabilities becomes dominant. The company does not plan to make Mythos Preview available to the general public. The eventual goal is to allow users to leverage Mythos-class models at scale and safely, but getting there will require advancing the development of safeguards that detect and block the model’s most dangerous outputs. Anthropic plans to ship new safeguards with a future Claude Opus model, allowing them to be refined with a model that doesn’t carry the same level of risk as Mythos Preview. 🔐
Project Glasswing’s long-term plans
Today’s announcement is just the beginning of a long-term effort. Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities in their foundational systems, which represent a very significant portion of the world’s shared cyber attack surface. The work is expected to focus on tasks like local vulnerability detection, black box binary testing, endpoint protection, and penetration testing of systems.
After the period covered by usage credits, Claude Mythos Preview will be available to participants at $25 per million input tokens and $125 per million output tokens. Participants can access the model through the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
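For a rough sense of what those prices mean in practice, here is a small Python sketch that turns token counts into a dollar estimate. The two prices come from the figures above; the function name and the example workload are hypothetical, and the sketch ignores anything a real bill might add, such as discounts or provider-specific pricing.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MTOK = Decimal("25")
OUTPUT_PRICE_PER_MTOK = Decimal("125")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated cost in USD for the given input and output token counts."""
    return (
        Decimal(input_tokens) / MILLION * INPUT_PRICE_PER_MTOK
        + Decimal(output_tokens) / MILLION * OUTPUT_PRICE_PER_MTOK
    )


# Hypothetical workload: 2 million input tokens and 400,000 output tokens.
# Input costs 2 * $25 = $50, output costs 0.4 * $125 = $50, so $100 in total.
assert estimate_cost(2_000_000, 400_000) == Decimal("100")
```

Output tokens cost five times as much as input tokens at these rates, so a job that writes long reports or patches will be dominated by the output side even when it reads far more code than it produces.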
Anthropic intends for the work to grow in scope and continue for many months, sharing as much as possible so other organizations can apply the lessons to their own security. Partners will share, to the extent possible, information and best practices with each other. Within 90 days, Anthropic will publish a report on what was learned, including fixed vulnerabilities and improvements that can be disclosed. The company will also collaborate with leading security organizations to produce a set of practical recommendations on how security practices should evolve in the AI era, potentially covering:
- Vulnerability disclosure processes
- Software update processes
- Open-source and supply chain security
- Software development lifecycle and secure design practices
- Standards for regulated industries
- Triage scalability and automation
- Patch automation
Anthropic has also been in ongoing discussions with U.S. government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. The company stresses that protecting critical infrastructure is a top national security priority for democratic nations, and that the emergence of these cyber capabilities is yet another reason for the U.S. and its allies to maintain a decisive lead in AI technology.
Why this move matters right now
The timing of Project Glasswing is no coincidence. The artificial intelligence sector has advanced at breakneck speed over the past two years, and frontier models are reaching a point where their capabilities are starting to overlap with areas historically reserved for highly trained specialists. Offensive security is one of those areas. The security community had already been discussing, at specialized forums and conferences like DEF CON and Black Hat, the potential of advanced language models to automate vulnerability discovery and exploitation. What Glasswing does is turn that theoretical debate into concrete action, putting offensive capability in service of defense in a structured and coordinated way.
For the partner companies, participating in the project is also a clear acknowledgment that the current security model, based primarily on periodic audits, bug bounty programs, and reactive monitoring, is no longer sufficient for the sophistication level of today’s threats. Integrating a model like Mythos Preview into the software development and maintenance cycle is a step toward a genuinely proactive security posture, where flaws are found and fixed before they reach production, not after an incident has already occurred.
The project’s long-term vision includes the possibility that an independent, third-party body capable of bringing together public and private sector organizations would be the ideal space to carry forward large-scale cybersecurity projects like this one. Anthropic invites other members of the AI industry to join the effort to help define standards for the sector.
What Project Glasswing makes clear, above all else, is that artificial intelligence has already crossed an important line in terms of technical capability applied to cybersecurity. The debate is no longer about whether this will happen, but about how to ensure this power is directed responsibly, with proper governance, transparency in processes, and collaboration among those who build the systems, those who protect them, and those who develop the AI tools that will shape this new landscape. The race has already started, and the clock is ticking for everyone. 🌐
