Arm launches its own chip and dives headfirst into the AI infrastructure race
AI infrastructure is going through one of the biggest transformations in its history, and Arm just dropped a bombshell on the market.
After more than 35 years focused on licensing its technology to other manufacturers, the company took a step few saw coming: it launched its own silicon chip, production-ready and built for datacenters.
The Arm AGI CPU isn’t just another processor on the market — it was designed from the ground up to meet the demands of a new AI era, where software agents operate continuously, at global scale, without a human needing to be in the loop for everything to work.
And when we talk about performance, the numbers are impressive:
- Up to 8,160 cores per rack in the standard air-cooled configuration
- Over 45,000 cores per rack in the liquid-cooled version, in partnership with Supermicro
- More than 2x performance per rack compared to the latest x86 systems
But what really stands out here isn’t just the raw technical performance.
It’s the timing of this chip’s arrival — and who’s already standing with Arm on this journey.
Meta, OpenAI, Cloudflare, Cerebras, SAP, SK Telecom, and a whole lineup of major names in the AI ecosystem are already on board. Meta, in fact, is the lead partner and co-developed the chip.
Let’s break down what makes the Arm AGI CPU so relevant for the future of cloud computing and large-scale AI infrastructure. 🚀
Why did Arm decide to build its own chip now?
For decades, Arm’s business model worked in a very specific way: the company developed processor architectures and licensed that technology to other manufacturers, like Qualcomm, Apple, Samsung, and a massive list of partners around the world. That model generated billions in revenue and put Arm at the heart of virtually every smartphone on the planet. But the market changed — and the speed at which artificial intelligence is evolving created a pressure no tech company can afford to ignore.
The problem is that the traditional chip development cycle takes time: Arm designs the architecture and licenses it to a partner, and the partner then manufactures the chip and brings it to market. At the current pace of the AI race, that time can mean falling behind. With the Arm AGI CPU, the company cut that cycle short and started delivering production-ready silicon directly, built on the Arm Neoverse platform, the same foundation that already powers solutions like AWS Graviton, Google Axion, Microsoft Azure Cobalt, and NVIDIA Vera. The difference now is that, for the first time, Arm isn’t just providing the blueprint; it’s delivering the finished product.
This shift isn’t a small pivot. It’s a complete redefinition of Arm’s role in the global tech ecosystem. The company now offers its customers three options: build custom silicon from the Arm architecture, integrate platform-level subsystems like the Arm Compute Subsystems, or simply adopt processors designed and manufactured by Arm itself. More choice, more flexibility, more speed for anyone who needs to scale AI infrastructure.
Beyond that, there’s a bigger picture here worth understanding. The AI infrastructure market is growing at a staggering pace, and datacenters worldwide are being redesigned to support increasingly large language models, autonomous agents, and real-time reasoning systems. Arm identified a real gap between what traditional x86 processors deliver and what new AI workloads demand — especially when it comes to energy efficiency, parallelism, and the ability to scale horizontally without costs spiraling out of control. The AGI CPU was built to fill that exact space. 💡
The era of agentic AI and the central role of the CPU
There’s an important concept behind the creation of the Arm AGI CPU that needs to be clear: the rise of agentic AI. Historically, the bottleneck in computing was always the human. The speed at which people could interact with systems defined the pace of work. But in the age of AI agents, that limitation disappears. Software agents coordinate tasks, interact with multiple models, and make decisions in real time — no pauses, no breaks, no waiting for someone to click a button.
When AI systems begin operating continuously and workloads grow in complexity, the CPU becomes the component that sets the pace for the entire infrastructure. In a modern AI datacenter, the processor manages thousands of distributed tasks: orchestrating accelerators, managing memory and storage, scheduling workloads, moving data between systems, and, now with agentic AI, coordinating the fan-out across a massive number of simultaneous agents.
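To make the fan-out idea concrete, here is a minimal Python sketch. It is not anything Arm or its partners have published: `run_agent`, `fan_out`, and the `max_in_flight` limit are hypothetical names invented for illustration. The point is only to show one coordinator dispatching many agents at once while capping how many run concurrently.

```python
import asyncio


async def run_agent(agent_id: int, task: str) -> str:
    # Hypothetical stand-in for a real agent call (model inference, tool use, I/O).
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"agent-{agent_id} finished {task}"


async def fan_out(task: str, n_agents: int, max_in_flight: int) -> list[str]:
    # The semaphore caps how many agents run at the same time, the kind of
    # scheduling decision an orchestrating CPU has to make continuously.
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(agent_id: int) -> str:
        async with limit:
            return await run_agent(agent_id, task)

    # gather() returns results in the same order the coroutines were passed in.
    return await asyncio.gather(*(bounded(i) for i in range(n_agents)))


results = asyncio.run(fan_out("summarize", n_agents=8, max_in_flight=4))
print(len(results), "agents completed")
```

In a real deployment the agents would be waiting on accelerators, storage, and the network, so the coordinating work of scheduling, bookkeeping, and moving data lands on the CPU.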
This shift places entirely new demands on the CPU. It’s not enough to have more clock speed or more cores — what’s needed is a fundamental evolution in the processor, something designed from the start for this new paradigm. And that’s exactly where the Arm AGI CPU enters the picture.
What makes the Arm AGI CPU different from everything out there today
When Arm talks about more than 45,000 cores per rack in the liquid-cooled version, that’s not just a flashy number for a presentation slide. We’re talking about a compute density that completely changes the logic of how datacenters are designed.
Let’s get into the technical details. The Arm reference server configuration is a 1OU dual-node design, where each blade houses two chips with dedicated memory and I/O, totaling 272 cores per blade. These blades were designed to fully populate a standard 36kW air-cooled rack — that’s 30 blades delivering a total of 8,160 cores. In the 200kW liquid-cooled version, developed in partnership with Supermicro, the rack accommodates 336 Arm AGI CPU units, surpassing 45,000 cores.
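The rack figures above are simple multiplication, and it’s worth checking them. The sketch below uses only the numbers quoted in this article. The one assumption is that the liquid-cooled rack uses the same chip as the air-cooled blades, so its core count per chip is derived here rather than taken from Arm.

```python
# Figures quoted in the article for the reference configuration.
CHIPS_PER_BLADE = 2
CORES_PER_BLADE = 272
AIR_BLADES_PER_RACK = 30
LIQUID_CHIPS_PER_RACK = 336

# Derived, not stated in the article: cores on a single chip.
cores_per_chip = CORES_PER_BLADE // CHIPS_PER_BLADE  # 136

# Air-cooled rack: 30 blades x 272 cores.
air_rack_cores = AIR_BLADES_PER_RACK * CORES_PER_BLADE  # 8,160

# Liquid-cooled rack: 336 chips x 136 cores (assumes the same chip).
liquid_rack_cores = LIQUID_CHIPS_PER_RACK * cores_per_chip  # 45,696

print(f"cores per chip:     {cores_per_chip}")
print(f"air-cooled rack:    {air_rack_cores:,} cores")
print(f"liquid-cooled rack: {liquid_rack_cores:,} cores")
```

Under that assumption the liquid-cooled rack comes out at 45,696 cores, which is consistent with the "over 45,000" figure.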
In this configuration, the Arm AGI CPU delivers more than 2x performance per rack compared to the latest x86 systems. That gain comes from three fundamental advantages of the architecture:
- Category-leading memory bandwidth: more threads stay productive under sustained load, which translates into more effective execution threads per rack. On x86 CPUs, by contrast, performance degrades as cores compete for resources under sustained load.
- High-performance, single-thread-efficient Arm Neoverse V3 cores: each Arm thread gets more work done than a thread on legacy architectures.
- Compounding effect: more usable threads combined with more work per thread result in massive performance gains per rack.
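The compounding effect in the last bullet is just two ratios multiplied together. Arm has not published the individual factors, so the values below are made-up placeholders, and `rack_speedup` and `required_per_thread_ratio` are hypothetical helper names. The sketch only shows how two moderate gains can combine to pass 2x.

```python
def rack_speedup(usable_thread_ratio: float, per_thread_ratio: float) -> float:
    """Per-rack speedup when both factors are expressed relative to a baseline rack."""
    return usable_thread_ratio * per_thread_ratio


def required_per_thread_ratio(target_speedup: float, usable_thread_ratio: float) -> float:
    """Per-thread gain needed to hit a target speedup given a usable-thread gain."""
    return target_speedup / usable_thread_ratio


# Placeholder inputs, NOT measured values: 1.5x the usable threads per rack
# and 1.4x the work per thread.
example = rack_speedup(usable_thread_ratio=1.5, per_thread_ratio=1.4)
print(f"combined per-rack speedup: {example:.2f}x")

# Working backwards: with 1.6x the usable threads, how much per-thread gain
# would a 2x rack-level result require?
needed = required_per_thread_ratio(target_speedup=2.0, usable_thread_ratio=1.6)
print(f"per-thread gain needed: {needed:.2f}x")
```

Neither factor has to reach 2x on its own; the rack-level number is their product.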
The design philosophy here is fundamentally different from what we’ve seen in recent years with x86 chips. Instead of trying to adapt an existing architecture for AI — tacking on specific instructions and embedded accelerators — Arm’s team started from the concept that every element of the processor, from operating frequency to memory and I/O architecture, needed to be optimized for massively parallel, high-performance workloads in densely populated rack deployments.
The architecture’s native scalability is another key differentiator. In traditional systems, scaling means adding more machines, more racks, more cables, more network complexity — and all of that translates into added latency and rising operational costs. The AGI CPU was conceived so that multiple chips work cohesively, allowing cloud providers to build far more efficient infrastructure without multiplying management complexity at the same rate. 🔥
Who’s already betting on the Arm AGI CPU
Meta, OpenAI, and Cloudflare aren’t names that typically enter partnerships without very careful technical and strategic analysis. The fact that these three companies, which have completely different infrastructure usage profiles, are already on board with the Arm AGI CPU ecosystem says a lot about the chip’s potential.
Meta is the lead partner and co-developed the Arm AGI CPU to optimize infrastructure at gigawatt scale for Meta’s family of apps, working alongside the company’s own custom MTIA accelerators. Santosh Janardhan, Meta’s Head of Infrastructure, highlighted that the platform significantly improves datacenter performance density and offers a multi-generational roadmap for the company’s evolving AI systems.
OpenAI is at the center of the race for the most advanced language models and needs infrastructure that supports ultra-high-volume inference. Sachin Katti, Head of Industrial Compute at OpenAI, confirmed that the Arm AGI CPU will play an important role in the company’s infrastructure, strengthening the orchestration layer that coordinates large-scale AI workloads and improving efficiency, performance, and bandwidth.
Cloudflare has a business model built on distributing compute at the network edge, where efficiency and latency are everything. Stephanie Cohen, Cloudflare’s Chief Strategy Officer, noted that the chip delivers high-performance, energy-efficient compute designed for the next generation of workloads.
The launch partner list goes beyond those names: Cerebras, F5, Positron, Rebellions, SAP, and SK Telecom are also working with Arm to deploy the chip and accelerate AI services across cloud, network, and enterprise environments. Commercial systems are already available to order from ASRock Rack, Lenovo, and Supermicro.
Each of these players is solving a different problem with the same chip — and that’s exactly what validates Arm’s approach. A processor that works well for running AI agents at Cloudflare’s network edge while also scaling to meet OpenAI’s inference demands has to be incredibly versatile, incredibly efficient, and incredibly well-designed. Consistent performance across such different scenarios isn’t something you get with a generic architecture — it’s the result of very deliberate design decisions and a deep understanding of how the new generation of artificial intelligence works in practice.
For the infrastructure market as a whole, the signal these partners are sending is powerful. More than 50 leading companies across hyperscale, cloud, silicon, memory, networking, software, system design, and manufacturing are backing the expansion of the Arm compute platform into the silicon world. When the companies that understand AI best choose a chip, it influences the entire chain — from cloud providers to startups building on these platforms. 🤝
Reference server and contributions to the open ecosystem
To accelerate adoption even further, Arm is introducing the Arm AGI CPU 1OU Dual Node Reference Server, a server built on the standard DC-MHS form factor from the Open Compute Project. This isn’t a minor detail — it means Arm is aligning its hardware with open standards the industry already knows and uses.
The company plans to contribute the reference server design and supporting firmware, along with system architecture specifications, debug frameworks, and diagnostic and verification tools that can be applied to all Arm-based systems. More details are expected at the upcoming OCP EMEA Summit.
This open-source approach to hardware is strategic. The easier it is for manufacturers and cloud providers to adopt the Arm AGI CPU, the faster the surrounding software and tooling ecosystem develops — and that ecosystem is often worth just as much as the chip itself.
What this means for AI infrastructure in the years ahead
The arrival of the Arm AGI CPU on the market isn’t going to change everything overnight; no infrastructure transition works that way. But it marks the beginning of a shift in direction that will become increasingly evident over the next few years. The dominance of x86 architecture in servers and datacenters lasted decades, and for good reasons: the software ecosystem is massive, engineers are familiar with the platform, and raw performance kept growing steadily. But AI has created a new success metric for infrastructure, and by that metric, energy efficiency, core density, and the ability to scale inference and reasoning workloads matter far more than simply having the highest clock speed.
For cloud providers, the math is straightforward: more performance per watt and more cores per rack mean they can offer more AI compute capacity without needing to build proportionally larger datacenters or pay proportionally bigger energy bills. In a sector where operational costs are enormous and margin pressure is constant, this kind of efficiency gain translates directly into competitive advantage.
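As a rough illustration of that math, the sketch below estimates how many racks a provider would need to reach a given core count, using the rack figures quoted earlier in this article. The one-million-core target is a hypothetical number chosen for the example, and the 45,696-core liquid-cooled figure is derived (336 chips x 136 cores, assuming the same chip as the air-cooled blades). The rack power budget is used only as a crude proxy for energy draw.

```python
import math

# Rack figures from the article; the liquid-cooled core count is derived
# as 336 chips x 136 cores per chip (272 cores per two-chip blade).
AIR = {"cores": 8_160, "kw": 36}
LIQUID = {"cores": 45_696, "kw": 200}


def racks_needed(target_cores: int, cores_per_rack: int) -> int:
    """Whole racks required to reach at least target_cores."""
    return math.ceil(target_cores / cores_per_rack)


def cores_per_kw(rack: dict) -> float:
    """Cores per kW of rack power budget (a proxy, not measured draw)."""
    return rack["cores"] / rack["kw"]


TARGET_CORES = 1_000_000  # hypothetical fleet size, for illustration only

for name, rack in (("air-cooled", AIR), ("liquid-cooled", LIQUID)):
    n = racks_needed(TARGET_CORES, rack["cores"])
    print(f"{name}: {n} racks, {cores_per_kw(rack):.1f} cores per kW of budget")
```

On these numbers the liquid-cooled option reaches the same core count with far fewer racks, which is the floor-space side of the argument. Actual performance per watt depends on measured power draw, which the article does not give.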
It’s worth noting that the Arm AGI CPU is only the first product in Arm’s new datacenter silicon lineup. Follow-on products are already committed, targeting best-in-class performance, scale, and efficiency. This development is happening in parallel with the Arm Neoverse CSS roadmap, ensuring that all of Arm’s datacenter customers advance together in terms of platform architecture and software compatibility.
And for those consuming these services — whether a startup building an AI-powered product or a large enterprise integrating autonomous agents into its workflows — the most visible impact should be felt in terms of speed and availability. More efficient and denser infrastructure allows models to respond faster, more users to be served simultaneously, and new use cases that are currently cost-prohibitive to become viable.
The Arm AGI CPU is, in that sense, an enabler — a piece of infrastructure that opens doors for the next chapter of artificial intelligence at global scale. Arm isn’t just defining the architecture of the AI-native datacenter — it’s building that architecture. And with more than 50 strategic partners and the biggest names in the industry by its side, the next chapters of this story are shaping up to be even more interesting. 🌐
