AI took the blame for the school bombing in Iran, but the truth is far more disturbing
On the first morning of Operation Epic Fury, February 28, 2026, American forces struck the Shajareh Tayyebeh elementary school in Minab, southern Iran, hitting the building at least twice during morning classes. Between 175 and 180 people died, the vast majority girls between the ages of 7 and 12.
Within days, the question that organized all the news coverage was whether Claude, a chatbot developed by Anthropic, had selected the school as a target. The United States Congress wrote to Secretary of Defense Pete Hegseth, questioning the extent of artificial intelligence use in the strikes. The New Yorker asked whether Claude could be trusted to follow orders in combat, whether it might resort to blackmail as a self-preservation strategy, and whether the Pentagon’s primary concern should be the fact that the chatbot had a personality. Almost none of this had any connection to reality. The targeting system behind Operation Epic Fury ran on a platform called Maven. Nobody was talking about Maven.
What Maven is and why nobody was looking at it
Eight years before the tragedy in Minab, Maven was the most contested project in Silicon Valley. In 2018, more than 4,000 Google employees signed a letter opposing the company’s contract to build artificial intelligence for the Pentagon’s targeting systems. Workers organized protests. Engineers quit. And Google eventually walked away from the contract.
The company that stepped in was Palantir Technologies, the data analytics firm and defense contractor co-founded by Peter Thiel. Over the following six years, Palantir transformed Maven into a targeting infrastructure that pulls together satellite imagery, signals intelligence, and sensor data to identify targets and shepherd them through every stage, from first detection to strike order.
The building in Minab had been classified as a military facility in a Defense Intelligence Agency database that, according to CNN, had not been updated to reflect that the building had been separated from the adjacent Islamic Revolutionary Guard Corps compound and converted into a school. Satellite imagery, as reported by NPR, shows this change had occurred by 2016 at the latest. A chatbot did not kill those children. People failed to update a database, and other people built a system fast enough to make that failure lethal.
By the time the Iran war began, Maven had become part of military infrastructure, invisible like a pipe inside a wall, and the entire conversation revolved around Claude. This obsession works as a kind of collective psychosis about AI, though not the kind we usually discuss. It affects critics and technology enthusiasts with equal intensity. You don’t need to use a language model to let it organize your attention or distort your thinking.
The power of a charismatic technology
In 2019, researcher Morgan Ames published The Charisma Machine, a study of how certain technologies attract attention, resources, and credit toward themselves, diverting all of it from everything else. The usual framework for understanding this dynamic is the concept of hype, but hype only describes what enthusiasts do and assigns critics a privileged debunking role that still keeps the technology at the center of the entire discussion. A charismatic technology shapes the whole field around it, the way a magnet organizes iron filings. Large language models may be the most powerful instance of this phenomenon in history.
By the time the war started, terms like AI safety, alignment, hallucination, and stochastic parrots had already become the mandatory vocabulary of any debate about artificial intelligence, structuring and limiting what could be said. Worse still, the term artificial intelligence itself had become synonymous with LLMs. When the school was bombed, that was the toolkit people reached for, even though this critical apparatus was inadequate for the older, more mature set of technologies involved in military targeting.
The real question, the one almost nobody was asking, has nothing to do with Claude or any language model. It is a bureaucratic question about what happened to the kill chain. And the answer is Palantir.
The kill chain: from French artillery to Palantir software
In military language, kill chain is a remarkably honest term. At its core, it refers to the bureaucratic framework that organizes the steps between detecting something and destroying it. The earliest reference to the term itself comes from the 1990s, but the idea is quite old, going back at least to the 1760s, when French artillery reformers began replacing the experienced eye of the gunner with ballistic tables, elevation screws, and standardized firing procedures.
The steps in the kill chain are subject to constant change, to keep up with shifts in targeting doctrine and also to incorporate whatever management fads happen to afflict military strategic thinkers. The U.S. military has named and renamed the steps for 80 years. In World War II, the sequence was find, fix, fight, finish. By the 1990s, the Air Force had stretched it to find, fix, track, target, engage, assess, or F2T2EA. Every generation of military technology was sold with the promise of making everything in kill chains shorter, except the acronyms.
Palantir’s Maven Smart System is the latest iteration of this compression, and it grew out of a shift in strategic thinking during Obama’s second term. In 2014, Secretary of Defense Chuck Hagel and his deputy, Robert Work, announced the so-called third offset strategy. An offset in this logic is a bet that a technological advantage can compensate for a strategic weakness the country cannot address directly.
The first two offsets tackled the same problem: the United States could not match the Soviet Union in conventional forces. Nuclear weapons, the first offset, made the personnel advantage irrelevant in the 1950s. When the Soviets achieved nuclear parity in the 1970s, precision-guided munitions and stealth technology offered the promise that a smaller force could defeat a larger one. By 2014, that edge was eroding. China and Russia had spent two decades acquiring guided munitions and building defense systems designed to keep American forces out of range. Robert Work insisted the third offset was not about any specific technology, but about using technology to reorganize how the military operated, allowing the U.S. to make decisions faster than China and Russia.
From drone to analyst: the information overload problem
In April 2017, at the start of the first Trump administration, Work helped establish the Algorithmic Warfare Cross-Functional Team, designated Project Maven. One of the generals overseeing Maven, Lieutenant General Jack Shanahan, put the problem bluntly: thousands of intelligence analysts were spending 80% of their time on mundane tasks, drowning in surveillance drone footage that nobody had time to watch. A single Predator drone mission could generate hundreds of hours of video. The central premise of the project was that the machine could watch so the analyst could think.
The Pentagon needed someone to build it. Google took the contract, and what happened next became the most visible labor action in Silicon Valley history. After Google walked away from the contract, Palantir picked it up in 2019.
Scarlet Dragon: from tabletop exercise to 1,000 targets per hour
The XVIII Airborne Corps began testing the system in an exercise called Scarlet Dragon, which started in 2020 as a tabletop wargame in a windowless basement at Fort Bragg. Its commander, Lieutenant General Michael Erik Kurilla, wanted to build what he called the first AI-enabled corps in the Army. The goal was to test whether the system could give a small team the targeting capacity that previously required thousands of people.
Over the next five years, Scarlet Dragon grew into a live-fire military exercise spanning multiple states and branches of the armed forces, with engineers from Palantir and other contractors embedded alongside soldiers. Each time the exercise ran, it was supposed to answer the same question: how fast could the system go from detection to decision?
The benchmark was the 2003 invasion of Iraq, where roughly 2,000 people worked the targeting process for the entire war. During Scarlet Dragon, 20 soldiers using Maven handled the same workload. By 2024, the stated goal was 1,000 targeting decisions per hour. For the system as a whole, that works out to 3.6 seconds per decision. Spread evenly across a 20-person team, it means each operator makes one decision every 72 seconds.
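The arithmetic behind those two figures is easy to check. The sketch below is only a back-of-the-envelope calculation: it assumes the per-operator figure comes from dividing the hourly goal evenly among the 20 soldiers mentioned above, which the numbers imply but the sources do not spell out. The variable names are illustrative.

```python
# Back-of-the-envelope check of the pacing figures quoted above.
SECONDS_PER_HOUR = 3600

decisions_per_hour = 1000  # the stated 2024 goal
operators = 20             # assumption: the 20-soldier team, sharing the load evenly

# Pace of the system as a whole.
system_seconds_per_decision = SECONDS_PER_HOUR / decisions_per_hour

# Pace experienced by one operator if the work is split evenly.
decisions_per_operator_per_hour = decisions_per_hour / operators
operator_seconds_per_decision = SECONDS_PER_HOUR / decisions_per_operator_per_hour

print(f"System: one decision every {system_seconds_per_decision:.1f} seconds")
print(f"Each operator: {decisions_per_operator_per_hour:.0f} decisions per hour, "
      f"one every {operator_seconds_per_decision:.0f} seconds")
```

Changing `operators` shows how sensitive the per-person pace is to team size: halve the team and each operator has 36 seconds per decision.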
How the Maven interface actually works
The Maven interface looks like a military version of corporate project management software crossed with a mapping app. What the military analyst sees is a map with intelligence data layers or a screen organized into columns, each representing a stage of the targeting process. Individual targets move across the columns from left to right as they advance through each stage, in a format borrowed from Kanban, the lean manufacturing workflow system developed at Toyota and widely used in software development.
Before Maven, operators worked across eight or nine separate systems simultaneously, pulling data from one, cross-referencing in another, manually moving detections between platforms. Maven consolidated all of that into a single interface. Cameron Stanley, the Pentagon’s chief digital and AI officer, called it an abstraction layer, a common software engineering term meaning a system that hides the complexity beneath it.
Humans drive the targeting. Beneath the interface, machine learning systems analyze satellite imagery and sensor data to detect and classify objects, scoring each identification by how confident the system is that it got it right. Three clicks convert a data point on the map into a formal detection and move it into the targeting pipeline. The system recommends how to strike each target, which aircraft, drone, or missile to use, which weapon to pair, and the officer selects from the ranked options.
The AI beneath the interface is not a language model. The core technologies are the same basic systems that recognize your cat in a photo library or allow a self-driving car to combine camera, radar, and lidar into a single picture of the road, applied here to drone footage, radar, and satellite imagery of military targets. They predate large language models by years. Neither Claude nor any other LLM detects targets, processes radar, fuses sensor data, or pairs weapons to targets.
LLMs are late additions to the Palantir ecosystem. In late 2024, years after the core system was operational, Palantir added an LLM layer, which is where Claude fits in, allowing analysts to search and summarize intelligence reports in natural language. But the language model was never what mattered in this system. What mattered was what Maven did to the targeting process: it consolidated the systems, compressed the time, and reduced the people.
The Vietnam lesson: when the system can only measure itself
This is not a new idea. The U.S. military has been trying to close the gap between seeing something and destroying it for as long as that gap has existed, and every attempt has produced the same kind of failure.
In the late 1960s, the U.S. faced a version of the same problem in Vietnam. Supplies moved south along the Ho Chi Minh Trail through jungle the military could not see into. The solution was Operation Igloo White, a billion-dollar-a-year program that scattered 20,000 acoustic and seismic sensors along the trail. Those sensors transmitted data to relay aircraft, which fed the signals to IBM 360 computers at Nakhon Phanom air base in Thailand. The computers analyzed the data and predicted where convoys would be, and strike aircraft were directed to those coordinates.
The system could sense but could not see. It could detect a vibration but could not tell a truck from an ox cart. The North Vietnamese figured it out. They played recordings of truck engines, drove animals near the sensors to trigger vibration detection, and hung buckets of urine in trees to trip the chemical detectors. The Air Force claimed 46,000 trucks were destroyed or damaged over the course of the campaign. The CIA reported that the claims for a single year exceeded the total number of trucks believed to exist in all of North Vietnam. The system’s own output was the only measure of its performance, and nobody outside had the authority to challenge it. When daytime reconnaissance flights failed to find the wreckage of all those trucks, Air Force personnel invented a creature to explain the absence. They called it the great Lao truck eater.
Precision in fire, imprecision in intelligence
The pattern that unfolded in Vietnam, a targeting system that could only measure its own performance and ended up believing its own output, is actually older than digital computing. Historian Michael Sherry, in his 1987 book The Rise of American Air Power, traced the pattern back to the founding doctrine of precision bombing, whose confidence in its own methods made it unnecessary to examine what those methods actually produced.
Carl von Clausewitz, the 19th-century Prussian general whose writings remain the foundation of Western military thought, had a word for everything optimization leaves out. He called it friction: the accumulation of uncertainty, error, and contradiction that ensures no operation goes as planned. But friction is also where judgment forms. Clausewitz observed that most intelligence is false, that reports contradict each other. The commander who has worked through it learns to see the way an eye adjusts to darkness, not by getting better light, but by staying long enough to use what light there is.
That staying is what takes time. Compress the time and the friction does not disappear. You just stop noticing it.
When speed kills: the Iraq 2003 precedent
The 2003 invasion of Iraq, the operation Scarlet Dragon would later use as its benchmark, was a textbook case. Marc Garlasco, the Pentagon’s chief of high-value targeting during the invasion, ran the fastest targeting cycle the U.S. had ever executed. He recommended 50 strikes against senior Iraqi leadership. The bombs were precise, they hit exactly where they were aimed, but the intelligence behind them was not. None of the 50 strikes killed its intended target.
Two weeks after the invasion, Garlasco left the Pentagon for Human Rights Watch, went to Iraq, and stood in the crater of a strike he had personally designated as a target. The targeting cycle had been fast enough to hit 50 buildings and too fast to discover it was hitting the wrong ones.
Jon Lindsay, who served as a Navy intelligence officer in Kosovo and later studied special operations targeting in Iraq, found something revealing. Once a target had been reified into a PowerPoint slide, the target intelligence package became a black box. Questioning the premises behind it got harder as the hunt gained momentum, as the folder thickened with what Lindsay calls representational residue. There was more machinery for building a target than for inspecting the quality of its construction.
During the air war in Kosovo, General Wesley Clark demanded 2,000 targets. The CIA nominated only one target during the entire war: the federal directorate for supply and procurement. Analysts had an address but not coordinates, so they tried to reverse-engineer the location from three outdated maps. They ended up hitting the Chinese embassy, which had recently moved, 300 meters from the building they were aiming at. The State Department knew the embassy had moved. The military facilities database did not. Lindsay called it circular reporting: an accumulation of supporting documents that created the illusion of multiple validations while amplifying a single error. Writing in his diary at the time, he called the result an immense error, perfectly packaged.
The British effect: when slowness saved lives
In 2005, Lieutenant Colonel John Fyfe of the U.S. Air Force published a study on time-sensitive targeting during the 2003 invasion. Fyfe highlighted the different approaches of British and American forces. At the Combined Air Operations Center, RAF officers served in leadership positions alongside their American counterparts, operating under more restrictive rules of engagement.
Fyfe noted that their more reserved and conservative personalities produced what he called a very positive dampening effect on the sometimes hasty and chaotic pace of offensive operations. The contrast between shifts was visible: American leaders pushed full throttle, while British officers methodically reconsidered risks and cost-benefit trade-offs before approving strikes. During UK-led shifts, there were no friendly fire incidents and no significant collateral damage.
From within the efficiency framework, every characteristic Fyfe describes was logged as a defect. The British shifts were slower. The restrictive rules of engagement added constraints. The dampening effect added time. Speed saves lives, goes the argument, but the fastest targeting cycle before Maven was Garlasco’s, and it hit 50 buildings without striking a single intended target. Scarlet Dragon stripped all of that away: the disagreements over targeting, the deliberation, the hesitation, and the moments when someone had time to object or notice something was wrong.
The bureaucratic dilemma: when human judgment becomes a column in software
Organizations that run on formal procedure need someone inside the process to interpret rules, notice exceptions, recognize when the categories no longer fit the case. If the organization admits that its outcomes depend on the discretion of the people executing them, then the procedure is not a procedure but a suggestion, and the authority the organization derives from appearing rule-governed collapses. So judgment has to happen and has to look like something else. It has to look like following procedure rather than interpreting it.
Historian of science Theodore Porter argued in his 1995 book, Trust in Numbers, that organizations adopt quantitative rules not because numbers are more accurate, but because they are more defensible. Judgment is politically vulnerable. Rules are not. Procedure exists to make discretion disappear, or appear to disappear.
In 1984, historian David Noble showed that when the American military and manufacturers automated their factory floors, they consistently chose systems that were slower and more expensive but moved decision-making away from workers and into management. The point was not efficiency — it was often extremely wasteful — but control.
Alex Karp, Palantir’s CEO, describes exactly this achievement in his 2025 book, The Technological Republic. He writes that software is now in charge, with hardware serving as the means by which AI recommendations are implemented in the world. His model for how this should work comes from nature: swarms of bees and murmurations of starlings. No mediation of the information captured by scouts when they return to the hive, Karp writes. No weekly reports to middle management, no presentations for more senior leaders, no meetings to prepare for other meetings.
That sounds liberating, even utopian. But the signal that passes without mediation is also the signal nobody can question.
Karp thinks he is destroying bureaucracy. He is actually encoding it. He treats the meetings and weekly reports he holds in contempt as the bureaucratic process itself. They are not. They were where people interpreted procedure, the place where someone could notice when the categories no longer fit the case. What Karp eliminated was the discretion the institution could never admit it depended on. What remains is a bureaucracy that can execute its rules but has nobody to interpret them. Bureaucracy encoded in software does not bend. It shatters.
The target package that looked like every other one
The target package for the Shajareh Tayyebeh school listed a military facility. Lucy Suchman, whose 1987 book Plans and Situated Actions remains the sharpest analysis of how formal procedures obscure the work that actually produces their results, would not have been surprised. Plans always look complete afterward. They achieve that completeness by filtering out everything that was not legible to their categories.
That package looked like every other package in the queue. But outside the package, the school appeared in Iranian business listings. It was visible on Google Maps. A search engine could have found it. Nobody searched. At 1,000 decisions an hour, nobody was going to search. A former senior government official asked the obvious question: the building had been on a target list for years, yet this went unnoticed. How?
Congress did not authorize this war. In two weeks, American forces struck 6,000 targets. The school was one of them. American forces killed nearly 200 people, and news coverage reached for the label of AI error, which domesticated the event into something a better algorithm or better safeguards could have prevented.
What the Claude debate is hiding
In the days after the strike, the charisma of AI organized the entire political conversation around technology: whether Claude hallucinated, whether the model was aligned, whether Anthropic bore responsibility for its deployment. The constitutional question of who authorized this war and the legal question of whether this strike constitutes a war crime were displaced by a technical question that is easier to ask and impossible to answer in the terms in which it was framed. The Claude debate absorbed the energy. That is what charisma does.
And it also obscured something deeper: the human decisions that led to the deaths of between 175 and 180 people, most of them girls between the ages of 7 and 12. Someone decided to compress the kill chain. Someone decided that deliberation was latency. Someone decided to build a system that produces 1,000 targeting decisions per hour and call them high quality. Someone decided to start this war. Hundreds of people are sitting in the Capitol, refusing to stop it.
Calling this an AI problem gives those decisions, and those people, a place to hide.
An earlier version of this article was published on Artificial Bureaucracy, Kevin T Baker’s Substack.
