Sandboxing for AI agents just got 100 times faster with Cloudflare’s latest bet
Artificial Intelligence has reached a point where agents don’t just think — they also write and execute code in real time. That changes everything, but it raises a question nobody can ignore: where does this code run safely?
The most common answer today is containers, but they have a serious performance and cost problem that starts to hurt as scale increases. Hundreds of milliseconds to boot up and hundreds of megabytes of memory just to function end up creating a real bottleneck in production environments handling multiple agents simultaneously.
Cloudflare jumped into this conversation with a solution that goes well beyond what the market was used to seeing, and the result is impressive: a sandboxing environment that can be up to 100 times faster than traditional container-based approaches. 🚀
In this article, you’ll learn how the Dynamic Worker Loader works, why TypeScript has become the go-to language for AI agents tackling this kind of task, which helper libraries come with the ecosystem, how companies are already using this technology, and what all of this actually means for anyone building products with AI-generated code today.
The problem nobody wanted to admit
For a long time, the tech industry treated the execution of code generated by Artificial Intelligence as an implementation detail — something that would come later, once the model was good enough. What actually happened was the opposite: the models got really good really fast, and the infrastructure to run that code safely fell behind. Today, any reasonably capable AI agent can write a working script in seconds, but putting that script into production without creating a security risk is still a real challenge for most companies.
Simply using an eval() on the AI-generated code directly in the application is completely out of the question. A bad actor could easily trick the model into injecting vulnerabilities into the code. That’s why the concept of a sandbox is so central: an isolated place to execute code, completely separated from the main application and the rest of the world, except for the specific capabilities the code needs to access.
Docker containers emerged as the default answer to this problem, and for good reasons: they isolate the environment, control resources, and offer a reasonable layer of security. The problem is that containers take time to boot and consume considerable memory even when idle, so when you’re dealing with dozens or hundreds of simultaneous executions of AI-generated code, the operational cost climbs quickly. Cloudflare itself acknowledges this scenario by offering its container runtime and Sandbox SDK, but also points out that for agents at consumer scale, where every end user might have one or several agents and each agent writes code, containers simply aren’t enough.
There’s also a more subtle issue that often flies under the radar: the latency perceived by the end user. When an AI agent finishes generating a piece of code and the system needs to wait a few seconds to spin up a container before executing it, that wait completely breaks the sense of fluidity that makes the AI experience so powerful. The user loses context, trust in the product drops, and the tool’s value proposition starts getting questioned. Solving sandboxing isn’t just a technical issue — it’s a product experience issue.
Where the Dynamic Worker Loader came from
The story starts back in September of last year, when Cloudflare introduced the concept of Code Mode: the idea that AI agents should accomplish tasks not by making individual tool calls, but by writing code that calls APIs directly. The company demonstrated that simply converting an MCP server into a TypeScript API reduced token usage by 81%. It also showed that Code Mode could operate both in front of and behind an MCP server. That work led to Cloudflare’s new MCP server, which exposes the company’s entire API with just two tools and fewer than 1,000 tokens.
Hidden in that same September announcement was an experimental feature: the Dynamic Worker Loader API. This API lets a Cloudflare Worker spin up a new Worker, in its own sandbox, with code specified at runtime — all dynamically. Now, this feature has graduated from experimental status and entered open beta, available to all paid Workers users.
How the Dynamic Worker Loader changes the game
The Dynamic Worker Loader is the core technology behind the sandboxing solution Cloudflare presented, and it works quite differently from what we’re used to seeing. Instead of spinning up a full container for each execution, the system uses isolated Workers that are loaded dynamically, leveraging Cloudflare’s distributed network infrastructure to create extremely lightweight and fast execution environments. Each Worker runs in its own isolated context, sharing no memory or state with other Workers, which guarantees the level of isolation needed for safe execution of code generated by Artificial Intelligence.
The technical magic here lies in the isolation mechanism used. While traditional containers rely on operating system-level virtualization, the Dynamic Workers model uses isolates — instances of Google’s V8 JavaScript execution engine, the same one that powers Chrome. This is the same mechanism that has underpinned the entire Cloudflare Workers platform since its launch eight years ago. Each isolate is essentially a separate V8 context, with its own memory heap and no access to the external environment unless explicitly allowed.
The numbers are impressive: an isolate takes just a few milliseconds to start and uses only a few megabytes of memory. That makes startup roughly 100 times faster, and memory use between 10 and 100 times more efficient, than a typical container. In practice, this means it’s perfectly feasible to create a new isolate for each user request on demand, execute a single piece of code, and discard it right after, with little concern about cost or performance.
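To make those ratios concrete, here is a small back-of-the-envelope calculation. The specific figures (500 ms and 200 MB for a container, 5 ms and 3 MB for an isolate, a 16 GB host) are assumptions picked to match the "hundreds of" versus "a few" orders of magnitude mentioned in this article. They are not measurements.

```typescript
// Illustrative figures only: assumed for this sketch, not measured.
interface SandboxProfile {
  label: string;
  startupMs: number; // time to boot one sandbox
  memoryMb: number; // resident memory per sandbox
}

const containerProfile: SandboxProfile = { label: "container", startupMs: 500, memoryMb: 200 };
const isolateProfile: SandboxProfile = { label: "isolate", startupMs: 5, memoryMb: 3 };

// How many sandboxes of this kind fit in the given amount of host memory.
function sandboxesPerHost(profile: SandboxProfile, hostMemoryMb: number): number {
  return Math.floor(hostMemoryMb / profile.memoryMb);
}

// How many times faster `fast` starts compared with `slow`.
function startupSpeedup(slow: SandboxProfile, fast: SandboxProfile): number {
  return slow.startupMs / fast.startupMs;
}

const hostMemoryMb = 16 * 1024; // a hypothetical 16 GB machine

console.log(sandboxesPerHost(containerProfile, hostMemoryMb)); // 81
console.log(sandboxesPerHost(isolateProfile, hostMemoryMb)); // 5461
console.log(startupSpeedup(containerProfile, isolateProfile)); // 100
```

Under these assumed numbers, the same machine holds dozens of containers or thousands of isolates, which is why a fresh sandbox per request becomes practical.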
The practical flow works like this: the AI agent generates the code, usually in TypeScript or JavaScript; that code is sent to the sandboxing system; the Dynamic Worker Loader creates an isolated context in milliseconds, executes the code within that context with pre-defined permissions and resource limits, returns the result, and discards the Worker. This entire cycle can happen in tens of milliseconds, compared to several seconds in the container-based model. For products that depend on real-time code execution, this difference is transformative. 🔥
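The load, execute, discard cycle can be sketched as a single function. Everything here is hypothetical: SandboxLoader, LoadedSandbox, SandboxLimits, and executeOnce are names invented for this illustration, and the real Worker Loader API may look different. The fake loader exists only so the sketch runs anywhere. It does not execute the source string and provides no isolation.

```typescript
// Hypothetical shapes for illustration; the real Worker Loader API may differ.
interface SandboxLimits {
  cpuMs: number;
  allowedHosts: string[];
}

interface LoadedSandbox {
  run(input: unknown): Promise<unknown>;
  dispose(): void;
}

interface SandboxLoader {
  load(source: string, limits: SandboxLimits): Promise<LoadedSandbox>;
}

// One single-use execution: load, run, and always discard.
async function executeOnce(
  loader: SandboxLoader,
  source: string,
  limits: SandboxLimits,
  input: unknown,
): Promise<unknown> {
  const sandbox = await loader.load(source, limits);
  try {
    return await sandbox.run(input);
  } finally {
    sandbox.dispose(); // discard even if the generated code throws
  }
}

// Stand-in loader: records the lifecycle instead of running any code.
function makeFakeLoader(log: string[]): SandboxLoader {
  return {
    async load(source: string, limits: SandboxLimits): Promise<LoadedSandbox> {
      log.push(`load:${source.length}b:cpu=${limits.cpuMs}ms`);
      return {
        async run(input: unknown): Promise<unknown> {
          log.push("run");
          return { echoed: input };
        },
        dispose(): void {
          log.push("dispose");
        },
      };
    },
  };
}

const lifecycleLog: string[] = [];
const singleRun = executeOnce(
  makeFakeLoader(lifecycleLog),
  "export default {}",
  { cpuMs: 50, allowedHosts: [] },
  42,
);
singleRun.then((result) => console.log(result, lifecycleLog));
```

The try/finally is the important design choice: the sandbox is discarded on every path, so a failing snippet cannot leave a loaded Worker behind.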
Scalability without artificial limits
Many container-based sandbox providers impose global limits on concurrent sandboxes and sandbox creation rates. The Dynamic Worker Loader doesn’t have those limits. It doesn’t need them, because it’s simply an API for the same technology that has always powered the Cloudflare platform — which has always allowed Workers to scale transparently to millions of requests per second.
Want to process a million requests per second where each individual request loads a separate Dynamic Worker sandbox, all running simultaneously? No problem.
Zero-latency communication
Single-use Dynamic Workers typically run on the same machine, and even the same thread, as the Worker that created them. There’s no need to reach across the globe to find a warm sandbox. Isolates are so lightweight they can simply run wherever the request landed. Dynamic Workers are supported in each of Cloudflare’s hundreds of locations around the world.
Why TypeScript became the lingua franca of AI agents
If you’ve been following the development ecosystem in recent years, you’ve probably noticed that TypeScript has gradually taken over the space JavaScript used to hold as the default language for web development. What might not be as obvious is that this shift is also clearly reflected in how Artificial Intelligence agents behave when it comes to code generation.
Cloudflare is pretty straightforward about this: technically, Workers — including dynamic ones — support Python and WebAssembly, but for small code snippets generated on demand by an agent, JavaScript and TypeScript load and execute much faster. And while we humans have strong preferences about programming languages, AI agents don’t. LLMs are experts in all major languages, and their training data in JavaScript is massive. Plus, JavaScript, by its web-native nature, was designed to run in a sandbox. It’s the right language for the job.
The preference for TypeScript specifically has a technical basis. TypeScript’s type system provides a layer of static checking that allows both humans and automated systems to validate the type consistency of a piece of code before executing it. When an AI agent generates code in plain JavaScript or unannotated Python, type errors only show up at runtime, meaning the sandboxing layer has to deal with unexpected failures reactively. With TypeScript, a significant portion of those issues can be caught before the code ever reaches the isolated execution environment.
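A small example shows the kind of mistake static checking catches. The Invoice type and formatTotal function are hypothetical, made up for this illustration. The `@ts-expect-error` directive is a standard TypeScript comment that asserts the next line fails type checking, which lets the example compile while still demonstrating the rejected line.

```typescript
interface Invoice {
  id: string;
  totalCents: number;
}

function formatTotal(invoice: Invoice): string {
  return `$${(invoice.totalCents / 100).toFixed(2)}`;
}

const goodInvoice: Invoice = { id: "inv_1", totalCents: 1999 };

// A typical generated-code slip: the wrong property name.
// The compiler rejects the next line, so the mistake can be caught before any sandbox runs it.
// @ts-expect-error 'total' is not a property of Invoice, and 'totalCents' is missing
const badInvoice: Invoice = { id: "inv_2", total: 1999 };

console.log(formatTotal(goodInvoice)); // $19.99
// Without the type check (for example in plain JavaScript) the same slip
// only surfaces at runtime, as a silently wrong value:
console.log(formatTotal(badInvoice)); // $NaN
```

In a pipeline that type-checks generated code first, the second object would never reach execution. The compiler error can be fed back to the model as a correction prompt instead.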
TypeScript consumes fewer tokens than OpenAPI
If we want our agent to do something useful, it needs to communicate with external APIs. The question is: how do we tell the agent about the APIs it can access? MCP defines schemas for simple tool calls, but not for programming APIs. OpenAPI offers a way to express REST APIs, but it’s verbose both in the schema and in the code needed to call it.
For APIs exposed to JavaScript, there’s one answer: TypeScript. A TypeScript interface describing a chat room API, for example, can be expressed concisely in just a few lines, while the equivalent OpenAPI spec is so long you need to scroll to see the whole thing. Fewer tokens mean lower inference costs and better comprehension by the model — for agents and humans alike.
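As an illustration of how compact such a description can be, here is a hypothetical chat room interface. The method names and the ChatMessage shape are assumptions made for this sketch, not an API published by Cloudflare. A minimal in-memory implementation is included so the interface can be exercised locally.

```typescript
// Hypothetical chat room API, written for illustration only.
type ChatMessage = { author: string; text: string; sentAt: number };

interface ChatRoom {
  // Returns up to `limit` of the most recent messages, oldest first.
  getHistory(limit: number): Promise<ChatMessage[]>;
  // Appends a message to the room.
  post(author: string, text: string): Promise<void>;
}

// Minimal in-memory implementation so the interface can be tried out.
class InMemoryChatRoom implements ChatRoom {
  private messages: ChatMessage[] = [];

  async getHistory(limit: number): Promise<ChatMessage[]> {
    return limit > 0 ? this.messages.slice(-limit) : [];
  }

  async post(author: string, text: string): Promise<void> {
    this.messages.push({ author, text, sentAt: Date.now() });
  }
}

const demoRoom: ChatRoom = new InMemoryChatRoom();
const lastTwo = demoRoom
  .post("ana", "hi")
  .then(() => demoRoom.post("bot", "hello"))
  .then(() => demoRoom.post("ana", "bye"))
  .then(() => demoRoom.getHistory(2));

lastTwo.then((history) => console.log(history.map((m) => m.text))); // [ 'hello', 'bye' ]
```

The interface itself is eight lines including comments. That is the part an agent would need to read in order to call the API.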
The Dynamic Worker Loader makes it easy to implement a TypeScript API in your own Worker and pass it to the Dynamic Worker as a method parameter or on the env object. The Workers runtime automatically sets up an RPC bridge between the sandbox and the host code, so the agent can invoke your API across the security boundary without even realizing it’s not using a local library.
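The idea of handing the sandbox a narrow API object can be simulated in plain TypeScript. This is only a simulation under stated assumptions: exposeOverBoundary, TicketService, and AgentEnv are hypothetical names, and the JSON round-trip merely imitates values crossing a boundary. It is not how the Workers runtime implements its RPC bridge.

```typescript
// Simulation only: forwards selected methods and copies values via JSON,
// standing in for an RPC bridge between host code and sandboxed code.
function exposeOverBoundary<T extends object, K extends keyof T & string>(
  target: T,
  methods: readonly K[],
): Pick<T, K> {
  const stub: Record<string, unknown> = {};
  for (const method of methods) {
    stub[method] = async (...args: unknown[]) => {
      // Copy arguments so no object references are shared across the boundary.
      const wireArgs = JSON.parse(JSON.stringify(args)) as unknown[];
      const fn = target[method] as unknown as (...a: unknown[]) => unknown;
      const result = await fn.apply(target, wireArgs);
      return result === undefined ? undefined : JSON.parse(JSON.stringify(result));
    };
  }
  return Object.freeze(stub) as unknown as Pick<T, K>;
}

// Hypothetical host-side service.
class TicketService {
  private nextId = 1;
  private titles = new Map<number, string>();

  async create(title: string): Promise<{ id: number }> {
    const id = this.nextId++;
    this.titles.set(id, title);
    return { id };
  }

  async count(): Promise<number> {
    return this.titles.size;
  }

  // Deliberately NOT exposed to the agent.
  async deleteAll(): Promise<void> {
    this.titles.clear();
  }
}

// What the agent-written code sees: only the methods the host chose to expose.
type AgentEnv = { TICKETS: Pick<TicketService, "create" | "count"> };

// Stand-in for agent-generated code; it calls the API like a local library.
async function agentCode(env: AgentEnv): Promise<number> {
  await env.TICKETS.create("Fix login bug");
  await env.TICKETS.create("Update docs");
  return env.TICKETS.count();
}

const hostService = new TicketService();
const agentEnv: AgentEnv = { TICKETS: exposeOverBoundary(hostService, ["create", "count"]) };
const ticketCount = agentCode(agentEnv);
ticketCount.then((n) => console.log(n)); // 2
```

The agent code never receives the TicketService instance itself, so deleteAll is unreachable from its side. That is the property a capability-style env object is meant to provide.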
HTTP filtering and credential injection
For those who prefer to provide HTTP APIs to agents, support is comprehensive. Using the globalOutbound option of the worker loader API, you can register a callback that gets invoked on every HTTP request. In that callback, you can inspect the request, rewrite it, inject authentication keys, respond directly, block it, or do whatever else is needed.
This enables credential injection: when the agent makes an HTTP request to a service that requires authorization, credentials are automatically added on the way out. This way, the agent never knows the secret credentials and therefore can’t leak them. That said, Cloudflare emphasizes that, absent a compatibility requirement, RPC interfaces in TypeScript are superior to HTTP because they require fewer tokens, are easier to restrict, and simpler to secure.
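The logic of such an outbound callback can be sketched as a pure function. OutboundRequest, OutboundDecision, filterOutbound, and the host and secret values are all hypothetical names introduced for this illustration. How the function would be wired into the globalOutbound option is not shown, because that depends on the actual API.

```typescript
// Simplified request shape so the sketch has no runtime dependencies.
interface OutboundRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
}

type OutboundDecision =
  | { action: "forward"; request: OutboundRequest }
  | { action: "block"; reason: string };

// Hypothetical host-side secret store keyed by hostname. Never passed into the sandbox.
const hostSecrets = new Map<string, string>([["api.example.com", "sk_test_placeholder"]]);

function filterOutbound(req: OutboundRequest): OutboundDecision {
  const parsed = new URL(req.url);
  if (parsed.protocol !== "https:") {
    return { action: "block", reason: "only https is allowed" };
  }
  const secret = hostSecrets.get(parsed.hostname);
  if (secret === undefined) {
    return { action: "block", reason: `host not allowlisted: ${parsed.hostname}` };
  }
  // Drop any Authorization header the sandboxed code tried to set, then add the real one.
  const headers: Record<string, string> = {};
  for (const key of Object.keys(req.headers)) {
    if (key.toLowerCase() !== "authorization") headers[key] = req.headers[key];
  }
  headers["Authorization"] = `Bearer ${secret}`;
  return { action: "forward", request: { ...req, headers } };
}

const forwarded = filterOutbound({
  url: "https://api.example.com/v1/charges",
  method: "POST",
  headers: { "Content-Type": "application/json", authorization: "Bearer made-up-by-agent" },
});
const blockedHost = filterOutbound({ url: "https://evil.example.net/steal", method: "GET", headers: {} });
const blockedScheme = filterOutbound({ url: "http://api.example.com/", method: "GET", headers: {} });

console.log(forwarded.action, blockedHost.action, blockedScheme.action); // forward block block
```

Because the secret is attached on the way out, the sandboxed code only ever sees its own request, never the credential.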
Battle-tested security
Securing an isolate-based sandbox isn’t trivial. While all sandboxing mechanisms have bugs, security flaws in V8 are more common than in typical hypervisors. When using isolates to sandbox potentially malicious code, additional layers of defense in depth are essential.
Cloudflare has nearly a decade of experience securing its isolate-based platform. The company’s systems apply V8 security patches in production within hours — faster than Chrome itself. The security architecture includes a custom second-layer sandbox with dynamic tenant isolation based on risk assessments. The company extended V8’s own sandbox to take advantage of hardware features like MPK, collaborated with researchers to develop innovative Spectre defenses, and has systems that scan code for malicious patterns, automatically blocking them or applying additional sandboxing layers.
When you use Dynamic Workers on Cloudflare, all of that security infrastructure comes for free. 🛡️
Helper libraries that make life easier
Cloudflare built a set of libraries to simplify working with Dynamic Workers, and each one is worth knowing about.
Code Mode SDK
The @cloudflare/codemode package simplifies executing AI model-generated code against tools using Dynamic Workers. At the center is DynamicWorkerExecutor(), which builds a tailored sandbox with code normalization to handle common formatting errors and direct access to a globalOutbound fetcher to control fetch behavior inside the sandbox.
The SDK also provides two server-side utility functions: one that wraps an existing MCP server by replacing its tool surface with a single code tool, and another that, given an OpenAPI spec and an executor, builds a complete MCP server with search and execution tools — better suited for larger APIs. In both cases, the model-generated code runs inside Dynamic Workers.
Bundling with @cloudflare/worker-bundler
Dynamic Workers expect pre-bundled modules. The @cloudflare/worker-bundler package handles this automatically: you provide source files and a package.json, and it resolves npm dependencies from the registry, bundles everything with esbuild, and returns the module map the Worker Loader expects. It also supports full-stack applications, bundling a server Worker, client-side JavaScript, and static assets together, with built-in asset serving that handles content types, ETags, and SPA routing.
File handling with @cloudflare/shell
The @cloudflare/shell package gives your agent a virtual file system inside a Dynamic Worker. The agent’s code calls typed methods on a state object, covering read, write, search, replace, diff, glob, JSON query and update, and archive operations, with structured inputs and outputs instead of string parsing.
Storage is backed by a durable Workspace built on SQLite and R2, so files persist between executions. Operations like multi-file searches and batch replacements minimize RPC round-trips. Batch writes are transactional by default: if any write fails, previous ones are automatically rolled back.
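The all-or-nothing behavior of a batch write can be illustrated with a tiny in-memory store. TransactionalFiles and its methods are hypothetical and are not the @cloudflare/shell API. The sketch only demonstrates the rollback semantics described above, using a snapshot copy rather than a real SQLite transaction.

```typescript
// Hypothetical in-memory store illustrating all-or-nothing batch writes.
class TransactionalFiles {
  private files = new Map<string, string>();

  read(path: string): string | undefined {
    return this.files.get(path);
  }

  writeBatch(entries: ReadonlyArray<{ path: string; content: string }>): void {
    const snapshot = new Map(this.files); // copy kept for rollback
    try {
      for (const entry of entries) {
        if (!entry.path.startsWith("/")) {
          throw new Error(`invalid path: ${entry.path}`);
        }
        this.files.set(entry.path, entry.content);
      }
    } catch (err) {
      this.files = snapshot; // undo every write from this batch
      throw err;
    }
  }
}

const fileStore = new TransactionalFiles();
fileStore.writeBatch([
  { path: "/a.txt", content: "one" },
  { path: "/b.txt", content: "two" },
]);

let batchFailed = false;
try {
  // The first write would succeed, the second is invalid, so both are rolled back.
  fileStore.writeBatch([
    { path: "/a.txt", content: "changed" },
    { path: "relative.txt", content: "oops" },
  ]);
} catch {
  batchFailed = true;
}

console.log(batchFailed, fileStore.read("/a.txt")); // true one
```

For an agent, this means a partially applied edit never has to be detected and repaired by hand. Either the whole batch lands or none of it does.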
Who’s already using it and how
Real-world use of the Dynamic Worker Loader is already happening across varied and pretty interesting scenarios.
Code Mode in production
Developers want their agents to write and execute code against tool APIs instead of making sequential tool calls one at a time. With Dynamic Workers, the LLM generates a single TypeScript function that chains multiple API calls, executes it in a Dynamic Worker, and returns the final result back to the agent. Only the result — not each intermediate step — goes into the context window. This reduces both latency and token usage and produces better results, especially when the tool surface is large.
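Here is a sketch of what such a generated function might look like. The CrmTools interface, the fake data, and totalOpenInvoicesForRegion are all hypothetical, invented to show the pattern. The call log makes visible how many API calls happen inside the sandbox compared with the single value that would return to the model.

```typescript
// Hypothetical tool API; names and data are made up for illustration.
interface CrmTools {
  listCustomers(): Promise<{ id: string; region: string }[]>;
  getOpenInvoices(customerId: string): Promise<{ amountCents: number }[]>;
}

// Fake backend that records every call it receives.
function makeFakeCrm(callLog: string[]): CrmTools {
  const customers = [
    { id: "c1", region: "EU" },
    { id: "c2", region: "US" },
    { id: "c3", region: "EU" },
  ];
  const invoices = new Map<string, { amountCents: number }[]>([
    ["c1", [{ amountCents: 1200 }, { amountCents: 800 }]],
    ["c2", [{ amountCents: 500 }]],
    ["c3", [{ amountCents: 3000 }]],
  ]);
  return {
    async listCustomers(): Promise<{ id: string; region: string }[]> {
      callLog.push("listCustomers");
      return customers;
    },
    async getOpenInvoices(customerId: string): Promise<{ amountCents: number }[]> {
      callLog.push(`getOpenInvoices:${customerId}`);
      return invoices.get(customerId) ?? [];
    },
  };
}

// The kind of single function an LLM might generate in Code Mode:
// several chained calls, one final result.
async function totalOpenInvoicesForRegion(tools: CrmTools, region: string): Promise<number> {
  const customers = await tools.listCustomers();
  let totalCents = 0;
  for (const customer of customers) {
    if (customer.region !== region) continue;
    const openInvoices = await tools.getOpenInvoices(customer.id);
    for (const invoice of openInvoices) totalCents += invoice.amountCents;
  }
  return totalCents;
}

const crmCallLog: string[] = [];
const euTotal = totalOpenInvoicesForRegion(makeFakeCrm(crmCallLog), "EU");
euTotal.then((cents) => console.log(cents, crmCallLog.length)); // 5000 3
```

With sequential tool calls, each of the three responses would pass through the context window. Here only the number 5000 does.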
Cloudflare’s own MCP server was built exactly this way: it exposes the entire Cloudflare API through just two tools — search and execute — in fewer than 1,000 tokens, because the agent writes code against a typed API instead of navigating hundreds of individual tool definitions.
Custom automations
Zite, for example, is building an app platform where users interact through a chat interface. The LLM writes TypeScript behind the scenes to build CRUD apps, connect to services like Stripe, Airtable, and Google Calendar, and run backend logic — all without the user ever seeing a line of code. Each automation runs in its own Dynamic Worker, with access only to the specific services and libraries that endpoint needs.
According to Antony Toron, CTO and co-founder of Zite, the company needed an execution layer that was instant, isolated, and secure — and Dynamic Workers hit all three requirements, outperforming every other platform they evaluated in speed and library support. Zite now processes millions of execution requests daily thanks to Dynamic Workers.
AI-generated applications
Developers are also building platforms that generate complete applications from AI, whether for their customers or for internal teams building prototypes. With Dynamic Workers, each app can be spun up on demand and then put into cold storage until it’s invoked again. Fast startup times make it easy to preview changes during active development, and platforms can block or intercept any network request the generated code makes.
How much does all of this cost
Dynamic Workers are billed at $0.002 per unique Worker loaded per day, on top of the usual CPU time and invocation pricing for regular Workers. For Code Mode use cases where each Worker is single-use, that means $0.002 per Worker loaded plus CPU and invocations — a cost that’s typically negligible compared to the inference costs of generating the code.
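A quick worked example of the loading fee alone, using the $0.002 figure quoted above. The function name and the workload sizes are assumptions for illustration, and CPU time and invocation charges are deliberately left out because their rates are not given here.

```typescript
// Loading fee from the article: $0.002 per unique Worker loaded per day.
// Expressed in tenths of a cent to keep the arithmetic in integers.
const LOAD_FEE_TENTHS_OF_CENT = 2;

// Daily loading fee in dollars. Excludes CPU time and invocation charges.
function dailyLoadFeeDollars(uniqueWorkersPerDay: number, feeWaived: boolean): number {
  if (feeWaived) return 0;
  return (uniqueWorkersPerDay * LOAD_FEE_TENTHS_OF_CENT) / 1000;
}

console.log(dailyLoadFeeDollars(50000, false)); // 100
console.log(dailyLoadFeeDollars(1000000, false)); // 2000
console.log(dailyLoadFeeDollars(1000000, true)); // 0
```

In the single-use case every execution counts as a unique Worker, so the fee grows linearly with execution volume. It is worth comparing that line against your own inference spend rather than assuming it is negligible at every scale.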
During the beta period, the $0.002 charge is waived. Since prices can change, it’s always worth checking the official Dynamic Workers pricing documentation for the latest info.
What this actually means for anyone building with AI
Speaking very concretely: if you’re building any product where an Artificial Intelligence agent needs to execute code as part of its workflow — whether it’s a data analysis assistant, an automation tool, a development copilot, or any similar application — the way you solve the sandboxing problem will directly impact your user experience and your infrastructure costs.
The Dynamic Worker Loader approach opens up a third path that didn’t exist in an accessible way before: sandboxing that’s fast, secure, and priced in proportion to actual usage, without the need to maintain complex container infrastructure. For startups and small teams, this is especially relevant because it removes a significant technical barrier that previously required specialized engineers or expensive third-party solutions. For larger companies, the performance gains and cost reduction at scale can represent a real competitive advantage, especially in products where code execution speed is part of the core value proposition.
It’s also important to mention that this isn’t a magic solution that fixes every security problem related to executing code generated by Artificial Intelligence. Sandboxing solves execution isolation, but validating the code before execution, setting limits on access to external resources, and defining policies for what can and can’t be executed still need to be figured out by the team building the product. The Dynamic Worker Loader provides the infrastructure, but the complete security architecture still depends on design decisions that go beyond choosing an execution platform. What changes is that now the hardest part — fast and reliable isolation — is already solved in a pretty elegant way. ⚡
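As one possible shape for the policy layer that remains the team's responsibility, here is a minimal pre-execution check. ExecutionPolicy, CodeSubmission, checkSubmission, and the example patterns are hypothetical. The textual pattern checks in particular are naive and easy to evade, so they should be read as a first filter in front of the sandbox, not a substitute for it.

```typescript
interface ExecutionPolicy {
  maxSourceChars: number;
  allowedHosts: string[];
  forbiddenPatterns: RegExp[]; // naive textual checks, easy to evade on their own
}

interface CodeSubmission {
  source: string;
  requestedHosts: string[];
}

// Returns human-readable violations; an empty list means "allowed to run".
function checkSubmission(submission: CodeSubmission, policy: ExecutionPolicy): string[] {
  const violations: string[] = [];
  if (submission.source.length > policy.maxSourceChars) {
    violations.push(`source too large: ${submission.source.length} chars`);
  }
  for (const host of submission.requestedHosts) {
    if (policy.allowedHosts.indexOf(host) === -1) {
      violations.push(`host not allowed: ${host}`);
    }
  }
  for (const pattern of policy.forbiddenPatterns) {
    if (pattern.test(submission.source)) {
      violations.push(`forbidden pattern: ${pattern.source}`);
    }
  }
  return violations;
}

const examplePolicy: ExecutionPolicy = {
  maxSourceChars: 10000,
  allowedHosts: ["api.example.com"],
  forbiddenPatterns: [/\beval\s*\(/, /\bnew\s+Function\b/],
};

const okResult = checkSubmission(
  { source: "export default { run() { return 1 + 1; } }", requestedHosts: ["api.example.com"] },
  examplePolicy,
);
const badResult = checkSubmission(
  { source: "eval('2 + 2')", requestedHosts: ["evil.example.net"] },
  examplePolicy,
);

console.log(okResult.length, badResult.length); // 0 2
```

Returning a list of violations instead of a boolean makes it possible to send the reasons back to the agent, so it can regenerate code that fits the policy.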
The combination of TypeScript, V8 isolate-based sandboxing, and the Dynamic Worker Loader represents one of the most practical and immediate evolutions for anyone building with Artificial Intelligence today — and it’s well worth keeping an eye on how this ecosystem develops over the coming months.
