AI agents, security, and the new digital battleground
AI agents stopped being a sci-fi concept a long time ago. Today they clean up inboxes, manage calendars, browse the web, run terminal commands, and even write their own plugins. Tools like OpenClaw, an open source personal AI assistant that earned nicknames ranging from Jarvis to a portal into a new reality, really show the scale of this phenomenon. The idea behind it is elegant: an AI that lives on your machine or in the cloud, chats with you through WhatsApp or Telegram, cleans your inbox, manages your schedule, browses the web, runs shell commands, and even creates its own plugins. People are using it to check in for flights, build entire websites from their phones, and automate tasks that used to seem impossible.
It is impressive, truly. But the more autonomy an agent has, the larger the attack surface it carries with it.
What happens when someone tricks that agent into accessing files it should never see? What if a malicious webpage rewrites its instructions mid-task? What if, in a multi-agent chain, one agent passes corrupted data to another that blindly trusts the information it receives? These questions are not hypothetical. They are already happening in production, and most teams still are not prepared to deal with them.
That is exactly what the GitHub team had in mind when they launched Season 4 of the Secure Code Game, a free and interactive experience that puts you in the role of someone attacking a deliberately vulnerable AI agent so you can learn, hands-on, how to defend one. And the best part: you do not need to know how to code to play. 🎮
What the Secure Code Game is and how it got here
The Secure Code Game is a free, open source course that runs directly in your code editor, where players explore and fix intentionally vulnerable code. The approach is simple and pretty straightforward: instead of sitting through a boring lecture on security or reading a technical document that nobody actually finishes, you learn by identifying and exploiting real vulnerabilities in a controlled environment.
When the first season was created in March 2023, the goal was clear: build security training that developers would actually want to do. Fix the vulnerable code, keep it functional, level up. That core philosophy has not changed across any season.
Season 2 expanded the game to multi-stack challenges with community contributions in JavaScript, Python, Go, and GitHub Actions. Season 3 took players into LLM security territory, where they learned how to hack and then harden large language models. Along the way, more than 10,000 developers from industry, open source, and academia have played to sharpen their skills.
What changed each season was the scenario. When Season 1 launched, AI-powered code assistants were just starting to become popular. By Season 3, players were already learning to craft malicious prompts and then defend against them. Now, with Season 4, the focus is on the security challenges of AI systems that act autonomously: browsing the web, calling APIs, coordinating with other agents, and acting on behalf of the user.
The core idea behind the game remains that the best way to learn how to defend a system is by understanding how it can be attacked. This is nothing new in the world of cybersecurity — pentesters and red teams have been working this way for decades. What is new here is applying that mindset to the context of AI agents, where the attack vectors are completely different from what we are used to. We are not talking about classic SQL injection or buffer overflow. We are talking about attacks that exploit how an agent interprets natural language, follows instructions, and interacts with external tools.
Another point worth highlighting is how accessible the initiative is. GitHub made the game open to anyone, at no cost, and without requiring a heavy technical background. This is strategic and very smart, because the security problem in systems with AI agents is not just an engineering problem. Product managers, designers, business analysts, and even technology leaders need to understand what is at stake when an autonomous agent has access to sensitive data, internal systems, and irreversible actions.
Why AI agent security matters now, not two years from now
The timing of Season 4 is no coincidence. AI agents went from research prototypes to production tools at a staggering pace, and the security community is scrambling to keep up.
The OWASP Top 10 for Agentic Applications 2026, developed with contributions from over 100 security researchers, now catalogs risks like agent goal hijacking, tool misuse, identity abuse, and memory poisoning as critical threats. A Dark Reading survey found that 48% of cybersecurity professionals believe agentic AI will be the leading attack vector by the end of 2026. And the Cisco State of AI Security 2026 report revealed that while 83% of organizations planned to deploy agentic AI capabilities, only 29% felt prepared to do so securely.
The gap between adoption and readiness is exactly the territory where vulnerabilities thrive. And the best way to close that gap is by learning to think like an attacker.
There is a very common tendency in the market to treat security as a step that comes later. First you ship, then you protect. With traditional software, that approach is already risky. With AI agents, it is practically a recipe for serious incidents. The difference lies in the nature of the system: an autonomous agent makes decisions in real time, based on constantly shifting contexts, and often executes actions that cannot be easily undone. Deleting a file, sending an email, making a request to an external API. The space between a malicious instruction and real damage can be measured in milliseconds.
Meet ProdBot, Season 4’s deliberately vulnerable AI assistant
Season 4 puts you inside ProdBot, an AI-powered productivity bot and code assistant that was built to be vulnerable on purpose. Inspired by tools like OpenClaw and GitHub Copilot CLI, ProdBot turns natural language into bash commands, navigates a simulated web, connects to MCP (Model Context Protocol) servers, runs organization-approved skills, stores persistent memory, and orchestrates multi-agent workflows.
Your mission across five progressive levels is simple: use natural language to get ProdBot to reveal a secret it should never expose. If you manage to read the contents of the password.txt file, you have found a security vulnerability.
No AI or programming experience is needed. Just curiosity and a willingness to experiment. Everything happens through natural language in the terminal.
Five levels, five upgrades, five vulnerabilities
Each level of the game mirrors a stage in how real AI-powered tools evolve. As ProdBot gains new capabilities, each upgrade opens a new attack surface for you to discover. Here is how ProdBot grows throughout the experience:
- Level 1 starts with the basics: ProdBot generates and executes bash commands inside an isolated workspace (sandbox). Can you escape that sandbox?
- Level 2 gives ProdBot web access. It can now browse a simulated internet with news, finance, sports, and shopping sites. What could go wrong when an AI reads content from untrusted sources?
- Level 3 connects ProdBot to MCP servers, external tool providers for stock quotes, web browsing, and cloud backup. More tools, more power, more entry points.
- Level 4 adds organization-approved skills and persistent memory. ProdBot can now run pre-built automation plugins and remember your preferences across sessions. Trust is built in layers, but has it truly been earned?
- Level 5 is everything at once: six specialized agents, three MCP servers, three skills, and a simulated web for an open source project. The platform claims all agents are sandboxed and all data is pre-verified. Time to put that to the test.
Each level builds on the one before it, and that progression is exactly the point. The game will not reveal upfront which vulnerabilities you will find in each phase, because that would spoil the fun. But here is the thing: the attack patterns you will discover in Season 4 are not theoretical. They reflect the kinds of risks that security teams are dealing with right now as organizations put autonomous AI systems into production.
To get a sense of how serious this is, consider CVE-2026-25253 (CVSS 8.8 – High), known as ClawBleed: a one-click remote code execution (RCE) vulnerability that allowed attackers to steal authentication tokens through a malicious link and gain full control over an OpenClaw instance.
The goal of the game is not just to learn a specific exploit. It is to build the instinct that helps you spot these patterns in the real world, whether you are reviewing an agent’s architecture, auditing a tool integration, or simply deciding how much autonomy to give the AI assistant that just joined your team.
Prompt injection and the vulnerabilities the game will teach you to see
The star of Season 4 is, without a doubt, prompt injection. If you have not heard of this type of attack yet, here is the quick rundown: an AI agent receives instructions from its creator, right? But when that agent goes out browsing the web, reading emails, opening documents, or processing external data, it can encounter malicious content that tries to overwrite or supplement those original instructions. It is like someone hiding a note inside a PDF telling the agent to ignore everything it was programmed to do and do something completely different instead. Without proper controls, the agent may simply obey.
What makes this type of vulnerability so dangerous is that it does not depend on a flaw in the code. It exploits the expected behavior of the language model. A technically well-built AI agent can still be compromised by prompt injection if there are no layers of validation and isolation between system instructions and the content the agent consumes while carrying out tasks. And it is exactly that gap, between building something that works and building something that works securely, that the Secure Code Game puts right in front of you in a practical and tangible way.
Beyond prompt injection, Season 4 also covers other critical scenarios, such as privilege escalation, where the agent ends up with access to resources beyond what it should have, and data leaks through misconfigured tool calls. These scenarios reflect situations real teams have already faced in production, with AI agents integrated into tools like Slack, Google Drive, GitHub, and internal systems. The game experience was designed so that each phase reveals a new layer of complexity. You will not walk away with simple answers. You will walk away with the right questions, which is exactly what good security training should spark.
How GitHub structured the learning experience
The structure of the Secure Code Game on GitHub is well thought out for people who learn by doing. Each level presents a scenario with a vulnerable AI agent, and your goal is first to understand how the vulnerability works, then exploit it the way an attacker would, and finally apply the proper fix. This attack-defense cycle is the heart of the method, and it makes all the difference compared to purely theoretical content. When you see firsthand how a prompt injection works, how the agent changes its behavior based on an injected instruction, it is really hard to forget. The muscle memory of someone who has attacked once is far stronger than that of someone who only read about it.
The entire experience runs on GitHub Codespaces. There is nothing to install, nothing to configure, and it does not cost a dime since Codespaces offers up to 60 hours of free usage per month. You can be inside the ProdBot terminal in less than two minutes, and each season is independent, so you can jump straight into Season 4 without having gone through the earlier ones.
Season 3 can serve as a useful foundation since it builds the basics of AI security and takes about 1.5 hours to complete. But it is not required. Just bring your hacker mindset.
Season 4 also uses GitHub Models, which has usage limits. If you hit a limit, just wait for it to reset and pick up where you left off.
The season also comes with well-crafted supporting documentation, including explanations of every security concept covered, references to frameworks like the OWASP Top 10 for LLM Applications, and links to recent research on vulnerabilities in AI agents. The material is not a substitute for a full offensive security education, but it is a solid and up-to-date starting point for anyone who wants to understand the current landscape. 🔐
Frequently asked questions about Season 4
Do I need AI or programming experience to play?
No. Everything happens through natural language in the terminal. You type prompts in English, Spanish, or any other language, and ProdBot responds. Curiosity and a willingness to experiment are all you need.
Do I need to complete the previous seasons first?
No. Each season is independent. You can go straight to Season 4 by launching ProdBot and typing the level you want. That said, Season 3 builds a useful foundation in AI security and takes about 1.5 hours.
How long does Season 4 take?
Roughly two hours, though that varies depending on how deep you explore each level. Some players like to test multiple approaches per phase.
Is it free?
Yes. The Secure Code Game is open source and free. It runs on GitHub Codespaces, which offers up to 60 hours of free usage per month.
Why tech teams need to take this seriously right now
What GitHub is doing with the Secure Code Game goes beyond an educational exercise. It is a clear signal that the industry needs to normalize the conversation around security in systems with AI agents well before those systems hit production. Teams that are already building or planning to build autonomous agents, whether for internal automation, customer support, data analysis, or any other use case, need to include threat modeling specific to this type of system in their development process. And to do that well, the people involved need to understand the attack vectors, not just in theory, but in practice.
The good news is that it has never been easier to get started. The Secure Code Game from GitHub is available for free, accessible to a wide range of professional profiles, and addresses exactly the vulnerabilities at the center of the most relevant discussions about AI security today. Whether you are an engineer already building AI agents, a tech lead looking to understand the risks before approving a new integration, or simply someone curious about how this world works under the hood, Season 4 has something valuable to offer. The knowledge is right there, organized, hands-on, and free. 🚀
