Mistral Medium 3.5 and remote coding agents: the cloud takes on the heavy lifting
Mistral Medium 3.5 has launched alongside a feature that could seriously change the daily routine of anyone working in software development.
Mistral AI just announced remote coding agents, and the pitch is pretty straightforward: take the processing off your local machine and move everything to the cloud. If you have ever lost time waiting for an agent to finish a task in your terminal, your entire workflow frozen in the meantime, this update was built to solve exactly that problem.
The core idea is simple: coding sessions now run independently, in parallel, and notify you when they are done. You can kick everything off through the Mistral Vibe CLI or directly from Le Chat, without ever leaving the conversation. And the engine powering all of this is Mistral Medium 3.5 itself, a dense model with 128 billion parameters and a 256,000-token context window, built to handle long, complex tasks without losing the thread.
Beyond the remote agents, Mistral is also launching a new Work mode in Le Chat, which further expands what you can do with the assistant on multi-step tasks. 🚀 In the sections ahead, we will break down each of these new features, how they work in practice, and what they mean for the day-to-day of anyone building software or focused on productivity.
What remote coding agents are and why they matter
Anyone who has used AI agents to automate development tasks knows that one of the biggest bottlenecks is local processing. Your computer becomes hostage to the agent while it works, and anything else you need to do in parallel takes a hit. Remote coding agents are here to end that dependency, moving all execution to cloud servers and freeing up your machine to keep running normally while the heavy lifting happens somewhere else.
In practice, the workflow changes significantly. You fire off a task, whether it is a code refactor, creating automated tests, or even building an entire module, and the agent starts working asynchronously without locking up anything on your end. When the session wraps up, you get a notification and can review what was done. This is especially valuable for development teams that need speed and cannot afford to wait for a local process to chew through resources for minutes on end, dragging down the entire team’s productivity.
Cloud-based execution also makes it far easier to do something that was impractical on a single local machine: running multiple sessions in parallel. Imagine kicking off three different refactors at the same time, each one running independently without interfering with the others. This is more than a convenience: it is a real shift in how AI-assisted development can fit into the daily routine of teams working with short sprints and frequent releases. The time you used to spend waiting becomes time you can spend reviewing, planning, or tackling another task entirely.
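To make the "fire off several tasks, get notified as each one finishes" pattern concrete, here is a minimal sketch using Python's asyncio. It does not call any Mistral service: `run_remote_session` is a hypothetical stand-in that just sleeps, and the session names and durations are made up for illustration.

```python
import asyncio


async def run_remote_session(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a remote session; it sleeps instead of doing real work."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_all(sessions: dict[str, float]) -> list[str]:
    """Start every session at once and collect notifications in completion order."""
    pending = [
        asyncio.create_task(run_remote_session(name, seconds))
        for name, seconds in sessions.items()
    ]
    notifications = []
    # as_completed yields each awaitable as soon as its session finishes.
    for finished in asyncio.as_completed(pending):
        notifications.append(await finished)
    return notifications


if __name__ == "__main__":
    demo = {"refactor-auth": 0.3, "refactor-billing": 0.1, "refactor-search": 0.2}
    for note in asyncio.run(run_all(demo)):
        print(note)
```

The caller is never blocked on any single session: the three run concurrently, and the shortest one reports back first regardless of the order in which they were started.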
How the remote agent workflow works
The mechanics behind remote agents are designed to be transparent and controllable. While sessions run in the cloud, you can inspect what the agent is doing in real time, viewing file diffs, tool calls, progress states, and even questions the agent might ask during execution. It is not a black box: you maintain full visibility over every step of the process.
One interesting detail is that local sessions started through the CLI can be teleported to the cloud. This means that if you started working in the terminal and need to step away, you do not lose anything. The session history, current task state, and pending approvals are transferred to the remote infrastructure, and the agent picks up right where it left off. When it is ready, it opens a pull request on GitHub and notifies you. You review the final result, not every keystroke along the way.
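As a rough mental model of what "teleporting" a session involves, the sketch below serializes the three pieces of state mentioned above (history, task state, pending approvals) and restores them elsewhere. The `SessionSnapshot` class, its field names, and the JSON format are assumptions made for illustration; the article does not describe how Mistral Vibe actually stores or transfers sessions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SessionSnapshot:
    """Hypothetical container for the state the article says is transferred."""
    history: list[str] = field(default_factory=list)
    task_state: str = "idle"
    pending_approvals: list[str] = field(default_factory=list)


def export_snapshot(snapshot: SessionSnapshot) -> str:
    """Serialize the local session so it can be shipped to another machine."""
    return json.dumps(asdict(snapshot))


def import_snapshot(payload: str) -> SessionSnapshot:
    """Rebuild the session on the receiving side from the serialized payload."""
    return SessionSnapshot(**json.loads(payload))


if __name__ == "__main__":
    local = SessionSnapshot(
        history=["user: refactor the auth module", "agent: reading auth/ files"],
        task_state="editing",
        pending_approvals=["install dependency"],
    )
    remote = import_snapshot(export_snapshot(local))
    print(remote == local)  # the restored session matches the original
```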
Each coding session runs in an isolated sandbox, which means broad edits and dependency installations happen without affecting other processes or environments. This isolation is critical for security and predictability, especially in enterprise settings where multiple developers might be spinning up agents at the same time.
Integrations that make a real difference day to day
Mistral Vibe does not operate in a vacuum. It connects with the tools engineering teams already use every day, keeping a human in the loop where it matters. The integrations include:
- GitHub for code and pull requests
- Linear and Jira for issue management
- Sentry for incident monitoring
- Slack and Teams for notifications and reports
This web of integrations turns the remote agent into something far more than a code generator. It becomes an active participant in the team’s workflow, capable of picking up an issue, investigating the problem, proposing a fix, and opening the corresponding PR, all autonomously and fully traceable. The kind of work that fits well into this model includes module refactors, test generation, dependency updates, CI failure investigations, and well-defined bug fixes. 🔧
The model behind it all: Mistral Medium 3.5
Mistral Medium 3.5 is the engine powering all of these capabilities, and understanding what it is helps put this update into perspective. We are talking about a dense model with 128 billion parameters and a 256,000-token context window, which means it can process and maintain coherence across documents, codebases, and extremely long conversations without losing track. For anyone working on medium to large-scale software projects, this makes a huge difference because the agent can understand the full context of what is being done before it acts.
Dense models use all of their parameters for every token they process, whereas mixture-of-experts models activate only a subset of them. This tends to result in more consistent responses and better reasoning on tasks that demand attention to detail, which is exactly what coding requires. Mistral positioned this model specifically to handle complex multi-step tasks where the agent needs to make chained decisions and maintain context throughout the entire execution. It is a very different profile from smaller, faster models that are great for simple answers but start stumbling when a task requires deeper reasoning.
Performance and benchmarks
The benchmark numbers help put Mistral Medium 3.5 in perspective. The model achieved 77.6% on SWE-Bench Verified, coming in ahead of Devstral 2 and models like Qwen3.5 397B A17B. On agentic capabilities, the results are equally impressive, with 91.4 on τ³-Telecom, a benchmark designed to evaluate model performance on autonomous, complex tasks.
Another relevant aspect is that the model’s reasoning effort is configurable per request. This means the same Mistral Medium 3.5 can answer a quick chat question without burning unnecessary compute and, on the very next call, operate with deep reasoning to solve a complex agentic task. This flexibility matters both for optimizing costs and for adapting the model’s behavior to the type of task at hand.
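One way an application might take advantage of per-request effort is to pick a level based on the kind of task before building the request. The sketch below is purely illustrative: the `reasoning_effort` field name, the `"low"`/`"high"` values, and the task categories are assumptions, not the documented Mistral API, so check the official API reference for the real parameter.

```python
from typing import Literal

Effort = Literal["low", "high"]

# Hypothetical task categories that would justify deeper reasoning.
AGENTIC_TASKS = {"refactor", "bug_fix", "ci_investigation"}


def pick_effort(task_kind: str) -> Effort:
    """Use high effort for agentic work and low effort for everything else."""
    return "high" if task_kind in AGENTIC_TASKS else "low"


def build_request(prompt: str, task_kind: str) -> dict:
    """Build a request payload; 'reasoning_effort' is an assumed field name."""
    return {"prompt": prompt, "reasoning_effort": pick_effort(task_kind)}


if __name__ == "__main__":
    print(build_request("What does HTTP 418 mean?", "chat"))
    print(build_request("Fix the flaky login test", "bug_fix"))
```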
The vision encoder was also trained from scratch to handle images of varying sizes and aspect ratios, which expands the possibilities beyond pure text. For anyone working with interfaces, architecture diagrams, or visual documentation, this can be quite useful in workflows that combine image analysis with code generation.
Availability and pricing
Mistral Medium 3.5 is already available as the default model on Le Chat and in the Mistral Vibe CLI, replacing Devstral 2 as the coding agent backbone. The model was released as open weights under a modified MIT license, with weights published on Hugging Face. This allows it to be self-hosted with just four GPUs, making local deployment viable for organizations that need to keep data and processing within their own infrastructure.
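A quick back-of-the-envelope calculation shows what the four-GPU figure implies for memory. The sketch below only counts the weights of a 128-billion-parameter dense model at a few common precisions; it ignores the KV cache, activations, and runtime overhead, and the article does not say which precision the four-GPU claim assumes.

```python
PARAMS = 128_000_000_000  # 128 billion parameters, as stated in the article
GPUS = 4

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weights_gb(precision: str) -> float:
    """Total size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * BYTES_PER_PARAM[precision] / 1e9


def per_gpu_gb(precision: str, gpus: int = GPUS) -> float:
    """Weights per GPU if they are split evenly across all GPUs."""
    return weights_gb(precision) / gpus


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weights_gb(precision):.0f} GB total, "
              f"{per_gpu_gb(precision):.0f} GB per GPU")
```

At 16-bit precision the weights alone come to 256 GB, or 64 GB per GPU across four cards, so a setup like this needs high-memory accelerators or a lower-precision format once cache and activation memory are added on top.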
Through the API, pricing is set at $1.50 per million input tokens and $7.50 per million output tokens. It is also available for prototyping on NVIDIA GPU-accelerated endpoints at build.nvidia.com and as a containerized inference microservice via NVIDIA NIM. 💰
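To get a feel for what those rates mean per task, here is a small cost estimator. The two prices come from the paragraph above; the token counts in the example are made up, and real agent sessions may consume far more tokens across many calls.

```python
INPUT_USD_PER_MILLION = 1.5   # price per million input tokens, from the article
OUTPUT_USD_PER_MILLION = 7.5  # price per million output tokens, from the article


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for a given number of tokens."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical call: 200,000 tokens of context in, 20,000 tokens out.
    print(f"${estimate_cost_usd(200_000, 20_000):.2f}")
```

For the hypothetical call in the example, input costs $0.30 and output costs $0.15, for a total of $0.45. Output tokens are five times more expensive than input tokens, so long generated diffs weigh more on the bill than large prompts.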
Le Chat and Work mode: productivity on another level
Le Chat has always been Mistral’s conversational assistant, but with the launch of Work mode, it gains a whole new layer of functionality that goes way beyond answering questions or generating code snippets. Work mode transforms Le Chat into an environment where multi-step tasks can be planned, executed, and tracked within the conversation interface itself, without needing to switch between different tools or open separate terminals.
In practice, you describe what you need done in plain language, and Le Chat, powered by Mistral Medium 3.5 under the hood, orchestrates the necessary steps, calls remote agents when needed, processes the results, and presents you with a summary of what was accomplished. It is a workflow much closer to how people naturally think about work: you define the goal, not every micro-step of the process. This reduces cognitive load and makes using AI much more accessible even for people who do not have deep technical experience with prompts and agent configurations.
What Work mode lets you do today
Work mode launches with a well-defined set of capabilities that show the kind of problems it was designed to solve:
- Cross-tool workflows: consolidate information from email, messages, and calendar in a single execution, or prepare meeting context with participant data, relevant news, and agenda items pulled from your sources.
- Research and synthesis: dive deep into a topic by cross-referencing information from the web, internal documents, and connected tools, then generate a structured report that can be edited before exporting or sending.
- Triage and actions: organize your inbox, draft replies, create Jira issues from discussions with the team or clients, and send summaries via Slack.
Work mode sessions persist longer than a typical chat response. This allows the agent to keep working across multiple steps, going through trial and error until it completes what was requested. Connectors stay active by default instead of being manually selected, giving the agent access to documents, email inboxes, calendars, and other systems to get the context it needs to take the right actions.
Transparency is also a central concern. Every agent action is visible: you see each tool call and the logic behind the decisions. And before executing sensitive tasks, like sending a message, writing a document, or modifying data, Le Chat asks for explicit approval based on your permissions. 🔒
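The approval step described above follows a common human-in-the-loop pattern: sensitive actions pass through a gate, everything else runs directly. The sketch below illustrates that pattern in general terms. The action names, the `run_action` function, and the callback shapes are all hypothetical and say nothing about how Le Chat implements it internally.

```python
from typing import Callable

# Hypothetical labels for the sensitive actions the article lists.
SENSITIVE_ACTIONS = {"send_message", "write_document", "modify_data"}


def run_action(
    action: str,
    execute: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run an action, asking for approval first if it is marked sensitive."""
    if action in SENSITIVE_ACTIONS and not approve(action):
        return f"{action}: skipped (approval denied)"
    return execute()


if __name__ == "__main__":
    deny_all = lambda action: False  # simulates a user who rejects every request
    print(run_action("search_docs", lambda: "found 3 documents", deny_all))
    print(run_action("send_message", lambda: "message sent", deny_all))
```

In the example, the read-only search runs even though the user denies everything, while the message is never sent because the gate blocks it before `execute` is called.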
The architecture connecting it all: Workflows and Mistral Studio
One aspect worth highlighting is how Mistral connected all of these pieces together. The company uses Workflows orchestrated in Mistral Studio to integrate Mistral Vibe with Le Chat. This infrastructure was originally built for Mistral’s own internal development environment, then expanded to enterprise customers, and is now being opened up to the general public.
This means the same technology that companies were already using in corporate environments is now accessible to anyone who wants to kick off coding tasks from the web. And without being tied to a local terminal, a developer can run multiple sessions in parallel much more easily. Sessions started in Le Chat use the same remote runtime as the CLI and web interface, ensuring consistency regardless of how the task was initiated.
What actually changes for software developers
For anyone in the trenches of day-to-day development, the changes this set of updates brings are pretty concrete. The main win is the elimination of workflow blocking. Today, when you use a local agent for a time-consuming task, you have to wait. With remote agents running in the cloud, that wait stops being a blocker and turns into time you can use. You start the task, keep working on something else, and when the agent finishes, you come back to review. This asynchronous cycle is much more aligned with how software development already works in terms of code reviews, pull requests, and CI/CD pipelines.
Another relevant point is process scalability. Larger teams can benefit significantly from the ability to run parallel sessions without competing for local resources. A developer can fire off multiple tasks at the same time, and each one runs independently on Mistral’s infrastructure, with no impact on the local machine and no interference between sessions. This opens up possibilities for more sophisticated workflows where different parts of a project can be worked on simultaneously by different agents, all coordinated by the same model and the same interface.
And beyond the code itself, Work mode in Le Chat has the potential to impact tasks adjacent to development as well, like documentation, requirements analysis, test generation, and spec reviews. All of these are time-consuming tasks that can be delegated to an agent capable of keeping a large share of the project context in view, thanks to Mistral Medium 3.5’s 256,000-token context window. The end result is an assistant that can work from much more of the project, not just the snippet you pasted into the conversation. That is a real qualitative leap.
Available plans and how to get started
Remote coding agents and Le Chat’s Work mode are available on Mistral’s Pro, Team, and Enterprise plans. Mistral Medium 3.5 is already the default model in both Le Chat and the Mistral Vibe CLI, so anyone already using these tools will notice the change automatically.
For those who want to explore the model on a more technical level, the open weights on Hugging Face and the option to self-host with four GPUs make Mistral Medium 3.5 a competitive choice for both individual experimentation and deployment in enterprise environments that require full control over infrastructure. The combination of strong benchmark performance, a generous context window, and accessible API pricing positions this model as a serious contender in the large language model market. ✅
