AI Agents: Optimization and Efficiency in the Cloud

AI agents are everywhere right now, and more and more applications rely on them to autonomously carry out complex tasks. They analyze videos, process data, generate code, coordinate other systems, and do all of this by chaining together multiple models and external tools to solve problems that would be too difficult for a single model to tackle alone. It is impressive, no doubt, but all that autonomy comes at a cost that does not always make the headlines.

There is a problem growing alongside all this popularity: the more sophisticated these systems become, the more energy, processing power, and money they consume, often unnecessarily. The way these highly fragmented systems are designed and deployed tends to create inefficiencies that lead to wasted computation, extra energy spending, and higher costs. Multiply that by thousands of requests per day, and the waste starts to add up, both financially and in terms of environmental impact.

That is where a recent piece of research comes in. Researchers from MIT and Microsoft developed a system called Murakkab, an Urdu word meaning “composition of things,” built to simplify the construction of agentic workflows and to automatically optimize how they run in cloud computing environments.

The core idea is easy to understand:

Instead of manually configuring every technical detail of an agent system, the developer describes in plain language what the application should do.
There is no need to specify all the details upfront.
Murakkab handles the rest automatically, choosing the best available models, tools, and hardware configurations.
It then adjusts everything in real time based on each user’s priorities, such as minimizing cost or maximizing speed.

In tests, Murakkab cut energy consumption and operating costs to roughly a quarter of what other approaches required, without compromising the quality of the results. That alone would be reason enough to pay attention to this research, but what makes Murakkab truly interesting is how it solves a problem that most AI agent platforms still completely overlook. 👇

The real problem behind agentic workflows

When you start building systems with AI agents, you quickly realize how fast the complexity grows. An agentic workflow is a system composed of multiple autonomous agents that collaborate using different models and tools, like databases or Python programs, to dynamically complete a multi-step task. It could be data processing, code generation, or an application that analyzes a video and answers questions about it. These workflows typically operate behind the scenes, powering applications that end users interact with without ever realizing all the engineering going on underneath.

The core problem is that, in most current systems, developers have to lock in all the technical choices in the code from the very start. They need to decide which agents, models, and tools to use, in what order, what hardware will run the workflow, and how to balance trade-offs like speed versus cost. This is especially complicated because agentic workflows bring together multiple black-box models and very different tools, each with its own set of configuration options, often provided by entirely different companies.

And there is a detail that makes everything even harder: if a new AI model is released that could improve the accuracy or efficiency of the application, the developer would essentially need to start over to implement it. As researcher Gohar Chaudhry, a graduate student in electrical engineering and computer science at MIT and lead author of the study, explains, even if someone wanted to do all that configuration manually, they would have a very hard time reaching the optimal result because the space of possible configurations is simply enormous.
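To get a feel for why manual tuning breaks down, the sketch below counts the configurations of a small three-stage workflow. The stage names, the knobs, and the number of options per knob are illustrative assumptions, not figures from the study.

```python
from math import prod

# Hypothetical number of choices per knob, for each stage of a small workflow.
# These counts are illustrative assumptions, not values from the Murakkab study.
stage_options = {
    "frame_extraction": {"model": 4, "gpu_type": 3, "batch_size": 5},
    "transcription": {"model": 5, "gpu_type": 3, "batch_size": 5},
    "question_answering": {"model": 8, "gpu_type": 3, "parallelism": 4},
}


def count_configurations(options: dict[str, dict[str, int]]) -> int:
    """Multiply the number of choices across every knob of every stage."""
    return prod(prod(knobs.values()) for knobs in options.values())


total_configurations = count_configurations(stage_options)
print(f"{total_configurations:,} possible configurations")
```

Even with only three stages and a handful of choices per knob, the count reaches the hundreds of thousands, and every new model or accelerator multiplies it again.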

To make matters worse, the cloud data center deploying the application for customers cannot see inside the workflow to allocate hardware resources in the most efficient way at the exact moment of a user request. This is precisely the set of problems Murakkab was designed to address, optimizing the entire process end to end.

How Murakkab works in practice

Murakkab operates as an optimization layer that sits between the application and the cloud infrastructure. First, it allows the developer to create an agentic workflow by simply describing the application’s intent in high-level terms, without needing to detail how each component should be combined. For example, someone could describe a video question-and-answer application that extracts key frames, generates a transcript, and then answers user queries about that video.
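As a rough picture of what “describing intent” could look like, here is a minimal sketch of a declarative specification for the video question-and-answer example. The article does not show Murakkab’s actual interface, so every class and field name below is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Constraints:
    """User priorities. Field names are assumptions, not Murakkab's schema."""
    objective: str = "min_cost"  # e.g. "min_cost", "min_energy", "max_accuracy"
    max_latency_s: Optional[float] = None
    min_accuracy: Optional[float] = None


@dataclass(frozen=True)
class WorkflowIntent:
    """High-level description: what to do, not which model or GPU to use."""
    description: str
    steps: tuple[str, ...]
    constraints: Constraints = field(default_factory=Constraints)


video_qa = WorkflowIntent(
    description="Answer user questions about an uploaded video",
    steps=("extract key frames", "generate transcript", "answer user query"),
    constraints=Constraints(objective="max_accuracy", max_latency_s=10.0),
)

print(video_qa.steps)
```

Note what is missing from the specification: no model names, no GPU types, no batch sizes. Those are the choices an optimization layer would be free to fill in and revise later.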

The thing is, there are many ways to do this, and each combination of models and tools has direct implications for how fast the application can complete the task. Murakkab takes these simple specifications from the developer and automatically identifies the best existing models and tools to assemble the workflow. It also determines which components need to run sequentially and which can run in parallel to boost performance.
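The sequential-versus-parallel decision can be illustrated with a dependency graph and a topological sort. This sketch uses Python’s standard graphlib module; the dependency graph itself is an assumed reading of the video Q&A example (frame extraction and transcription treated as independent), not Murakkab’s internal plan.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on. This dependency graph is
# an assumed reading of the video Q&A example, not Murakkab's internal plan.
dependencies = {
    "extract_key_frames": set(),
    "generate_transcript": set(),
    "answer_query": {"extract_key_frames", "generate_transcript"},
}


def parallel_stages(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into stages; steps within a stage do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)
    return stages


for index, stage in enumerate(parallel_stages(dependencies), start=1):
    print(f"stage {index}: {', '.join(stage)}")
```

Steps that land in the same stage can run side by side, while the answering step has to wait for both of its inputs.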

One of the most sophisticated parts of the system is its real-time adaptability. Because Murakkab makes configuration decisions dynamically over time, if a new model or a new GPU accelerator launches tomorrow, the developer does not need to worry about any of it. When the cloud provider deploys the application for a customer, Murakkab configures the workflow components to meet the user’s constraints, such as prioritizing accuracy while respecting a latency requirement. It adaptively identifies the ideal hardware allocations and deployment schedules to maximize efficiency in real time, then generates a workflow ready for the provider to execute.
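“Prioritizing accuracy while respecting a latency requirement” is a constrained selection problem. The sketch below solves a toy version by brute force over four candidates. The candidate names and all metrics are made up for illustration, and the article does not describe the optimizer Murakkab actually uses, so treat this as a picture of the problem, not of the solution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One possible workflow configuration with made-up profiled metrics."""
    name: str
    accuracy: float   # fraction of correct answers
    latency_s: float  # end-to-end seconds per request
    cost_usd: float   # cost per request


# Illustrative numbers only; none of these come from the study.
candidates = [
    Candidate("large-model/8-gpu", accuracy=0.91, latency_s=4.0, cost_usd=0.040),
    Candidate("large-model/2-gpu", accuracy=0.91, latency_s=12.0, cost_usd=0.015),
    Candidate("medium-model/2-gpu", accuracy=0.88, latency_s=5.0, cost_usd=0.008),
    Candidate("small-model/1-gpu", accuracy=0.80, latency_s=2.0, cost_usd=0.002),
]


def most_accurate_within(options: list[Candidate], max_latency_s: float) -> Optional[Candidate]:
    """Pick the most accurate option that meets the latency bound; break ties by cost."""
    feasible = [c for c in options if c.latency_s <= max_latency_s]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.accuracy, -c.cost_usd))


choice = most_accurate_within(candidates, max_latency_s=6.0)
print(choice.name if choice else "no feasible configuration")
```

Relaxing the latency bound changes the answer: with a 15-second budget, the cheaper two-GPU deployment of the same model becomes feasible and wins the tie on cost. Re-running this kind of selection whenever constraints or available options change is what makes the configuration adaptive.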

Another important benefit is that the system gives the cloud provider visibility across multiple workloads at the same time. This allows the provider to share computational resources as efficiently as possible while satisfying each user’s constraints. The intelligence of the system lies in its ability to perform this dynamic resource balancing transparently, maintaining the quality criteria defined by the user while minimizing everything unnecessary in the process.
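One way to picture the benefit of seeing several workloads at once is bin packing: if the provider knows how much of a GPU each workload needs, it can co-locate them instead of giving each its own device. The sketch below uses the classic first-fit-decreasing heuristic as a stand-in. The workload names and GPU shares are hypothetical, and nothing here is claimed to be the scheduling method used by Murakkab.

```python
def pack_first_fit_decreasing(demands: dict[str, int], gpu_capacity: int = 100) -> list[dict[str, int]]:
    """Place workloads on as few GPUs as this greedy heuristic finds.

    demands maps a workload name to the share of one GPU it needs, in percent.
    """
    gpus: list[dict[str, int]] = []
    # Place the largest demands first, each on the first GPU with enough room.
    for name, share in sorted(demands.items(), key=lambda item: -item[1]):
        if share > gpu_capacity:
            raise ValueError(f"{name} needs more than one GPU")
        for gpu in gpus:
            if sum(gpu.values()) + share <= gpu_capacity:
                gpu[name] = share
                break
        else:
            gpus.append({name: share})  # no existing GPU had room
    return gpus


# Hypothetical per-workload GPU shares (percent of one GPU).
demands = {"video_qa": 60, "code_gen": 50, "summarizer": 40, "search": 30, "ocr": 20}
placement = pack_first_fit_decreasing(demands)
print(len(placement), "GPUs instead of", len(demands))
```

Without cross-workload visibility, each of the five workloads would typically get its own GPU. With it, this toy example fits on two.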

The numbers that prove the efficiency

The benchmarks released by the researchers were quite convincing. When tested across various agentic workflows, such as video question-and-answer and code generation, Murakkab met user requirements using only about 35% of the computation demanded by other methods. It consumed only about 27% of the energy and cost less than 25% compared to traditional approaches. These are massive reductions, and the best part is they came without hurting the quality of the generated responses.
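Percentages of a baseline are easy to misread, so here is a small sketch that converts the figures quoted above into “times less” factors. The three fractions are taken from the paragraph above; the cost entry uses 25% as an upper bound because the article says “less than 25%”. The function and variable names are only for illustration.

```python
# Fractions of baseline usage reported in the article ("about 35%", "about 27%",
# "less than 25%"). The cost entry is therefore an upper bound on the fraction.
reported_fraction_of_baseline = {
    "computation": 0.35,
    "energy": 0.27,
    "cost": 0.25,
}


def reduction_factor(fraction_of_baseline: float) -> float:
    """Convert 'uses X of the baseline' into an 'N times less' factor."""
    if not 0 < fraction_of_baseline <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return 1 / fraction_of_baseline


for metric, fraction in reported_fraction_of_baseline.items():
    saved = 1 - fraction
    print(f"{metric}: {reduction_factor(fraction):.1f}x less, {saved:.0%} saved")
```

In other words, the reported results correspond to roughly 2.9 times less computation, 3.7 times less energy, and at least 4 times lower cost than the approaches Murakkab was compared against.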

The dynamic nature of the system also allows users to balance trade-offs with considerable flexibility. In one of the tested scenarios, the system reduced a workflow’s energy consumption by more than an order of magnitude, with only about a 2% drop in accuracy for the customer. Murakkab even managed to identify an unexpectedly optimal configuration for a model that selects video frames, optimizing the performance of a question-and-answer task. This kind of optimization would be virtually impossible to do manually, according to Chaudhry. The combination of MIT and Microsoft behind the project is also a signal that this research has both academic rigor and industrial applicability. The study, by the way, will be presented at the USENIX Symposium on Operating Systems Design and Implementation.
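The trade-off described above, a large energy saving for a small accuracy loss, is the kind of choice a Pareto front makes visible. The sketch below filters a set of configurations down to those that are not beaten on both energy and accuracy. All measurements are made up for illustration and are not results from the study.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    energy_wh: float  # energy per request, lower is better
    accuracy: float   # higher is better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep configurations that no other configuration beats on both metrics."""
    def dominates(a: Config, b: Config) -> bool:
        no_worse = a.energy_wh <= b.energy_wh and a.accuracy >= b.accuracy
        better = a.energy_wh < b.energy_wh or a.accuracy > b.accuracy
        return no_worse and better

    front = [c for c in configs if not any(dominates(o, c) for o in configs)]
    return sorted(front, key=lambda c: c.energy_wh)


# Made-up measurements for illustration; not results from the study.
measured = [
    Config("A", energy_wh=50.0, accuracy=0.90),
    Config("B", energy_wh=4.0, accuracy=0.88),
    Config("C", energy_wh=6.0, accuracy=0.85),
    Config("D", energy_wh=1.0, accuracy=0.70),
]

for config in pareto_front(measured):
    print(config)
```

In this toy data, moving from A to B cuts energy by more than ten times for two points of accuracy, while C is discarded because B is better on both counts. A user who cares mostly about energy would pick B or D, and one who cannot give up any accuracy would stay with A.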

Why this matters for the future of AI agents

The conversation around sustainability in artificial intelligence is getting increasingly serious, and for good reason. Large language models already consume large amounts of energy to train and run, and when you start scaling agentic workflows in production, that consumption grows in ways many companies are not yet properly accounting for. As Chaudhry points out, it is very easy to over-allocate resources, wasting energy and money, and enabling a cloud provider to make these workflows more efficient in a smart way is a win for everyone involved.

Energy efficiency is no longer a secondary concern — it has become a real criterion for architectural decision-making, especially for companies with sustainability goals or those operating in markets where infrastructure costs determine whether a product is viable. Agentic workflows are quickly becoming the backbone of what cloud providers offer, and caring about how efficient they are is no longer optional.

From a technical standpoint, Murakkab represents an important paradigm shift in how we think about optimization for agentic systems. Until now, the tendency has been to solve performance problems by throwing more resources at them — using bigger models, more memory, more parallelism. What this research proposes is the opposite path: use fewer resources more intelligently, letting an orchestration system make decisions that a human would take far too long to make manually at scale. This has direct implications not only for operational efficiency but also for democratizing the use of AI agents, since more cost-effective systems become accessible to smaller teams with tighter budgets.

And there is another angle worth highlighting: the complexity of managing agentic workflows in production is one of the biggest barriers to adopting these technologies at scale today. The more that optimization work can be automated and abstracted into an intelligent layer like Murakkab, the more developers can focus on what really matters — building the business logic of their agents instead of spending time managing infrastructure.

The team’s next steps involve expanding the system to even more complex workflows and larger compute clusters, exploring opportunities to optimize new agentic applications. As Chaudhry summarizes, there is a lot of potential to make these workflows more resource-efficient so they consume far less energy, but it requires thinking about this at the scale of major cloud platforms. This research, supported in part by the Semiconductor Research Corporation and DARPA, points toward a future where cloud computing for AI becomes not only cheaper and more sustainable but also simpler to use. And that is the kind of advancement with real potential to change how the industry builds these applications in the years ahead. 🚀
