AI Agents in Silicon Valley: between wasted tokens and chaotic systems

AI agents have become the darling of C-level meetings around the world. Excited CEOs, slides packed with promises, and the feeling that any company not adopting this technology is going to fall behind. But while executives celebrate, the people on the front lines — engineers and developers — are telling a very different story.

This week, two separate events in Silicon Valley shed light on what is actually happening behind the scenes with AI agents: unstable systems, operational costs that nobody puts in the pitch deck, and a level of complexity that can turn the solution into an even bigger problem. 😬

From San Jose to Mountain View, engineers from Google, Amazon, Microsoft, and Meta, along with startup founders, were surprisingly honest about the challenges that still need to be solved before this technology actually delivers on what it promises.

And OpenClaw, the tool Jensen Huang called the next ChatGPT, is right at the center of this conversation — both as a protagonist and as a target of harsh criticism from people trying to make everything work in real-world enterprise environments.

The wasted tokens problem

One of the most memorable moments of the week happened during the Generative AI and Agentic AI Summit in San Jose. Kevin McGrath, CEO of AI startup Meibel, got straight to the point when describing what he considers the biggest problem the industry is facing right now with AI agents.

According to McGrath, there is a misguided mindset that absolutely everything needs to be processed by a large language model (LLM). The image he used is vivid: companies are handing all their tokens and all their money to bots that burn through millions of tokens for no reason.

The core issue is not that language models are bad. It is that not every task needs to go through an LLM. Many operations can be handled with conventional logic, simple business rules, or lighter-weight tools. When every task is routed to a generative model, the infrastructure bill skyrockets without a proportional improvement in output quality. McGrath argued that companies need to be much more deliberate in deciding which tasks are truly suited for AI agents and which are better served by traditional approaches.
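To make that idea concrete, here is a minimal sketch of a "rules first, LLM last" router. It is not something McGrath presented. The function names, the order-number pattern, and the canned answers are all hypothetical, and `call_llm` stands in for whatever model client a team actually uses.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern for a request that plain code can handle.
ORDER_ID = re.compile(r"\border\s+#?(\d+)\b", re.IGNORECASE)


def try_rules(message: str) -> Optional[str]:
    """Return an answer from simple business rules, or None if no rule applies."""
    match = ORDER_ID.search(message)
    if match:
        return f"Looking up order {match.group(1)}"
    if message.strip().lower() in {"hours", "opening hours"}:
        return "We are open 9:00 to 18:00, Monday to Friday."
    return None


def route(message: str, call_llm: Callable[[str], str]) -> tuple[str, str]:
    """Try the cheap deterministic path first; spend tokens only when it fails."""
    answer = try_rules(message)
    if answer is not None:
        return "rules", answer
    return "llm", call_llm(message)


if __name__ == "__main__":
    # Stand-in for a real model call (assumption: any str -> str callable works).
    fake_llm = lambda text: f"[model answer for: {text}]"
    print(route("Where is order #123?", fake_llm))
    print(route("Summarize this contract for me", fake_llm))
```

Only the second request reaches the model. In a real system the rule set would be larger, but the design choice is the same: the expensive path is the fallback, not the default.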

That statement resonated strongly with the engineers in the room. The narrative that reaches executive decks is always the same: the AI agent will automate complex tasks, reduce response times, eliminate manual steps, and of course, cut costs. It sounds irresistible. But when you sit down with the engineering teams that are actually building and maintaining these systems, the picture changes quite a bit.

Complexity bordering on chaos

If the wasted tokens issue was one of the dominant themes, the operational complexity of AI agents was the other big protagonist of the week.

Deep Shah, a software engineer at Google, led a session focused on new techniques for managing the operational costs of running large numbers of AI agents simultaneously. Shah was clear when listing the challenges: when you try to scale a multi-agent system, the first obstacle that shows up is inference cost. And it is not small.

Running AI agents costs money. A poorly designed or poorly monitored system for tracking these digital assistants and their actions can end up burning cash instead of saving it. This is not theory — it is the reality that engineering teams at Google and the DeepMind unit are dealing with every single day.

Ravi Bulusu, CEO of startup Synchtron, expanded on this discussion by pointing to the root of the problem: interdependent complexity. He explained that when you look at a real company, there are multiple dimensions at play — how data is organized, which technology platforms are used, how software is built and maintained, and how the workforce is structured.

Running AI agents touches all of these dimensions at the same time. As Bulusu put it quite bluntly: no single dimension solves the problem, and the interdependencies are what make all of this difficult. Actually, chaotic.

That word — chaotic — kept echoing through the hallways of the event. And it did not come from an outside critic. It came from someone building tools for this ecosystem.

OpenClaw: revolution for some, headache for others

OpenClaw entered this week’s conversations with the energy of a product that everyone wants to understand but few can actually explain properly. Since its recent emergence, OpenClaw has positioned itself as a kind of harness — a layer that lets developers use multiple AI models to create and manage fleets of digital assistants. The tech industry quickly embraced the tool and started pushing AI agents as the next big evolution.

NVIDIA CEO Jensen Huang told CNBC journalist Jim Cramer back in March that OpenClaw is definitely the next ChatGPT. A heavyweight statement that naturally generated even more enthusiasm around the technology.

But on Thursday, during an AI event held in Mountain View, California, OpenClaw received a much more sober assessment. The event featured participation from ThinkingAI and MiniMax, both headquartered in Shanghai, China.

ThinkingAI recently went through a rebranding. Previously known as ThinkingData, a company focused on analytics for mobile games, it repositioned itself as an AI agent management platform. As part of that transformation, ThinkingAI partnered with MiniMax, which went public in Hong Kong in January. MiniMax is one of China’s most important AI labs, having released powerful models for free to the open-source community and establishing itself as one of the country’s so-called AI Tigers.

Chris Han, co-founder of ThinkingAI, explained that the shift to AI agent management is part of an effort to expand beyond the gaming sector and reach other industries excited about agents but still lacking the necessary technical expertise.

And it was Han who delivered the most direct assessment of OpenClaw. Despite the tool’s growing popularity in China, he stated that it is too complicated and prone to security vulnerabilities for enterprise use.

In his words: OpenClaw is a good tool for personal use, but it definitely cannot reach the enterprise level. To get to enterprise grade, you need to solve a lot of issues — memory, how to manage your agents, teams, communications. There are too many pieces that need to fit together.

That statement carries weight because it comes from someone building a competing product, but also from someone who works directly with the practical pain points of implementation. The criticism is not aimed at the concept itself, but at the tool’s readiness for environments where failures can have real and serious consequences.

The geopolitical dimension of Chinese models

Another topic that inevitably came up during the Mountain View event was the geopolitical dimension. With Chinese companies like ThinkingAI and MiniMax increasingly present in the global AI ecosystem, the question about potential U.S. government restrictions on Chinese AI models hovered over the discussions.

Chris Han chose not to comment on possible national security concerns involving Chinese AI models that could impact ThinkingAI’s strategy. However, he made a point of highlighting that the company’s service also supports models from companies like OpenAI and Google, which signals a strategy of flexibility and adaptation to different regulatory landscapes.

In a lighthearted moment, Han joked that if the U.S. government decided to ban Chinese open-weight AI models in the country, he would take it as a positive sign. If that happens, maybe it means we are succeeding, he said, drawing laughs from the audience. 😄

Behind the joke, there is a relevant strategic point. The AI race is not just about technology — it is also a battle for influence and market standards. Companies that can offer the flexibility to work with models from different origins are better positioned to navigate whatever regulatory landscape emerges.

Operational costs: the elephant in the room

If there was one theme that cut across every panel and hallway conversation this week, it was operational costs. And it is not hard to see why. 💸

When a company decides to put AI agents into production, it rarely sizes up the total cost of that decision correctly. The cost of the language model itself is just the tip of the iceberg. Underneath it you will find:

  • Context storage and memory costs
  • API calls to external tools
  • Monitoring and observability infrastructure
  • Engineering time to maintain and fine-tune prompts and configurations
  • Ongoing testing to make sure agent behavior has not degraded after updates
  • Rework costs and end-customer impact in case of failures

All of that combined can turn a seemingly cheap technology into one of the biggest cost centers in an operation. Deep Shah from Google was quite emphatic in placing inference cost as the first obstacle that shows up at scale, but it is only the beginning of a long list.
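The iceberg above can be sketched as a simple cost breakdown. Every figure below is an invented monthly amount in USD, used only to show how small the model bill can be relative to the total. The category names mirror the list, not any tool discussed at the events.

```python
from dataclasses import dataclass, fields


@dataclass
class MonthlyAgentCosts:
    """Illustrative monthly cost categories in USD. All values are assumptions."""
    model_inference: float
    context_storage_and_memory: float
    external_api_calls: float
    monitoring_and_observability: float
    prompt_and_config_engineering: float
    regression_testing: float
    failure_rework: float

    def total(self) -> float:
        # Sum every declared category so new fields are included automatically.
        return sum(getattr(self, f.name) for f in fields(self))

    def inference_share(self) -> float:
        # Fraction of the total bill that is the language model itself.
        return self.model_inference / self.total()


costs = MonthlyAgentCosts(
    model_inference=4_000,
    context_storage_and_memory=600,
    external_api_calls=900,
    monitoring_and_observability=1_500,
    prompt_and_config_engineering=6_000,
    regression_testing=2_000,
    failure_rework=1_000,
)
print(f"total: ${costs.total():,.0f}")                    # total: $16,000
print(f"inference share: {costs.inference_share():.0%}")  # inference share: 25%
```

With these made-up numbers, the model accounts for a quarter of the spend. A budget that only tracks inference would miss the other three quarters.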

McGrath’s observation about wasted tokens gains even more weight in this context. Every unnecessary call to an LLM is money thrown away. Multiply that by thousands of daily executions, across hundreds of agents running simultaneously, and the waste can reach alarming levels. The complexity of getting these systems to work reliably in real-world contexts is still massively underestimated by many tool and platform vendors.
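A back-of-the-envelope version of that multiplication, with every input assumed for illustration (no speaker gave these figures, and the per-token price is not any vendor's actual rate):

```python
def monthly_wasted_spend(
    agents: int,
    unnecessary_calls_per_agent_per_day: int,
    tokens_per_call: int,
    usd_per_million_tokens: float,
    days: int = 30,
) -> float:
    """Estimate monthly spend on LLM calls that did not need an LLM."""
    wasted_tokens = (
        agents * unnecessary_calls_per_agent_per_day * tokens_per_call * days
    )
    return wasted_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical fleet: 200 agents, 1,000 avoidable calls each per day,
# 2,000 tokens per call, priced at 5 USD per million tokens.
waste = monthly_wasted_spend(200, 1_000, 2_000, 5.0)
print(f"${waste:,.0f} per month")  # $60,000 per month
```

The point of the sketch is the shape of the formula: four modest-looking factors multiply into a large number, so trimming any one of them pays off across the whole fleet.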

The most relevant point raised throughout the week was the need for an ROI evaluation model specifically designed for AI agents. That model needs to account not only for the direct savings generated by automation, but also for the cost of failure, the cost of human oversight required to validate agent outputs, and the cost of continuously adapting the system as the business context changes. Without that, companies will keep making decisions based on lab benchmarks that do not reflect production reality.
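One way such a model could look, as a rough sketch rather than a formula anyone proposed at the events: treat failure, oversight, and adaptation as costs alongside the running cost, then compare the result with the naive calculation. All amounts are hypothetical monthly figures in USD.

```python
def agent_roi(
    direct_savings: float,
    run_cost: float,
    failure_cost: float = 0.0,
    oversight_cost: float = 0.0,
    adaptation_cost: float = 0.0,
) -> float:
    """Return ROI as (savings - total cost) / total cost."""
    total_cost = run_cost + failure_cost + oversight_cost + adaptation_cost
    return (direct_savings - total_cost) / total_cost


# Naive view: only the running cost is counted.
naive = agent_roi(direct_savings=30_000, run_cost=16_000)

# Fuller view: add failures, human review of outputs, and ongoing adaptation.
full = agent_roi(
    direct_savings=30_000,
    run_cost=16_000,
    failure_cost=3_000,
    oversight_cost=5_000,
    adaptation_cost=1_000,
)
print(f"naive ROI: {naive:.1%}")  # naive ROI: 87.5%
print(f"full ROI:  {full:.1%}")   # full ROI:  20.0%
```

With these invented inputs the same project drops from an 87.5% return to 20% once the hidden costs are counted, which is the gap between a benchmark-driven estimate and production reality.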

Reliability and the unpredictability of agents

A point raised repeatedly at both events was the issue of reliability. Unlike traditional software, where expected behavior is deterministic, AI agents operate with a considerable dose of unpredictability. They can make unexpected decisions, interpret instructions in ways the developer never anticipated, or simply get stuck in a reasoning loop with no exit.

Dealing with these scenarios requires a robust observability layer and fallback mechanisms. And that layer alone already represents a significant technical project and cost. It is not something you bolt on later as a patch — it needs to be in the architecture from day one.
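As a small illustration of what a fallback mechanism can mean in practice, the sketch below wraps an agent in a step budget and a repeated-state check. The `step` and `fallback` callables are hypothetical placeholders, and real agents would need richer loop detection and logging than this.

```python
from typing import Callable

# A step takes the current state and returns (done, new_state_or_answer).
Step = Callable[[str], tuple[bool, str]]


def run_with_guardrails(
    step: Step,
    task: str,
    fallback: Callable[[str], str],
    max_steps: int = 5,
) -> tuple[str, str]:
    """Run the agent under a step budget; hand off to a fallback if it stalls."""
    seen = {task}
    state = task
    for _ in range(max_steps):
        done, state = step(state)
        if done:
            return "agent", state
        if state in seen:
            break  # same intermediate state again: likely a reasoning loop
        seen.add(state)
    # Budget exhausted or loop detected: use the deterministic path instead.
    return "fallback", fallback(task)


if __name__ == "__main__":
    stuck_agent = lambda state: (False, state)       # never makes progress
    escalate = lambda task: f"escalated to a human: {task}"
    print(run_with_guardrails(stuck_agent, "refund request", escalate))
```

The wrapper is deliberately outside the agent. That is the architectural point: the limit and the escape hatch exist whether or not the model behaves.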

The engineers from Google and DeepMind who participated in the San Jose summit made it clear that building and operating AI agents is not a simple task. The C-level excitement needs to be tempered with the technical reality of the people actually making these things work day in and day out. Without that alignment, the risk of projects failing or blowing past budgets is enormous.

What lies ahead

This week’s events left a clear impression: artificial intelligence applied to autonomous agents is not a false promise, but it is also not a frictionless, plug-and-play solution for every operation. The technology is maturing, the tools are evolving, and discussions like the ones that took place in Silicon Valley are exactly the kind of conversation the industry needs to have more often, with less hype and more honest engineering.

The strongest message that stuck was Kevin McGrath’s: companies need to be more deliberate when deciding what truly deserves to be processed by an AI agent and what can be solved in simpler, cheaper ways. It sounds obvious, but in the middle of collective euphoria, the obvious is frequently ignored.

OpenClaw shows up as an interesting piece on this board, but it carries limitations that still need to be addressed for serious enterprise use. ThinkingAI and MiniMax represent an increasingly relevant Chinese front in the global agent ecosystem, and how geopolitics will shape this market remains a wild card.

The practical takeaway for anyone following this market closely is that operational costs and the complexity of AI agents need to be part of the conversation from day one of planning, not after the project is in production and problems start popping up. The companies that manage to balance technological ambition with implementation rigor are the ones that will turn the potential of AI agents into real competitive advantage — without any unpleasant surprises along the way. 🚀

Rafael

Operations

I transform internal processes into delivery machines — ensuring that every Viral Method client receives premium service and real results.
