01/04/2026 · 10-minute read · By Rafael

Data challenges persist as companies accelerate AI adoption

Data challenges remain one of the biggest obstacles for anyone trying to scale artificial intelligence inside organizations. And the problem is not shrinking over time. Quite the opposite: it is becoming more apparent as more companies try to get AI working at scale.

A new joint report from Snowflake and Omdia sheds light on this landscape: 79% of organizations face multiple technical and data-related challenges when trying to move forward with AI. The most striking finding is not even that number. It is the fact that, despite all those obstacles, more than 90% of those same companies are already using data to train AI models.

In other words, the race for AI adoption is happening right now, whether or not the data infrastructure is where it needs to be. This raises a question worth understanding better: is using data the same thing as having data reliable enough for AI to reason well? The short answer is no. And that gap is exactly the problem most companies have not yet solved.

What is actually holding companies back

When the Snowflake and Omdia report points out that nearly 8 out of 10 organizations face multiple technical and data challenges when trying to scale AI, that is no surprise to anyone working in the field. What is surprising is the speed at which companies keep pushing forward anyway. The competitive pressure to adopt AI is so intense that many organizations are building solutions on top of a data foundation that is not yet ready to support that level of complexity. It is like trying to build a 30-story building on ground that has not been properly prepared.

The three problems most frequently cited by companies in the study are well known to data and engineering teams:

  • Breaking down AI data silos — cited by 65% of respondents as challenging
  • Measuring and monitoring AI data quality — flagged by 62% as challenging or very challenging
  • Preparing data to be AI-ready — also cited by 62% of participants

These problems are not new. They have existed for years. But when you add an AI layer on top of them, errors get amplified in ways that can compromise an entire strategy. A model trained on bad data does not deliver bad results in an obvious way — it delivers bad results in a convincing way, and that is far more dangerous.

Another point the report highlights is the speed at which teams need to move. Business leaders are demanding AI results on tighter and tighter timelines, while technical teams know that resolving data silos, improving pipelines, and ensuring traceability takes time. This tension between speed and quality is real, and it sits at the center of most AI projects that stall or underdeliver.

Baris Gultekin, VP of AI at Snowflake, summed up this apparent contradiction well: companies are not waiting until everything is perfectly clean and ready, because they simply cannot afford to. But using data is not the same thing as having usable context. The model does its job, but it is reasoning over an incomplete or inconsistent picture of the business.

Data silos: the silent enemy of AI

Data silos come up again and again as one of the biggest villains of AI adoption in the enterprise. And it makes complete sense. When data is trapped in isolated systems — whether CRMs, ERPs, marketing platforms, or legacy databases — AI simply cannot see the full picture. It reasons based on fragments, and fragments lead to incomplete conclusions. For language models and generative AI systems, this is especially critical, because they depend on broad context to generate useful and accurate responses.

The silo problem is not purely technical — it is also organizational. Different teams build and maintain their own data repositories with distinct logic, varying formats, and no standardization between them. When the time comes to consolidate everything to feed an AI model, the cleanup, transformation, and integration work can be monumental. And that work is often underestimated in the initial project planning, which leads to delays, frustrations, and eventually AI results that fall well below the expectations that were set.
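To make the consolidation work concrete, here is a minimal Python sketch of mapping two silos into one shared schema. Everything in it is an assumption made for illustration: the `Customer` schema, the CRM and ERP field names, the date formats, and the "first source wins" rule are hypothetical and do not come from the report.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Customer:
    """Shared schema that both silos are mapped into (hypothetical)."""
    email: str
    name: str
    signup_date: date


def from_crm(row: dict) -> Customer:
    # Hypothetical CRM export: "Email", "FullName", "Created" as DD/MM/YYYY.
    return Customer(
        email=row["Email"].strip().lower(),
        name=row["FullName"].strip(),
        signup_date=datetime.strptime(row["Created"], "%d/%m/%Y").date(),
    )


def from_erp(row: dict) -> Customer:
    # Hypothetical ERP export: "mail", "first"/"last", "created_at" as ISO 8601.
    return Customer(
        email=row["mail"].strip().lower(),
        name=f'{row["first"].strip()} {row["last"].strip()}',
        signup_date=date.fromisoformat(row["created_at"]),
    )


def consolidate(crm_rows, erp_rows):
    """Merge both sources, keeping one record per normalized email."""
    merged = {}
    for customer in [*map(from_crm, crm_rows), *map(from_erp, erp_rows)]:
        # First source wins here; a real pipeline needs an explicit rule.
        merged.setdefault(customer.email, customer)
    return list(merged.values())


crm = [{"Email": " Ana@Example.com ", "FullName": "Ana Souza", "Created": "03/02/2025"}]
erp = [
    {"mail": "ana@example.com", "first": "Ana", "last": "Souza", "created_at": "2025-02-03"},
    {"mail": "leo@example.com", "first": "Leo", "last": "Lima", "created_at": "2025-03-10"},
]
customers = consolidate(crm, erp)
print(len(customers))  # 2: the two "Ana" rows collapse into one record
```

Even in this toy case, each source needs its own mapping function, and someone has to decide which record wins on conflict. Multiply that by dozens of systems and the effort the planning tends to underestimate becomes easier to see.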

Solving data silos requires more than technology. It requires a cultural shift within organizations, with different teams willing to share data, standardize processes, and accept that data ownership does not belong exclusively to a single department. Platforms like Snowflake were built with this goal in mind: centralizing data access without necessarily moving everything into a single location, by creating a unified access layer that lets AI work with information from multiple sources in a coherent and secure way.

Data quality: the foundation AI needs to work

There is a phrase that gets thrown around a lot among data professionals: garbage in, garbage out. It sums up pretty well what happens when data quality is ignored in an AI project. If the input data is inaccurate, outdated, duplicated, or inconsistent, the model will learn wrong patterns and reproduce them at scale. The result is an AI that seems to work, that responds with confidence, but that is systematically wrong in ways that can have real business impact.

The report reinforces that 40% of respondents identified data quality as a primary concern, and 62% said that measuring and monitoring that quality is challenging or very challenging. These are numbers that show most organizations know they have a problem, but have not yet found an efficient way to solve it at the pace AI adoption demands.

High-quality data requires continuous processes for monitoring, validation, cleaning, and enrichment. It requires well-built pipelines, proper cataloging, and well-documented metadata. All of that costs time and resources, and in many organizations those investments were postponed for years because legacy systems worked well enough for traditional reporting and analytics needs. Now, with AI demanding a much higher level of data reliability, those technical debts are coming due with interest.
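As a rough illustration of what continuous monitoring can start from, the sketch below computes three basic ratios over a batch of records: completeness, duplication, and staleness. The function name `quality_report`, its parameters, the sample rows, and the 90-day freshness window are all assumptions chosen for the example, not metrics defined in the report.

```python
from datetime import date


def quality_report(rows, key, required, updated_field, today, max_age_days):
    """Return simple completeness, duplication, and staleness ratios (0.0 to 1.0)."""
    total = len(rows)
    if total == 0:
        return {"complete": 0.0, "duplicated": 0.0, "stale": 0.0}

    # Completeness: every required field is present and non-empty.
    complete = sum(
        all(row.get(field) not in (None, "") for field in required) for row in rows
    )

    # Duplication: rows whose key value was already seen earlier in the batch.
    seen, duplicates = set(), 0
    for row in rows:
        if row.get(key) in seen:
            duplicates += 1
        seen.add(row.get(key))

    # Staleness: rows not updated within the allowed window.
    stale = sum((today - row[updated_field]).days > max_age_days for row in rows)

    return {
        "complete": complete / total,
        "duplicated": duplicates / total,
        "stale": stale / total,
    }


rows = [
    {"id": 1, "email": "ana@example.com", "updated": date(2026, 3, 20)},
    {"id": 2, "email": "", "updated": date(2026, 3, 25)},
    {"id": 2, "email": "leo@example.com", "updated": date(2025, 1, 5)},
    {"id": 3, "email": "bia@example.com", "updated": date(2026, 3, 30)},
]
report = quality_report(
    rows, key="id", required=["id", "email"], updated_field="updated",
    today=date(2026, 4, 1), max_age_days=90,
)
print(report)  # {'complete': 0.75, 'duplicated': 0.25, 'stale': 0.25}
```

Tracking numbers like these per table and per day is one simple way to turn "data quality" from a vague concern into something a team can watch and act on.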

The good news is that the market is responding with increasingly sophisticated tools for data observability, automated anomaly detection, and real-time governance. Modern data platforms already offer native capabilities to track data lineage, identify quality issues before they reach the model, and ensure teams have visibility into what is being used to train and feed AI. But adopting these tools also requires organizational maturity, and that is where many companies are still in the early stages. 🐢
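Automated anomaly detection can sound abstract, so here is a deliberately simple sketch of the idea: flag a pipeline run whose daily row count sits far from the recent average. The z-score rule, the threshold of 3, and the sample counts are assumptions for illustration. Commercial observability tools use more sophisticated methods, and this is not a description of any specific product.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # Not enough data to estimate spread.
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu  # Any change from a perfectly flat history is suspicious.
    return abs(latest - mu) / sigma > threshold


# Hypothetical row counts loaded by a pipeline over the last seven days.
daily_rows = [10_120, 9_980, 10_050, 10_210, 9_940, 10_075, 10_130]

print(is_anomalous(daily_rows, 10_090))  # False: within normal variation
print(is_anomalous(daily_rows, 4_300))   # True: a sudden drop worth investigating
```

A check like this, run before data reaches training or retrieval, is the kind of gate that catches a broken upstream export before the model quietly learns from half a dataset.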

The numbers show the problem is getting worse

An important detail from the report that cannot be overlooked is that these difficulty percentages have increased compared to the previous year. In other words, known problems are persisting and, in some cases, getting worse. This indicates that the pace of AI adoption is growing faster than companies’ ability to resolve their structural data issues.

When it comes to how ready data actually is for AI, the numbers are revealing. Only 32% of respondents consider their structured data AI-ready. For unstructured data, that number drops to 20%, and only 7% said that at least half of their unstructured data was ready. That figure fell compared to the previous year, which is a concerning trend.

Unstructured data — such as documents, emails, audio, video, and free-form text — makes up the majority of data generated by companies today. If the vast majority of that data is not ready to feed AI models, there is a massive bottleneck between what companies want to do with AI and what they can actually deliver reliably.

Legacy systems and interoperability also weigh in

Beyond quality and silo issues, the report identifies other significant barriers. Interoperability problems were cited by 42% of respondents, which makes sense when you consider that many organizations operate with dozens of different tools and platforms that do not communicate well with each other. The ability to provide real-time data was also flagged as a challenge by 42% of participants.

A particularly impactful finding relates to legacy systems: 55% of respondents said their legacy systems are incompatible with modern AI requirements, compared to 38% in other industries. This shows that certain sectors are significantly further behind in modernizing their data infrastructure, which creates a real competitive disadvantage when it comes to harnessing the potential of artificial intelligence.

Even so, companies are using data to train AI

Despite all this lack of readiness and the abundance of challenges, the adoption numbers are impressive. 92% of respondents said they are using their data to train or augment LLMs (large language models). Of those, 95% are using structured or semi-structured data, and 91% are using unstructured data.

These numbers show that AI adoption is not waiting for data infrastructure to be perfect. Companies are pushing forward because the cost of waiting seems greater than the cost of moving ahead with imperfect data. And to a certain extent, that makes sense — AI projects can reveal data problems that would never be discovered through static analysis, and hands-on learning has enormous value.

But there is a real risk in this approach. As Gultekin explained, early success is happening, but scaling is hard because the context layer is not fully built yet. The companies that will stand out will not be the ones chasing better models, but the ones that can get their data into a place where AI can reason reliably. This is a crucial distinction that separates AI projects generating sustainable value from those stuck in promising proofs of concept that never make it to production at scale.

Being AI-ready goes beyond having data available

One of the most important takeaways from this report is that being AI-ready does not simply mean having data stored somewhere accessible. It means having data that is reliable, traceable, well-documented, governed, and integrated in a way that AI can use it with enough context to generate real value. That is a significantly higher bar than most organizations can meet today, and recognizing that is the first step toward moving forward on more solid ground.

The fact that over 90% of companies are already using data to train AI models, even without solving their structural data issues, shows that the pressure for results is outpacing technical caution. This is not necessarily a mistake, since learning by doing has value and AI projects can reveal gaps that would not surface otherwise. But it is a calculated risk that needs to be managed with awareness. Companies that push ahead without a clear strategy for evolving data quality and governance tend to reach a point where accumulated problems make it difficult to scale or trust the results AI generates.

The smartest path forward seems to be a balance: advancing AI projects while investing in data maturity in parallel and on an ongoing basis. The data does not need to be perfect before you start, but it does need to improve as you go. Organizations that can maintain this balance are the ones that, over the medium term, will reap the real benefits of AI adoption without being blindsided by structural failures that undermine trust in the systems they have built. 🚀

What to expect going forward

The Snowflake and Omdia report serves as an important reminder: data challenges do not disappear just because AI has arrived. They become more visible, more urgent, and more costly to ignore. Companies that understand this sooner will come out ahead not just in the speed of adoption, but in the quality and sustainability of the results they can generate with artificial intelligence.

The current landscape shows an industry that is learning by doing, making predictable mistakes, and gradually understanding that the real competitive advantage in AI is not in the most advanced model or the most expensive tool. It is in the ability to build and maintain a solid, governed, and continuously improved data foundation. The organizations that lock in this context layer, as Gultekin described, will be the ones that turn potential into real and scalable results. And this race, by all indications, is still just getting started. 💡
