Thought Leaders
Data Teams Are Dead, Long Live Data Teams

Yes, the title is clickbaity and provocative, but as a CTO with many years in data, I’ve witnessed a transformation that justifies the drama. The traditional “data team” – the back-office crew crunching reports and dashboards – is effectively dead. In its place, a new kind of data team is emerging: an AI-first, product-driven powerhouse with direct revenue impact. They are no longer a cost center, but a profit-generating group.
The Journey from Business Intelligence to Machine Learning
Not long ago, data teams were synonymous with business intelligence (BI). We were the historians of company data, living in SQL and spreadsheets, tasked with answering “What happened last quarter?” As big data technologies like Hadoop emerged and “data scientist” became the must-have job title, data teams evolved. By the mid-2010s, we were doing more than reporting; we ventured into data visualization and interactive analytics, producing dynamic dashboards for every department. The job centered on data wrangling: merging datasets of disparate sources and shapes while building up domain knowledge.
Then the late 2010s brought the machine learning era. Data teams began hiring data scientists to build predictive models and uncover insights in vast datasets. We shifted from describing the past to predicting the future: churn models, recommendation engines, demand forecasts – you name it. But even then, our outputs were slide decks and insights, not live products. We functioned as an internal service bureau, advising the business through analysis. In other words, we were cost centers – valuable, yes, but one step removed from core product and revenue.
In the best cases, machine learning teams were embedded within product groups so that their models and inferences could be fully integrated into platforms. Elsewhere, they were dispersed into separate units, and that divide between data and product led to numerous failed projects, sunk investments, and lost opportunities.
GenAI: From Support Function to Profit Center
Then GenAI arrived and everything changed. The release of powerful large language models, such as the GPT family and open-source variants like Llama, upended the landscape virtually overnight. Suddenly, data teams were not just analyzing the business, but instead became integral to building AI products and experiences. When you successfully integrate an LLM into a customer-facing application or an internal workflow, you’re no longer just informing the business; you’re driving it. A well-implemented GenAI system can automate customer support, generate marketing content, personalize user experiences, or even provide the data necessary to inform and train emerging agentic AI systems. These capabilities directly affect revenue streams. In effect, the data team’s work product has shifted from PowerPoint slides to live AI-powered applications.
GenAI adoption began in innovation groups, which delivered proofs of concept that generated a “wow factor.” Soon enough, everyone was an AI engineer, and shadow IT spread across organizations.
Data teams soon found themselves facing a new question: “When will you become a profit center?” As AI engineers started creating impressive tools, it became clear that the time was ripe to merge the two teams: those who controlled the data and those who built the applications.
Consider a retail company that deploys a GenAI chatbot for handling sales inquiries, or a bank that launches an AI-driven, personalized investment advisor. These aren’t traditional IT side projects – they are digital products that create customer value and generate revenue. However, at the same time, to create these systems at scale, AI engineering teams need to be able to access and operationalize the data that traditional teams have prepared.
Executives have noticed. The expectations of data teams are sky-high now, with boards and CEOs looking to us to deliver the next AI-fueled growth vector. We’ve gone from being behind-the-scenes analysts to front-line innovators. It’s a thrilling position to be in, but it comes with intense pressure to deliver results at scale.
From Exploration to Product – A One-Way Door
The shift from exploratory analysis to product-centric AI is profound and irreversible. Why irreversible? Because GenAI’s impact on business is proving too great to relegate back to an R&D toy. According to a recent global survey, 96% of IT leaders have now integrated AI into their core processes – up from 88% just a year prior. In other words, nearly every enterprise has gone from experimenting with AI to embedding it in mission-critical workflows. Once you cross that threshold where AI is delivering value in production, there’s no going back.
This new AI-driven focus changes the tempo and mindset of data teams. In the past, we had the luxury of long discovery projects and open-ended analysis. Today, if we’re building an AI feature, it needs to be production-ready, compliant, and reliable – like any customer-facing product. We’ve entered what some call the “Autonomous Age” of data science. The question guiding our work is no longer “what insights can we uncover?” but “what intelligent system can we build that acts on insights in real time?”
GenAI systems aren’t just answering questions; they’re beginning to make decisions. It’s a one-way door: after experiencing this kind of autonomy and impact, companies won’t settle for static reports and manual decision-making. Now more than ever, data teams need to be stakeholder and product-oriented.
The Hard Truth: Why Most GenAI Initiatives Fail
Amid all the excitement, there’s a sober reality: most GenAI initiatives fail. It turns out that successfully deploying GenAI is extremely challenging. A recent MIT study found that a staggering 95% of enterprise GenAI pilot projects never deliver a measurable ROI. Only about 5% of AI pilots actually achieve rapid revenue gains or meaningful business impact. This isn’t due to lack of potential – it’s due to the complexity of doing AI right.
Digging into the causes of failure, the MIT research paints a clear picture. Many projects stumble because of “hype over hard work” – teams chase flashy demo use cases instead of investing in the boring fundamentals of integration, validation, and monitoring. Others fail from the classic “garbage in, garbage out” syndrome – poor data quality and siloed data pipelines doom the project before the AI even gets to do its job. Often, it’s not the AI model that’s flawed, it’s the surrounding environment. As the researchers put it, GenAI doesn’t fail in the lab; it fails in the enterprise when it collides with vague goals, poor data, and organizational inertia. In practice, most AI pilots stall at the proof-of-concept stage and never graduate to full production deployment.
This reality check is a valuable lesson. It tells us that even though data teams are now in the spotlight, the majority are struggling to meet the heightened expectations. For GenAI to succeed at scale, we must cross a significantly higher bar than we did in the old BI days.
Beyond Clever Prompts: Data, Governance & Infrastructure Matter
What separates the 5% of AI projects that thrive from the 95% that falter? In my experience (and as research confirms), the winners focus on foundational capabilities – data, governance, and infrastructure. GenAI is not magic; it’s built on data. Without high-quality, well-governed data pipelines feeding your models, even the best AI will produce erratic results. Summit Partners put it well in a recent analysis: “the success of any system or process using AI hinges on the quality, structure and accessibility of the data that fuels it.”
In practical terms, this means organizations must double down on data architecture and governance as they adopt GenAI. Do you have unified, accessible data stores that your AI can draw on (and I mean ALL data stores, including data centers, hyperscalers, and third-party SaaS systems, among others)? Is that data cleaned, curated, and compliant with regulations? Is there clear data lineage and auditability (so you can trust AI outputs and know how they came to be)? These questions are now at the forefront.
GenAI is forcing companies to finally get their data house in order.
Governance has also taken on new significance. When an AI model can potentially generate a wrong answer (or an offensive one), robust governance isn’t optional – it’s mandatory. Controls such as versioning, bias checks, human-in-the-loop review, and strict security measures around sensitive data inputs are essential. Without proper governance, training and clearly defined goals, even a strong AI tool will struggle to gain traction in business.
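As a concrete illustration of these controls, here is a minimal governance-gate sketch in Python. It is a hypothetical example, not a reference to any specific product: the `governed_prompt` function, the redaction patterns, and the in-memory audit log are all illustrative stand-ins for the real redaction, logging, and review machinery an enterprise would use.

```python
import re

# Hypothetical in-memory audit trail; a real system would persist
# this to a secure, append-only log for compliance review.
AUDIT_LOG = []

# Illustrative patterns for obviously sensitive fields.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def governed_prompt(raw: str, user: str) -> str:
    """Redact sensitive data from a prompt before it reaches the
    model, and record who sent what for later auditability."""
    redacted = EMAIL.sub("[EMAIL]", raw)
    redacted = SSN.sub("[SSN]", redacted)
    AUDIT_LOG.append({"user": user, "prompt": redacted})
    return redacted

out = governed_prompt(
    "Contact jane@example.com about SSN 123-45-6789", user="analyst_1"
)
print(out)  # Contact [EMAIL] about SSN [SSN]
```

Real deployments layer many more controls on top (bias checks, model versioning, human review queues), but the shape is the same: every model call passes through a gate that enforces policy and leaves a trace.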
And let’s not forget infrastructure. Deploying GenAI at scale requires significant computing power and rigorous engineering. Models need to be served in real time, possibly across millions of queries, with low latency. They often need GPUs or specialized hardware, as well as ongoing monitoring, retraining, and lifecycle management. In short, you need industrial-grade AI infrastructure that is secure, scalable, and resilient. This is where the concept of Private AI comes in as the framework that unites infrastructure with data and governance. Private AI refers to the development of AI within a controlled and secure environment, ensuring data security and compliance.
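To make the monitoring requirement concrete, here is a small Python sketch of the kind of serving wrapper such infrastructure implies. Everything here is assumed for illustration: `MonitoredModelServer`, its latency budget, and the `echo_model` stand-in for a real model endpoint are hypothetical names, not part of any actual serving framework.

```python
import time
from collections import deque

class MonitoredModelServer:
    """Wrap a model call with basic production concerns: latency
    tracking, error counting, and a rolling window of recent
    response times that an alerting system could watch."""

    def __init__(self, model_fn, window=100, latency_budget_s=1.0):
        self.model_fn = model_fn              # any callable: prompt -> text
        self.latencies = deque(maxlen=window)  # rolling latency window
        self.errors = 0
        self.latency_budget_s = latency_budget_s

    def serve(self, prompt: str) -> str:
        start = time.perf_counter()
        try:
            return self.model_fn(prompt)
        except Exception:
            self.errors += 1
            raise
        finally:
            self.latencies.append(time.perf_counter() - start)

    def p95_latency(self) -> float:
        xs = sorted(self.latencies)
        return xs[int(0.95 * (len(xs) - 1))] if xs else 0.0

    def within_budget(self) -> bool:
        return self.p95_latency() <= self.latency_budget_s

# Hypothetical stand-in for a real LLM endpoint.
def echo_model(prompt: str) -> str:
    return f"answer to: {prompt}"

server = MonitoredModelServer(echo_model)
for i in range(10):
    server.serve(f"question {i}")
print(server.within_budget(), server.errors)
```

A production system would replace the stub with a GPU-backed endpoint and export these metrics to a monitoring stack, but the principle holds: serving is instrumented from day one, not bolted on after launch.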
The bottom line is that GenAI’s success depends on the harmony of three pillars: data, governance, and infrastructure. Without one, you risk joining the 95% of projects that never scale beyond the demo stage.
Why AI Engineers Can’t Do It Alone
Given these requirements, it’s clear that simply hiring a few talented AI engineers is not a silver bullet. We’ve learned this lesson over the past several years in the data industry. In the early days of the data science boom, companies tried to find “unicorn” data scientists who could do it all – build models, write code, handle data and deployment. That myth has since been dispelled. As one veteran data scientist quipped, “a model sitting in a notebook doesn’t actually do anything for the business.” You need to embed that model into an application or process for it to create value. And doing that requires a team effort that spans multiple skill sets.
In the late 2010s, we saw data teams diversify into distinct roles: data engineers began building robust pipelines, machine learning engineers focused on productionizing models, analytics engineers managed the analytics layer, and so on.
Today, GenAI raises the bar even higher. Yes, you need AI specialists (prompt engineers, LLM fine-tuners, etc.) but those specialists will hit a wall if they don’t have mature data pipelines, governance frameworks, and secure platforms to work with. An AI engineer can prototype a great language model in a sandbox but turning that into a product used by thousands or millions requires collaboration with security teams, compliance officers, data architects, site reliability engineers, and more.
AI is a team sport. It’s tempting to think you can drop a state-of-the-art model into your business and suddenly have an AI-driven enterprise. The companies succeeding with AI are those that have built cross-functional teams, or “AI factories,” which bring all these pieces together. Their data teams have effectively evolved into full-stack AI product teams, blending data, modeling, engineering, and ops expertise. They are building and deploying their tools in a data-driven, product-led way, with value generation embedded in every KPI.
The Next Generation of Data Teams
So, what does the future hold for the new “data team”? Here’s a glimpse of what’s coming for these teams in the next few years:
- Less manual ETL/ELT: Tedious data wrangling will diminish. With more automated data pipelines and AI-assisted integration, teams won’t spend half their time cleaning and moving data. The grunt work of data prep will be increasingly handled by intelligent systems, allowing humans to focus on higher-level design and quality control.
- Fewer dashboards: The era of endlessly tweaking dashboard filters is waning. AI will enable more natural language querying and dynamic insights delivery. Instead of pre-built dashboards for every question, users will get conversational answers from AI (with source data attached). Data teams will spend less time developing static reports and more time training AI to generate insights on the fly.
- More AI-native product development: Data teams will be at the heart of product innovation. Whether it’s developing a new customer-facing AI feature or an internal AI tool that optimizes operations, these teams will act as product teams. They’ll employ software development practices, rapid prototyping, A/B testing, and user experience design – not just data analysis. Every data team will, in effect, become an AI product team delivering direct business value.
- Autonomous agents on the rise: In the not-so-distant future, data teams will deploy autonomous AI agents to handle routine decisions and tasks. Instead of just predicting outcomes, these agents will be authorized to take certain actions (with oversight). Imagine an AI ops agent that can detect an anomaly and automatically open a remediation ticket, or a sales AI agent that tunes e-commerce pricing in real time. Data teams will be responsible for building and managing these agents, pushing the boundaries of what automation can achieve.
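The ops-agent scenario above can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: `detect_anomaly` uses a basic z-score check, and `TicketSystem` and `ops_agent` are hypothetical names standing in for a real ticketing API and agent framework.

```python
from dataclasses import dataclass, field

@dataclass
class TicketSystem:
    """Stand-in for a real remediation/ticketing API."""
    tickets: list = field(default_factory=list)

    def open(self, summary: str) -> int:
        self.tickets.append(summary)
        return len(self.tickets)  # ticket id

def detect_anomaly(series, threshold=3.0):
    """Flag the latest point if it sits more than `threshold`
    standard deviations from the mean of the history."""
    history, latest = series[:-1], series[-1]
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    std = var ** 0.5 or 1.0  # guard against zero variance
    return abs(latest - mean) / std > threshold

def ops_agent(series, tickets: TicketSystem, require_approval=False):
    """One autonomous step: detect, then act (open a ticket),
    unless a human-in-the-loop gate is configured."""
    if not detect_anomaly(series):
        return None
    if require_approval:
        return "pending human approval"
    return tickets.open(f"Anomaly detected: latest value {series[-1]}")

tickets = TicketSystem()
normal = [10, 11, 9, 10, 10, 11, 10]
print(ops_agent(normal + [10], tickets))  # no anomaly -> None
print(ops_agent(normal + [80], tickets))  # opens ticket 1
```

Note the `require_approval` flag: the oversight mentioned above is a design parameter, not an afterthought, and teams can dial autonomy up or down per action type.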
In light of these changes, one might indeed say “data teams as we knew them are dead.” The spreadsheet jockeys and dashboard plumbers have given way to something new: AI-first teams that are fluent in data, code, and business strategy. But far from being a eulogy, this is a celebration. The new generation of data teams is just beginning, and they are more valuable than ever.
So, remember, the data engineer is dead, long live the data engineer! The data teams as we knew them are gone but long live the new data teams – may they reign in this AI-driven world with insight, responsibility, and audacity.