Kumo Launches KumoRFM-2, A Foundation Model Built to Replace Traditional Enterprise Machine Learning

Published April 14, 2026

Antoine Tardif, CEO & Founder of Unite.AI

Kumo has unveiled KumoRFM-2, a next-generation foundation model designed specifically for structured enterprise data—marking a fundamental shift in how organizations generate predictions from their data warehouses. Unlike traditional machine learning pipelines that require months of feature engineering and custom model development, KumoRFM-2 enables teams to generate predictions instantly using natural language, without training or specialized expertise.

At its core, the model represents a new category of AI: a relational foundation model that operates directly on enterprise data structures rather than flattening them into simplified tables. This distinction addresses one of the most persistent limitations in enterprise AI, where valuable relationships between datasets are often lost before modeling even begins.

From Static Pipelines to Real-Time Predictive Systems

Enterprise predictive analytics has historically been slow and resource-intensive. Each new use case—whether churn prediction, fraud detection, or demand forecasting—typically requires a separate pipeline, involving data cleaning, feature engineering, model training, and tuning.

KumoRFM-2 replaces that entire workflow with a single, pre-trained system.

Instead of building models, users define what they want to predict. The model interprets the request, constructs the necessary context from the underlying database, and produces predictions in a single pass. This is made possible through a combination of in-context learning and a declarative interface called Predictive Query Language (PQL), where users express the outcome they care about rather than the steps required to compute it.

The result is a shift from “building models” to “asking questions”—a change that significantly lowers the barrier to using predictive AI across an organization.

Why Relational Data Has Been So Difficult

Most existing AI systems struggle with structured enterprise data for a simple reason: they discard the relationships that give it structure.

Traditional models, including many tabular AI systems and even large language models, rely on flattening data into a single table. But real-world enterprise data exists as interconnected systems—customers linked to transactions, transactions linked to products, products linked to inventory, all evolving over time.

Flattening this structure removes the relationships that often contain the most valuable predictive signals. It also forces teams to manually recreate those signals through feature engineering, a process that is both time-consuming and prone to error.
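A toy illustration of the flattening problem may help. The sketch below uses made-up customers and transactions (every name and value is hypothetical, not taken from Kumo) and shows two customers with very different purchase timing collapsing into identical rows once their transactions are aggregated into a single table.

```python
from collections import defaultdict

# Hypothetical relational data: each transaction points at a customer.
transactions = [
    {"customer_id": "a", "day": 1, "amount": 50.0},
    {"customer_id": "a", "day": 2, "amount": 50.0},
    {"customer_id": "b", "day": 1, "amount": 50.0},
    {"customer_id": "b", "day": 90, "amount": 50.0},
]

def flatten(rows):
    """Collapse transactions into one row per customer (order count and total)."""
    table = defaultdict(lambda: {"n_orders": 0, "total": 0.0})
    for row in rows:
        agg = table[row["customer_id"]]
        agg["n_orders"] += 1
        agg["total"] += row["amount"]
    return dict(table)

flat = flatten(transactions)

# After flattening, "a" and "b" are indistinguishable, even though "b"
# purchased on day 90 and "a" has been inactive since day 2.
print(flat["a"] == flat["b"])  # prints True
```

Recovering the lost timing signal would require adding a hand-written feature such as "days since last order", which is the manual feature engineering step the article refers to.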

KumoRFM-2 avoids this entirely by operating directly on relational databases, preserving connections across tables, timestamps, and entities.

Inside the Architecture: How KumoRFM-2 Works

The key innovation behind KumoRFM-2 is its hierarchical Relational Graph Transformer architecture, which processes data at three successive levels.

At the first level, the model analyzes individual tables using a combination of row and column attention. This allows it to understand how features relate within a table while filtering out irrelevant or noisy data early in the process. Importantly, the prediction target is introduced at this stage, meaning the model is conditioned on the task from the very beginning.

At the second level, the model performs graph-based reasoning across tables. Using foreign key relationships, it connects data from different parts of the database—such as linking a customer profile to purchase history or behavioral patterns—and identifies cross-table signals that would otherwise be lost.
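The article does not describe how KumoRFM-2 builds its graph internally, so the following is only a generic sketch of the idea: rows become nodes, foreign keys become edges, and the context for one entity is the set of rows reachable within a few hops. The tables, row IDs, and function names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical foreign keys: (child_table, row_id) references (parent_table, row_id).
foreign_keys = [
    (("orders", 1), ("customers", 10)),
    (("orders", 2), ("customers", 10)),
    (("orders", 1), ("products", 7)),
    (("orders", 2), ("products", 8)),
    (("orders", 3), ("customers", 11)),
]

def build_graph(edges):
    """Treat every foreign-key link as an undirected edge between two rows."""
    graph = defaultdict(set)
    for child, parent in edges:
        graph[child].add(parent)
        graph[parent].add(child)
    return graph

def neighborhood(graph, start, hops):
    """Breadth-first search: all rows within `hops` foreign-key links of `start`."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen

graph = build_graph(foreign_keys)
subgraph = neighborhood(graph, ("customers", 10), hops=2)
print(sorted(subgraph))
```

With two hops, customer 10 reaches its two orders and the products on those orders, while order 3 (which belongs to a different customer) stays outside the subgraph.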

At the third level, the model incorporates cross-sample attention, allowing it to learn from multiple examples at once. This enables it to generalize from a relatively small number of context examples, rather than requiring full training datasets.
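As a rough illustration of cross-sample attention in general (not KumoRFM-2's actual layers, which the article does not specify), the sketch below lets a query example attend over a handful of labeled context examples using scaled dot-product attention and predicts the attention-weighted average of their labels. The embeddings and labels are invented for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, context_keys, context_labels):
    """Predict a label as the attention-weighted average of context labels."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale
        for key in context_keys
    ]
    weights = softmax(scores)
    return sum(w * y for w, y in zip(weights, context_labels))

# Hypothetical embeddings of three context examples and their known outcomes.
context_keys = [[4.0, 0.0], [0.0, 4.0], [4.0, 0.5]]
context_labels = [1.0, 0.0, 1.0]

# A query that resembles the first and third examples inherits their label.
prediction = attend([4.0, 0.0], context_keys, context_labels)
print(round(prediction, 3))
```

No weights are updated here: the prediction changes only because the context examples change, which is the essence of in-context learning.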

This staged design is critical. It avoids the computational explosion that would come from processing every data point simultaneously, while also improving accuracy by filtering noise before deeper reasoning occurs.

In-Context Learning Replaces Training

A defining feature of KumoRFM-2 is its reliance on in-context learning instead of traditional training.

Rather than training a model for each task, KumoRFM-2 is pre-trained once on a large mix of synthetic and real-world relational data. When a user submits a prediction request, the system automatically generates a set of context examples—small subgraphs of the database paired with known outcomes.

These examples act as guidance for the model, allowing it to infer patterns and produce predictions without updating its weights. In practice, this means:

No task-specific training
No feature engineering
No model tuning

Even with as little as 0.2% of the data typically required for supervised learning, the model can achieve state-of-the-art performance.
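One plausible way to produce the context examples described above is to pick a point in the past, use only what was known before it as input, and read the outcome from what happened afterward. The sketch below assumes that approach; the order log, the 30-day window, and the function name are hypothetical and are not drawn from Kumo's documentation.

```python
# Hypothetical order log: (customer_id, day_of_order).
orders = [("a", 3), ("a", 20), ("a", 41), ("b", 5), ("c", 38), ("c", 44)]

def context_examples(orders, anchor_day, window):
    """Build (history, label) pairs as of `anchor_day`.

    history: order days strictly before the anchor (what was known then).
    label:   True if the customer ordered in [anchor_day, anchor_day + window).
    """
    customers = sorted({customer for customer, _ in orders})
    examples = []
    for customer in customers:
        days = [day for c, day in orders if c == customer]
        history = [day for day in days if day < anchor_day]
        if not history:
            continue  # customer had no activity yet at the anchor
        label = any(anchor_day <= day < anchor_day + window for day in days)
        examples.append({"customer": customer, "history": history, "label": label})
    return examples

examples = context_examples(orders, anchor_day=30, window=30)
for example in examples:
    print(example)
```

Keeping the history strictly before the anchor is what prevents future information from leaking into the examples.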

Performance Across Real-World Benchmarks

KumoRFM-2 has been evaluated across 41 predictive tasks spanning industries such as e-commerce, healthcare, social platforms, and enterprise systems.

The model consistently outperforms traditional supervised machine learning approaches, including engineered ensembles and relational deep learning systems. On enterprise benchmarks, it surpasses widely used solutions by significant margins, and its results improve further with fine-tuning.

Beyond raw accuracy, the model demonstrates strong robustness:

Maintains performance even when large portions of relational links are missing
Handles noisy or incomplete data with minimal degradation
Performs well in cold-start scenarios where historical data is limited

This resilience is particularly important in enterprise environments, where data quality is often inconsistent.

Built for Scale: Up to 500 Billion Rows

KumoRFM-2 is designed to operate at the scale of modern data infrastructure.

The system can process datasets exceeding 500 billion rows by combining database-native execution with a custom graph engine capable of high-throughput data access. Instead of moving data into a separate ML system, computation is pushed directly to where the data resides—whether in SQL databases or cloud data warehouses.

This approach reduces latency, simplifies deployment, and allows organizations to integrate predictive capabilities directly into existing workflows.
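The general idea of pushing computation to the data can be shown with Python's built-in sqlite3 module. This is a small stand-in for a real warehouse, with a hypothetical orders table, and says nothing about how Kumo's graph engine is implemented. Both paths below give the same totals, but the second returns one row per customer instead of every raw row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 10.0), ("a", 15.0), ("b", 7.5)],
)

# Pull-based: copy every row out of the database, then aggregate in Python.
rows = conn.execute("SELECT customer_id, amount FROM orders").fetchall()
pulled = {}
for customer_id, amount in rows:
    pulled[customer_id] = pulled.get(customer_id, 0.0) + amount

# Push-down: the database aggregates and returns one row per customer.
pushed = dict(
    conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    ).fetchall()
)
conn.close()

print(pulled == pushed)  # prints True
```

At three rows the difference is invisible; at warehouse scale, the amount of data crossing the network is what separates the two approaches.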

Natural Language as the Interface

Another defining feature is the model’s natural language interface.

Users can ask questions like:

Which customers are likely to churn in the next 30 days?
Which leads are most likely to convert?
Which products will see increased demand?

The system translates these queries into structured predictive logic, executes them on the underlying data, and returns both predictions and explanations.

This not only makes predictive analytics more accessible, but also enables integration with AI agents, where predictions can be embedded into automated decision-making workflows.

Toward Agent-Driven Enterprise Intelligence

KumoRFM-2 is designed with agents in mind.

Its predictive capabilities can be exposed as modular “skills” that AI agents can call as part of larger workflows. This turns predictive modeling into a composable building block—something that can be combined with retrieval, reasoning, and execution in autonomous systems.

In this context, the model is not just a tool for analysts, but a foundational layer for next-generation enterprise automation.
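To make the "skills" idea concrete, here is a minimal sketch of a registry that an agent loop could dispatch against. The registry, the skill name, its parameters, and the fixed placeholder score are all hypothetical; the article does not describe Kumo's actual agent interface.

```python
from typing import Callable, Dict

# Hypothetical skill registry; names and signatures are illustrative only.
SKILLS: Dict[str, Callable[..., dict]] = {}

def skill(name: str):
    """Decorator that registers a function as a callable skill."""
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("predict_churn")
def predict_churn(customer_id: str, horizon_days: int = 30) -> dict:
    # Stand-in for a call to a predictive model; the score is a fixed
    # placeholder, not a real prediction.
    return {
        "customer_id": customer_id,
        "horizon_days": horizon_days,
        "churn_probability": 0.5,
    }

def run_agent_step(skill_name: str, **kwargs) -> dict:
    """Dispatch one tool call the way an agent loop might."""
    if skill_name not in SKILLS:
        raise KeyError(f"unknown skill: {skill_name}")
    return SKILLS[skill_name](**kwargs)

result = run_agent_step("predict_churn", customer_id="a")
print(result)
```

Because each skill is just a named function with structured inputs and outputs, an agent can chain it with retrieval or execution steps without knowing how the prediction is produced.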

Redefining the Role of Data Science

KumoRFM-2 signals a broader shift in how organizations approach data science.

Instead of building and maintaining dozens of task-specific models, teams can rely on a single, general-purpose system that adapts to new problems instantly. This reduces the need for specialized expertise in feature engineering and model tuning, while enabling faster experimentation and iteration.

For many organizations, this could mean moving from a centralized data science function to a more distributed model, where predictive insights are accessible across multiple departments.

A New Category of Foundation Models

While foundation models have already transformed domains like language and vision, structured enterprise data has remained one of the last frontiers.

KumoRFM-2 represents an early example of what specialized foundation models for structured data can achieve. By combining relational reasoning, in-context learning, and natural language interaction, it introduces a new paradigm for predictive AI.

If widely adopted, this approach could redefine how businesses interact with their data—turning predictive analytics from a complex, delayed process into a real-time, organization-wide capability.

