A Deep Dive into Tabular In-Context Learning
By Orr Shahar / Machine Learning Engineer
Testing the Future of Tables
In the rapidly evolving landscape of machine learning, deep learning and Transformers have become the gold standard for language and vision. Yet for years, the world of tabular data, the backbone of countless domains such as finance and manufacturing, remained an outsider to this revolution. Tree-based ensembles, from gradient-boosting algorithms like XGBoost to Random Forests, have dominated this field thanks to their efficiency and high performance.
Recently, a new approach has emerged that aims to challenge this dominance: In-Context Learning (ICL) for tabular data. At Merantix Momentum, we looked under the hood to see whether these models can truly unseat the incumbents and provide a superior AI solution for modern data challenges.
We took a deep dive into three prominent in-context models, comparing their characteristics and architectures while testing them against industry-standard benchmarks. By mapping out the best use cases for each and assessing their computational needs, our goal was to determine how businesses can best reap the benefits of these new AI developments.
Background and In-Context Tabular Models
While Transformers excel at sequential data, the columns of a table are heterogeneous and carry no inherent order, which has historically reduced the effectiveness of standard attention mechanisms.
Recent AI research has addressed this through tabular ICL models, which use a sample of the labeled training set as a "context window". Each row we wish to predict then "attends" to these context rows and predicts the target based on relevant similarities and structure.
These models are pre-trained on a vast number of datasets with the aim of learning the broad range of patterns and structures found in tabular data. As a result, they can be applied to a new dataset in a single inference step, without any additional training.
The authors claim that this approach delivers both high speed and high performance.
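To build intuition for the mechanism, here is a deliberately simplified sketch in Python. It is not how any of the actual models work internally; they learn their attention weights with a pre-trained Transformer, whereas this toy uses a hand-crafted, distance-based similarity:

```python
import numpy as np

def icl_predict(X_context, y_context, x_query, temperature=1.0):
    """Toy illustration of tabular ICL: the query row 'attends' to labeled
    context rows and predicts an attention-weighted vote over their labels."""
    # Similarity between the query and every context row (negative squared distance)
    scores = -np.sum((X_context - x_query) ** 2, axis=1) / temperature
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over the context rows
    return weights @ y_context         # weighted vote over the context labels

rng = np.random.default_rng(0)
X_ctx = rng.normal(size=(100, 5))           # 100 labeled "context" rows
y_ctx = (X_ctx[:, 0] > 0).astype(float)     # toy binary target
print(icl_predict(X_ctx, y_ctx, X_ctx[0]))  # probability-like score for one query
```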
The Models at a Glance:
- TabPFN-2.5: This model, developed by Prior Labs, is considered a pioneer of the ICL approach. It was trained on synthetic data to learn general patterns as a prior for real-world datasets and switches between row and column attention (a minimal usage sketch follows this list).
- ConTextTab: This SAP model is “semantics-aware.” It uses a large language model (LLM) to understand the meaning of column names such as “Credit Score” or “Age.” This makes it particularly powerful when semantic information plays a major role.
- TabICLV2: An academic model from ICML 2025 that was also trained on synthetic data. It uses distribution-aware embeddings and performs early dimensionality reduction, which enables it to handle high-dimensional datasets particularly well.
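In practice, these models expose familiar scikit-learn-style interfaces. Below is a minimal usage sketch assuming the open-source tabpfn package; the exact constructor arguments may vary across versions, and the other two models follow a similar fit/predict pattern:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tabpfn import TabPFNClassifier  # assumes: pip install tabpfn

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = TabPFNClassifier()   # no architecture search, no hyperparameter tuning
clf.fit(X_train, y_train)  # "fitting" stores the context; no gradient updates
pred = clf.predict(X_test) # inference attends to the stored training rows
print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
```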
The Experiment: Putting Theory to the Test
To get a clear picture of performance, we used OpenML to fetch diverse datasets across various domains, ranging from credit risk to college application approval. We categorized these into six "buckets" based on their size and dimensionality in order to get clear, granular results.
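For illustration, here is a minimal sketch of the fetching-and-bucketing step using the openml Python package; the bucket thresholds shown are illustrative placeholders, not the exact cutoffs from our study:

```python
import openml  # assumes: pip install openml

def size_bucket(n_rows, n_cols):
    """Assign a dataset to a coarse bucket by size and dimensionality.
    Thresholds here are illustrative, not the exact ones from the study."""
    size = "small" if n_rows < 5_000 else "medium" if n_rows < 100_000 else "large"
    dims = "wide" if n_cols > 100 else "narrow"
    return f"{size}/{dims}"

dataset = openml.datasets.get_dataset(31)  # 31 = the classic German credit-risk dataset
X, y, _, _ = dataset.get_data(target=dataset.default_target_attribute)
print(dataset.name, size_bucket(len(X), X.shape[1]))
```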
The Results: Where ICL Wins, and Where It Doesn’t
Our findings revealed a clear "sweet spot" for In-Context Learning:
- Small datasets: For datasets with fewer than 5,000 rows, ICL models (particularly TabPFN and TabICL) are clearly superior, consistently outperforming traditional boosting methods. They are also significantly faster once training and tuning times for the boosting models are taken into account.
- Transition zone: Between 30,000 and 50,000 rows, the competition becomes tighter. There is no longer a clear winner, though ICL models demonstrate their robustness, particularly with high-dimensional and sparse data.
- Scaling limit: Starting at around 100,000 rows, traditional methods such as boosting regain a clear advantage in both performance and speed. Especially in production workflows with continuous use, boosting models are significantly more efficient during inference.
The Bottom Line
Implementing ICL models is straightforward: they ship with ready-made libraries and, unlike boosting methods, do not require a complex pre-processing or feature-engineering pipeline. They are a true “plug-and-play” tool for tabular data and can even handle text columns. There are limitations, however: the models are optimized for GPU execution and can be slow on CPUs, even with small datasets. The cost model is another important consideration. Boosting models pay their training and tuning costs once and are extremely efficient at inference, a repetitive task, afterward. ICL models are exactly the opposite: cheap to “fit” but expensive per prediction, which often makes them more costly in production environments with continuous data flow.
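To make this cost asymmetry concrete, here is a minimal benchmarking sketch assuming the xgboost and tabpfn packages; the data is synthetic and the absolute timings are purely illustrative:

```python
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from tabpfn import TabPFNClassifier  # assumes: pip install tabpfn

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("XGBoost", XGBClassifier()), ("TabPFN", TabPFNClassifier())]:
    t0 = time.perf_counter()
    model.fit(X_train, y_train)  # one-time cost; tuning would add more for XGBoost
    t1 = time.perf_counter()
    model.predict(X_test)        # repeated cost: cheap for XGBoost, heavy for ICL
    t2 = time.perf_counter()
    print(f"{name}: fit {t1 - t0:.2f}s, predict {t2 - t1:.2f}s")
```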
Tabular in-context learning is an exciting attempt to break the long-standing dominance of boosting algorithms, and with small datasets it is already succeeding today. Scalability, however, remains a key challenge for future research: the larger the dataset, the better classical methods perform. For large amounts of data, batch processing and subsampling are potential solutions, as sketched below.
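As one possible direction, the following hypothetical helper sketches the subsampling idea: cap the context at a size the ICL model can handle before fitting. A stratified or clustered sample would likely beat the uniform one shown here:

```python
import numpy as np

def fit_icl_on_subsample(model, X_train, y_train, max_context=10_000, seed=0):
    """Hypothetical workaround for the ICL scaling limit: subsample the
    training set (uniformly, for simplicity) to a context size the model
    can handle, then fit as usual. Assumes numpy-array inputs."""
    if len(X_train) > max_context:
        idx = np.random.default_rng(seed).choice(len(X_train), max_context, replace=False)
        X_train, y_train = X_train[idx], y_train[idx]
    model.fit(X_train, y_train)
    return model
```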
As with most AI applications, the best solution depends on the context. Tabular ICL models are a valuable addition to the toolkit, but they do not eliminate the need to select the right model for the specific problem. As experts in customized AI solutions, we at Merantix Momentum always take a holistic approach, giving equal consideration to performance, data, resources, and user experience. This results in solutions that are precisely tailored to our customers’ needs.
Need a hand designing or implementing your AI strategy? Get in touch and we’ll take care of the rest.