ExplainerPFN Shows Zero-Shot Feature Importance Estimation Using Tabular Foundation Models

Researchers are tackling the crucial challenge of explaining predictions made by machine learning models, particularly in scenarios where direct access to those models is unavailable. Joao Fonseca, from INESC-ID and New York University, and Julia Stoyanovich, of New York University, present a novel approach called ExplainerPFN, a tabular foundation model designed to estimate feature importance without needing to evaluate the target model itself. This work is significant because it offers a potential solution for model-free, zero-shot feature importance estimation, bypassing the computational expense and access requirements of traditional methods such as Shapley values. By pre-training on synthetic data and building on TabPFN, ExplainerPFN predicts feature attributions for unseen tabular datasets with performance competitive with existing explanation techniques, and the authors provide a fully open-source implementation to facilitate further research in this area.

Researchers introduced ExplainerPFN, a tabular foundation model designed to estimate Shapley values, a widely used method for explaining model predictions, in a zero-shot setting.

This breakthrough addresses a critical limitation of existing Shapley value computations: they typically demand direct access to the model, an assumption often violated in real-world applications, and they can be computationally expensive. The study reveals that meaningful Shapley value estimations are achievable using only the input data distribution, eliminating the need for model evaluations.
The team achieved this by building ExplainerPFN on the TabPFN architecture and pretraining it on synthetic datasets generated from random structural causal models, supervised with exact or near-exact Shapley values. Once trained, ExplainerPFN predicts feature attributions for previously unseen tabular datasets without needing model access, gradients, or example explanations.
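To make the pretraining setup above more concrete, here is a minimal, illustrative sketch of how synthetic supervision of this kind could be produced: sample a random linear structural causal model, draw a tabular dataset from it, and compute exact Shapley values of a known linear readout to serve as attribution targets. The function names and the linear simplification are assumptions for illustration, not the authors' actual pipeline.

```python
# Illustrative only: random linear SCM -> synthetic dataset -> exact Shapley
# targets. The real pipeline uses richer SCMs and near-exact estimators.
from itertools import combinations
from math import factorial
import numpy as np

rng = np.random.default_rng(0)

def sample_scm(n_features=5):
    """Random linear SCM: each feature is a noisy function of its predecessors."""
    return np.tril(rng.normal(size=(n_features, n_features)), k=-1)

def sample_dataset(weights, n_rows=256):
    n_features = weights.shape[0]
    X = np.zeros((n_rows, n_features))
    for j in range(n_features):
        X[:, j] = X[:, :j] @ weights[j, :j] + rng.normal(size=n_rows)
    beta = rng.normal(size=n_features)          # known linear readout
    y = (X @ beta > 0).astype(int)
    return X, y, beta

def exact_shapley(x, baseline, beta):
    """Exact Shapley values for the additive value function
    v(S) = sum_{j in S} beta_j * (x_j - baseline_j); the exhaustive sum below
    mirrors the general coalition-based definition."""
    n = len(x)
    phi = np.zeros(n)
    value = lambda S: sum(beta[j] * (x[j] - baseline[j]) for j in S)
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            for S in combinations(others, size):
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi[j] += w * (value(S + (j,)) - value(S))
    return phi

weights = sample_scm()
X, y, beta = sample_dataset(weights)
baseline = X.mean(axis=0)
targets = np.stack([exact_shapley(x, baseline, beta) for x in X[:8]])
print(targets.shape)   # (8, 5): one supervision vector per explained row
```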

Experiments demonstrate that few-shot learning-based explanations can achieve high fidelity to SHAP values with as few as two reference observations, representing a significant reduction in required data. This work establishes ExplainerPFN as the first zero-shot method for estimating Shapley values, offering a substantial innovation in model-free feature importance estimation.

Researchers have also provided a fully open-source implementation, including the training pipeline and synthetic data generator, facilitating further research and application. Through extensive experiments on both real and synthetic datasets, the study shows that ExplainerPFN’s performance is competitive with few-shot surrogate explainers that rely on 2 to 10 SHAP examples, highlighting its practical viability.

The research opens new avenues for auditing decisions, identifying potential biases, and improving transparency in black-box machine learning systems, particularly in high-stakes domains like loan approvals, hiring processes, and medical diagnoses. By learning attribution patterns directly from data, ExplainerPFN offers a promising direction in zero-shot explainability, leveraging the power of tabular foundation models to address the growing need for interpretable and trustworthy artificial intelligence.

Training and validation of a zero-shot tabular feature attribution model require careful consideration of data leakage

Scientists developed ExplainerPFN, a novel tabular foundation model designed to estimate Shapley values without requiring access to the underlying machine learning model or example explanations. The research team engineered this zero-shot method to address the critical need for model interpretability in real-world deployments where model access is often restricted.

This approach leverages the data distribution itself to predict feature attributions, moving beyond traditional methods reliant on model internals. Researchers trained ExplainerPFN on synthetic datasets generated using random structural causal models, supervising the model with exact or near-exact Shapley values.

This pre-training phase equipped the model to learn attribution patterns directly from data, enabling it to generalise to unseen tabular datasets. The study pioneered the use of TabPFN as a foundation, adapting its synthetic prior-fitting procedure specifically for feature attribution learning and creating a complete training pipeline.
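The following is a minimal sketch of the prior-fitting idea in this adaptation, assuming a small transformer trained on many synthetic (dataset, attribution-target) pairs so that, at inference, it maps in-context rows to per-feature attributions. The architecture, the toy task generator, and all names are illustrative, not the authors' actual design.

```python
# Prior-fitting sketch: train on freshly sampled synthetic tasks each step.
import torch
import torch.nn as nn

N_FEATURES = 5

def sample_synthetic_task(n_rows=64):
    """Toy task generator: independent linear data whose exact Shapley values
    have the closed form beta_j * (x_j - mean_j), used as supervision targets."""
    X = torch.randn(1, n_rows, N_FEATURES)
    beta = torch.randn(N_FEATURES)
    phi = beta * (X - X.mean(dim=1, keepdim=True))
    return X, phi

class AttributionPFN(nn.Module):
    def __init__(self, n_features, d_model=64, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_features)   # one attribution per feature

    def forward(self, rows):
        # rows: (batch, n_rows, n_features); self-attention lets every row
        # condition on the rest of the dataset, i.e. on the data distribution.
        return self.head(self.encoder(self.embed(rows)))

model = AttributionPFN(N_FEATURES)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(200):                      # fresh synthetic task per step
    X, phi = sample_synthetic_task()
    loss = loss_fn(model(X), phi)
    opt.zero_grad(); loss.backward(); opt.step()
```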

Experiments employed a few-shot learning paradigm, demonstrating that ExplainerPFN can achieve high fidelity to SHAP values with as few as two reference observations. The team assessed performance on both real and synthetic datasets, comparing ExplainerPFN against few-shot surrogate explainers that depend on 2 to 10 SHAP examples.
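For intuition, here is a rough sketch of the kind of few-shot surrogate baseline described above, assuming a scikit-learn classifier and the shap library; the exact baselines used in the paper may differ. The surrogate sees only a handful of (input, SHAP) pairs and must predict attributions for new rows without further model access.

```python
# Few-shot surrogate baseline sketch: learn x -> SHAP from a few reference pairs.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

clf = RandomForestClassifier(random_state=0).fit(X[:200], y[:200])

# Reference explanations: SHAP values for a handful of rows only.
background = shap.sample(X[:200], 50)
explainer = shap.KernelExplainer(lambda z: clf.predict_proba(z)[:, 1], background)
few_shot_X = X[200:210]                       # 10 reference observations
few_shot_phi = explainer.shap_values(few_shot_X)

# Surrogate: fit the mapping x -> phi on those few pairs, then explain
# unseen rows without ever querying clf again.
surrogate = KNeighborsRegressor(n_neighbors=3).fit(few_shot_X, few_shot_phi)
predicted_phi = surrogate.predict(X[210:220])
print(predicted_phi.shape)                    # (10, 5)
```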

This rigorous evaluation confirmed that ExplainerPFN achieves competitive performance, suggesting the feasibility of recovering meaningful feature attribution patterns directly from the data distribution. Furthermore, the study provides a fully open-source implementation of ExplainerPFN, including the synthetic data generator and pretrained weights, facilitating reproducibility and further research. This innovative methodology addresses the limitations of existing feature attribution techniques, which can be sensitive to data correlations and require direct model access, offering a new direction in zero-shot explainability for tabular data.

Zero-shot Shapley value estimation from tabular data using foundation models is a promising research direction

Scientists have developed ExplainerPFN, a novel tabular foundation model capable of estimating Shapley values without direct access to the underlying predictive model. The research team demonstrated that explanations based on ExplainerPFN can achieve high fidelity to SHAP values using as few as two reference observations.

This breakthrough delivers a zero-shot method for estimating Shapley values, a critical step towards model interpretability in real-world deployments where model access is often restricted. Experiments revealed that ExplainerPFN accurately predicts feature attributions for unseen tabular datasets without requiring model access, gradients, or example explanations.

The team measured performance competitive with few-shot surrogate explainers, achieving results comparable to those that rely on 2 to 10 SHAP examples. Data shows that the model effectively recovers meaningful feature attribution patterns directly from the data distribution, opening new avenues for zero-shot explainability.
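One way to quantify "fidelity to SHAP values" in comparisons like this is to score each explained row's estimated attribution vector against its reference SHAP vector; the metrics below (cosine similarity and Spearman rank correlation) are illustrative choices, not necessarily the ones reported in the paper.

```python
# Fidelity sketch: agreement between estimated and reference attribution vectors.
import numpy as np
from scipy.stats import spearmanr

def fidelity(phi_est, phi_ref):
    """Per-row cosine similarity and rank correlation, averaged over rows."""
    cos = [
        float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
        for a, b in zip(phi_est, phi_ref)
    ]
    rho = [spearmanr(a, b)[0] for a, b in zip(phi_est, phi_ref)]
    return float(np.mean(cos)), float(np.mean(rho))

rng = np.random.default_rng(0)
phi_ref = rng.normal(size=(50, 5))                  # reference SHAP values
phi_est = phi_ref + 0.1 * rng.normal(size=(50, 5))  # noisy zero-shot estimates
print(fidelity(phi_est, phi_ref))                   # values near 1.0 mean high fidelity
```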

Researchers introduced an open-source implementation of ExplainerPFN, including the complete training pipeline and a synthetic data generator. Through extensive testing on both real and synthetic datasets, the study confirmed that ExplainerPFN achieves performance competitive with existing few-shot surrogate explainers.

Specifically, the work demonstrates that meaningful feature attribution patterns can be recovered directly from the data distribution, a significant advancement in zero-shot explainability using tabular foundation models. The study details the computation of Shapley values, a concept from cooperative game theory used to fairly distribute gains or costs among players based on their contributions.

Scientists computed the Shapley value φ_j(x_i) of each feature j for each instance x_i using the standard cooperative-game formula (reproduced below), and assessed the fidelity of ExplainerPFN’s estimations against these values. Measurements confirm that the model’s performance is competitive, requiring only minimal reference data to achieve results comparable to methods that rely on extensive model queries.
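For reference, this is the textbook Shapley value definition, with N the set of features, S a coalition, and v_i the value function induced by the prediction task for instance x_i (for example, the model's expected prediction when only the features in S are known); the exact or near-exact estimator used in the paper may differ in detail.

```latex
% Shapley value of feature j for instance x_i (cooperative-game form).
\varphi_j(x_i) \;=\; \sum_{S \subseteq N \setminus \{j\}}
\frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}
\Bigl( v_i\bigl(S \cup \{j\}\bigr) - v_i(S) \Bigr)
```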

Furthermore, the team formally defined a tabular foundation model as a pretrained mapping, F, and demonstrated its application in in-context learning for feature importance estimation. Tests show that the model can accurately estimate per-instance feature importance for a fixed, inaccessible predictive model without any prior knowledge of feature importance scores. The research provides a foundation for future work in zero-shot explainability and model interpretation.
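Concretely, and with notation that is illustrative rather than lifted from the paper, such a pretrained mapping can be read as taking an in-context dataset and a query instance to an estimated attribution vector, with no access to the deployed model:

```latex
% Illustrative signature: F consumes an in-context dataset D of n rows with
% d features (and labels) plus a query instance x_q, and returns an estimated
% per-feature attribution vector for x_q.
F : \Bigl(\underbrace{\{(x_i, y_i)\}_{i=1}^{n}}_{\text{in-context dataset } D},\;
\underbrace{x_q \in \mathbb{R}^{d}}_{\text{query instance}}\Bigr)
\;\longmapsto\; \hat{\varphi}(x_q) \in \mathbb{R}^{d}
```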

Zero-shot feature attribution via foundation models and Shapley values offers interpretable insights

Scientists have developed a new method for estimating feature importance in supervised classification tasks without requiring access to the underlying model. Researchers introduced ExplainerPFN, a tabular foundation model pretrained on synthetic data generated from structural causal models and supervised with Shapley values.

This model predicts feature attributions for unseen tabular datasets, offering a zero-shot approach to model-free explainability. The key contribution lies in demonstrating that meaningful Shapley value estimations are achievable without model access or reference explanations, relying solely on the input data distribution.

ExplainerPFN achieves performance comparable to few-shot surrogate explainers, while significantly reducing computational costs when direct Shapley value calculation is impractical. The authors acknowledge limitations including the approximate nature of the attributions and potential reduced reliability in high-dimensional settings or when the pretraining distribution does not adequately represent real-world data.

Future research directions include exploring formal guarantees, robustness to distribution shifts, and best practices for integrating this tool into responsible machine learning pipelines. This work suggests that valuable attribution structure can be recovered from data alone, offering a promising avenue for scalable, model-free explainability in tabular data analysis.

The findings have potential positive impacts on transparency in automated decision systems, preliminary fairness audits, and reducing computational burdens. However, users should interpret these explanations cautiously, recognising they are approximations and not substitutes for model-access methods when available.

👉 More information
🗞 ExplainerPFN: Towards tabular foundation models for model-free zero-shot feature importance estimations
🧠 ArXiv: https://arxiv.org/abs/2601.23068

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
