The accurate modelling of interactions between multiple electrons – the many-body problem – continues to present a significant challenge in materials science, particularly when determining the excited-state properties crucial to understanding optical and transport behaviours. Bowen Hou, Xian Xu, and colleagues at Yale University present MBFormer, a novel machine learning model designed to learn these complex many-body interactions directly from inexpensive mean-field starting-point calculations. The model utilises the transformer architecture, commonly employed in natural language processing, whose attention mechanism captures correlations between electronic states, bypassing the need for computationally expensive traditional methods such as the GW plus Bethe-Salpeter equation (GW-BSE) formalism. Trained on a substantial dataset of two-dimensional materials, MBFormer achieves high accuracy in predicting quasiparticle and exciton energies, demonstrating a potential pathway towards accelerated materials discovery and characterisation.
The convergence of machine learning (ML) and first-principles calculations is accelerating materials discovery and enabling the study of complex systems. While density functional theory (DFT) remains the dominant approach, it is limited by its ground-state nature and inability to accurately describe excited-state properties governed by many-body interactions. Traditional many-body perturbation theory (MBPT) methods, such as the GW and Bethe-Salpeter equation (BSE) formalisms, offer high accuracy but are computationally expensive, typically scaling as the fourth power of system size or worse in conventional implementations.
Recent approaches have explored learning post-DFT quasiparticle band structures and spectral functions, employing techniques such as manual feature optimisation, graph neural networks (GNNs) for Green's functions, and unsupervised representation learning of Kohn-Sham (KS) states. These models, however, are typically limited to specific many-body problems and lack a unified mechanism for capturing long-range correlations directly from mean-field inputs.
To address these limitations, researchers have developed MBFormer, a novel transformer-based architecture designed to learn the entire many-body hierarchy from ground-state mean-field wavefunctions. Transformers, renowned for their parallelism, scalability, and attention mechanisms, offer a promising framework for capturing complex interactions. MBFormer provides an end-to-end pipeline that tokenises and embeds KS states, then decodes task-specific tokens to predict excited-state properties, such as quasiparticle energies and exciton wavefunctions.
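As a rough illustration of this pipeline, the sketch below embeds per-state mean-field features as tokens, passes them through a standard transformer encoder so that self-attention can capture state-state correlations, and decodes a per-state scalar such as a quasiparticle energy correction. The module names, feature dimensions, and the single-scalar task head here are illustrative assumptions, not the authors' implementation, which may tokenise and decode differently.

```python
import torch
import torch.nn as nn

class MBFormerSketch(nn.Module):
    """Minimal sketch (assumed, not the paper's code): tokenise KS
    states, attend over them, and predict one excited-state scalar
    per state, e.g. a GW quasiparticle correction."""

    def __init__(self, d_feature: int = 64, d_embed: int = 128,
                 n_heads: int = 8, n_layers: int = 4):
        super().__init__()
        # Each KS state (band, k-point) becomes one token built from
        # mean-field features such as wavefunction descriptors and
        # the DFT eigenvalue. The feature set is a placeholder here.
        self.embed = nn.Linear(d_feature, d_embed)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_embed, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, n_layers)
        # Task-specific head: one scalar per token.
        self.head = nn.Linear(d_embed, 1)

    def forward(self, ks_features: torch.Tensor) -> torch.Tensor:
        # ks_features: (batch, n_states, d_feature)
        tokens = self.embed(ks_features)
        attended = self.encoder(tokens)  # long-range state-state correlations
        return self.head(attended).squeeze(-1)  # (batch, n_states)

model = MBFormerSketch()
fake_states = torch.randn(2, 32, 64)  # 2 materials, 32 KS states each
print(model(fake_states).shape)       # torch.Size([2, 32])
```

The key design point this sketch captures is that attention lets every electronic state exchange information with every other state in a single layer, rather than relying on a fixed local neighbourhood as a graph model typically would.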
The model was trained on a dataset of 721 two-dimensional semiconductors from the C2DB database, utilising GW-BSE calculations as ground truth. MBFormer achieves state-of-the-art performance in predicting quasiparticle energies, exhibiting a mean absolute error (MAE) of 0.16 eV and a high R² value of 0.97. This demonstrates the model’s ability to accurately predict excited-state properties across diverse materials.
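For readers unfamiliar with the two reported metrics, the snippet below shows how they are computed; the numerical arrays are made-up stand-ins for GW-BSE reference energies and model predictions, while the paper's actual reported figures are an MAE of 0.16 eV and an R² of 0.97.

```python
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error: average magnitude of the prediction error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Coefficient of determination: fraction of variance explained."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only (eV), not data from the paper.
ref = np.array([1.92, 2.41, 0.87, 3.10])
pred = np.array([1.80, 2.55, 0.95, 3.02])
print(f"MAE = {mae(ref, pred):.2f} eV, R^2 = {r2(ref, pred):.2f}")
```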
The architecture provides a unified framework for predicting various excited-state properties from a single mean-field input, offering a potentially scalable and efficient approach to modelling complex materials behaviour. The demonstrated capability of learning complex many-body effects from ground-state inputs positions MBFormer as a promising foundation model for accelerating materials discovery and design. Future work will likely focus on expanding the dataset to include a wider range of materials and properties, exploring different transformer architectures, and integrating the model into existing computational workflows.
More information
MBFormer: A General Transformer-based Learning Paradigm for Many-body Interactions in Real Materials
DOI: https://doi.org/10.48550/arXiv.2507.05480
