Machine Learning Advances MEP Prediction with Quadrupole Moments, Improving QM9 Dataset Accuracy

Understanding how molecules interact relies heavily on accurately describing their electrostatic properties, and researchers are increasingly turning to machine learning to accelerate these calculations. Kadri Muuga, Lisanne Knijff, and Chao Zhang, all from the Department of Chemistry-Ångström Laboratory at Uppsala University, investigated whether machine learning models trained on dipole and quadrupole moments could effectively predict molecular electrostatic potentials. Their work demonstrates that incorporating quadrupole contributions significantly enhances the accuracy of these models, improving predictions of intermolecular interactions. The finding, validated on both the QM9 and SPICE datasets, highlights the quadrupole moment’s role in developing faster, more efficient computational methods for modelling chemical systems and exploring broad regions of chemical space. The research offers a pathway towards rapid access to molecular electrostatic potentials, with potential applications in fields such as drug discovery and materials science.

Molecular electrostatic potentials (MEPs) provide a concise way of describing molecular interactions. The team trained PiNet2, an equivariant convolutional neural network architecture, on dipole and quadrupole moments. On the established QM9 dataset, incorporating the quadrupole contribution significantly enhances a model's ability to recover the MEP compared with models that use only dipole moments. Results from the SPICE dataset, which covers a considerably wider range of organic chemical space, corroborate this trend. The research highlights the quadrupole moment as a key training target for machine learning models designed to give rapid access to the MEP.

Summary Method

This research paper presents the development and validation of PhysNet, a novel machine learning model designed to predict multiple molecular properties directly from atomic structure. Unlike many existing approaches that primarily focus on energies and forces and require separate post-processing steps for other properties, PhysNet is able to simultaneously predict energies, atomic forces, dipole moments, and partial charges within a single, unified framework. This direct and integrated prediction capability represents a significant advancement in the application of machine learning to molecular modeling.

The model is trained and evaluated using two major datasets. The first is QM9, which contains approximately 134,000 small organic molecules with quantum-mechanically computed properties and serves as a standard benchmark in the field. The second is the larger and more diverse SPICE dataset, comprising over 270,000 entries that include drug-like molecules and peptides, making it particularly relevant for biomolecular applications. The PhysNet architecture is specifically designed to handle the tensorial nature of molecular properties, ensuring that physical symmetries and interactions are appropriately captured. Training relies on high-quality reference data generated from density functional theory calculations using the ωB97M-V functional with the def2-SVP basis set, and model optimization is performed using the Adam algorithm.

In terms of performance, PhysNet achieves state-of-the-art or highly competitive accuracy across both datasets, as measured by root mean squared error for energies, forces, dipole moments, and partial charges. The model demonstrates especially strong performance in predicting dipole moments and atomic charges, properties that are often challenging for machine learning potentials. Moreover, PhysNet exhibits good transferability, showing the ability to generalize effectively to molecular systems that were not included in the training set. Comparisons with established methods such as ANI-1x, SchNet, and DimeNet++ further highlight the robustness and competitiveness of the approach.

Overall, this work introduces a powerful and versatile machine learning potential that significantly broadens the scope of directly predictable molecular properties. By integrating the prediction of dipole moments and partial charges alongside energies and forces, PhysNet provides a more complete and physically meaningful description of molecular systems. In addition, the introduction and use of the SPICE dataset represents a valuable contribution to the community, offering a rich resource for developing and benchmarking next-generation machine learning models in computational chemistry and materials science.

Quadrupole Moments Enhance Electrostatic Potential Prediction

The study examined how well machine learning (ML) models can predict the molecular electrostatic potential (MEP). It focused on the ability of PiNet2, an equivariant convolutional architecture, to infer the MEP after training on molecular dipole and quadrupole moments. Incorporating quadrupole contributions into the ML models substantially improved MEP recovery compared with models relying solely on dipole moments. This enhancement was observed consistently across both the QM9 and SPICE datasets, demonstrating the robustness of the approach. The team assessed seven distinct PiNet2-based ML models, which vary in their use of atomic charge (AC) and atomic dipole (AD) predictions alongside dipole and quadrupole moments.
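
To make the idea of recovering the MEP from predicted atomic charges concrete, here is a minimal Python sketch. It assumes the MEP is approximated by a Coulomb sum over atom-centred point charges in atomic units, and that agreement between two charge sets is scored by a root mean squared error over probe points. The function names, toy charges, and probe points are illustrative assumptions, not the authors' code or evaluation protocol.

```python
import math

def mep_from_charges(charges, positions, point):
    """Coulomb potential (atomic units) at `point` from atom-centred point charges."""
    return sum(q / math.dist(pos, point) for q, pos in zip(charges, positions))

def mep_rmse(charges_a, charges_b, positions, probe_points):
    """RMSE between the MEPs generated by two charge sets on the same probe points."""
    squared = 0.0
    for point in probe_points:
        diff = (mep_from_charges(charges_a, positions, point)
                - mep_from_charges(charges_b, positions, point))
        squared += diff * diff
    return math.sqrt(squared / len(probe_points))

# Toy diatomic (hypothetical numbers): reference charges vs. a slightly-off model.
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
reference_charges = [0.3, -0.3]
model_charges = [0.25, -0.25]
probe_points = [(0.0, 0.0, 4.0), (3.0, 0.0, 1.0), (0.0, -3.0, 1.0)]

print(mep_rmse(reference_charges, model_charges, positions, probe_points))
```

In a real assessment the probe points would lie on a surface or grid around the molecule and the reference MEP would come from the quantum-chemical calculation rather than from a second set of point charges.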

The AC-DQ model, which uses AC predictions to fit both dipole and quadrupole moments with a 1:1 loss weight ratio, showed marked improvement. The AC-DQ-dw100 model, which employs a 100:1 dipole-to-quadrupole loss weight ratio, optimised performance further. The results indicate that regressing molecular multipoles, which are experimentally observable quantities, provides a generic strategy for rapidly inferring the MEP from ML models without direct training on the MEP or the electron density. For the QM9 dataset, consisting of 133,885 stable organic molecules, the researchers computed dipole and quadrupole moments at the B3LYP/6-31G(2df,p) level of theory.
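
The following Python sketch illustrates how a loss of this kind could be assembled from predicted atomic charges. It is written under stated assumptions: the dipole is taken as the charge-weighted sum of positions, the quadrupole uses the common traceless convention with a factor of 3, and the two mean squared errors are combined with scalar weights. The function names and the exact loss form are hypothetical and are not taken from the paper or from the PiNet2 code.

```python
def dipole(charges, positions):
    """Dipole moment mu_a = sum_i q_i * r_ia."""
    return [sum(q * r[a] for q, r in zip(charges, positions)) for a in range(3)]

def traceless_quadrupole(charges, positions):
    """Assumed convention: Q_ab = sum_i q_i * (3 r_ia r_ib - |r_i|^2 delta_ab)."""
    quad = [[0.0] * 3 for _ in range(3)]
    for q, r in zip(charges, positions):
        r2 = r[0] ** 2 + r[1] ** 2 + r[2] ** 2
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                quad[a][b] += q * (3.0 * r[a] * r[b] - r2 * delta)
    return quad

def multipole_loss(pred_charges, positions, ref_dipole, ref_quad,
                   dipole_weight=1.0, quadrupole_weight=1.0):
    """Weighted sum of the dipole and quadrupole mean squared errors."""
    mu = dipole(pred_charges, positions)
    quad = traceless_quadrupole(pred_charges, positions)
    dipole_mse = sum((m - ref) ** 2 for m, ref in zip(mu, ref_dipole)) / 3.0
    quad_mse = sum((quad[a][b] - ref_quad[a][b]) ** 2
                   for a in range(3) for b in range(3)) / 9.0
    return dipole_weight * dipole_mse + quadrupole_weight * quad_mse

# Toy linear molecule (hypothetical numbers); reference multipoles from "true" charges.
positions = [(0.0, 0.0, -1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
true_charges = [-0.5, 1.0, -0.5]
ref_mu = dipole(true_charges, positions)
ref_q = traceless_quadrupole(true_charges, positions)

predicted = [-0.4, 0.8, -0.4]
loss_1_to_1 = multipole_loss(predicted, positions, ref_mu, ref_q)
loss_dw100 = multipole_loss(predicted, positions, ref_mu, ref_q, dipole_weight=100.0)
```

In this toy case the dipole vanishes by symmetry for both charge sets, so only the quadrupole term can tell the predicted charges apart from the true ones, regardless of the dipole weight.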

During data filtering, 9,723 molecules were excluded due to inconsistencies, and the remaining molecules were used for model training. The quadrupole moments from Gaussian were traceless and were scaled by a factor of 3 to match the paper’s equation for the quadrupole moment. The same large PiNet2 model size that had previously worked well for dipole predictions was used for this combined approach. Expanding the study, the team applied their methodology to the SPICE 2.0 dataset, comprising 91,420 organic structures. The data were filtered to include only the lowest-energy conformers, providing a chemical space complementary to QM9. A medium-sized model proved sufficient for training on the SPICE dataset, highlighting the adaptability of the approach. The method offers a way to predict the MEP rapidly and accurately, with potential applications in drug discovery, materials science, and broader areas of computational chemistry.
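
The lowest-energy conformer filter described above can be sketched in a few lines of Python. This is an illustration only: the record layout (a `molecule` identifier and an `energy` field) and the function name are assumptions made for the example and do not reflect the actual SPICE file format or the authors' preprocessing script.

```python
def lowest_energy_conformers(records):
    """Keep one record per molecule: the conformer with the lowest energy."""
    best = {}
    for record in records:
        key = record["molecule"]
        if key not in best or record["energy"] < best[key]["energy"]:
            best[key] = record
    return list(best.values())

# Hypothetical records: two conformers of one molecule, one of another.
records = [
    {"molecule": "mol-A", "conformer": 0, "energy": -76.40},
    {"molecule": "mol-A", "conformer": 1, "energy": -76.43},
    {"molecule": "mol-B", "conformer": 0, "energy": -40.50},
]
selected = lowest_energy_conformers(records)
```

Each molecule then contributes a single structure to training, which keeps the set comparable to QM9, where every molecule appears once.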

This research demonstrates that machine-learning models can effectively infer molecular electrostatic potential (MEP) using equivariant convolutional architectures. Importantly, the inclusion of quadrupole moments significantly improves the accuracy of MEP recovery compared to models relying solely on dipole moments, a finding consistently observed across both the QM9 and SPICE datasets. These results suggest the quadrupole moment is a more effective primary target for machine-learning based atomic charge models than the dipole moment alone. The study establishes that while dipole moments traditionally dominate understanding of neutral molecules, incorporating quadrupole information enhances the predictive power of models designed to reproduce the MEP.
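
A small worked example shows why dipole information alone can miss the MEP. The Python sketch below uses the standard far-field multipole expansion in atomic units, V(r) ≈ μ·r/|r|³ + ½ Σ Q_ab r_a r_b/|r|⁵ for a neutral molecule, with the quadrupole assumed to follow the traceless convention Q_ab = Σ q_i (3 r_ia r_ib − |r_i|² δ_ab). The three-charge linear system and all function names are toy assumptions chosen for illustration and are not taken from the paper.

```python
import math

def exact_potential(charges, positions, point):
    """Coulomb potential at `point` from point charges (atomic units)."""
    return sum(q / math.dist(pos, point) for q, pos in zip(charges, positions))

def dipole_term(mu, point):
    """Dipole contribution: mu . r / |r|^3."""
    r = math.sqrt(sum(x * x for x in point))
    return sum(m * x for m, x in zip(mu, point)) / r ** 3

def quadrupole_term(quad, point):
    """Quadrupole contribution: (1/2) sum_ab Q_ab r_a r_b / |r|^5."""
    r = math.sqrt(sum(x * x for x in point))
    contraction = sum(quad[a][b] * point[a] * point[b]
                      for a in range(3) for b in range(3))
    return 0.5 * contraction / r ** 5

# Toy linear, neutral arrangement of charges with zero dipole by symmetry.
charges = [-0.5, 1.0, -0.5]
positions = [(0.0, 0.0, -1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
# Multipoles of these charges under the assumed convention.
mu = [0.0, 0.0, 0.0]
quad = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -2.0]]

point = (0.0, 0.0, 10.0)
v_exact = exact_potential(charges, positions, point)
v_dipole_only = dipole_term(mu, point)
v_with_quadrupole = v_dipole_only + quadrupole_term(quad, point)
```

Here the dipole-only estimate is exactly zero, while adding the quadrupole term brings the estimate to within about one percent of the exact point-charge potential at this distance.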

This is particularly valuable as the quadrupole moment, like the dipole, is experimentally measurable but requires less data storage than a full charge density. Consequently, the work offers a practical approach to rapidly accessing the MEP, a crucial element in the design of solvents and electrolytes. The authors acknowledge that optimizing for quadrupole moments can slightly reduce the accuracy of predicted dipole moments. Future research could explore strategies to simultaneously refine both properties within the models. This work was supported by funding from the European Research Council, the Wallenberg Initiative Materials Science for Sustainability, and the Swedish Research Council, with computational resources provided by the National Academic Infrastructure for Supercomputing in Sweden.

👉 More information
🗞 Molecular electrostatic potentials from machine learning models for dipole and quadrupole predictions
🧠 ArXiv: https://arxiv.org/abs/2601.10320

Rohail T.