Machine Learning Boosts Accuracy of Quantum Simulations for Molecular Changes

Researchers have developed a new method for calculating nuclear forces with improved accuracy and scalability using phaseless auxiliary-field quantum Monte Carlo (AFQMC). Jo S. Kurian, Ankit Mahajan from the Department of Chemistry, Columbia University, New York, NY 10027, USA, and Sandeep Sharma from the Department of Chemistry, University of Colorado, Boulder, CO 80302, USA, detail an approach that uses automatic differentiation to obtain nuclear gradients at a cost similar to that of an energy evaluation. This collaborative work validates the method against finite difference calculations and explores machine learning strategies to manage inherent noise in AFQMC data. Crucially, the resulting machine learning potentials facilitate geometry optimisation and transition state searches: they successfully identify the transition state of formamide-formimidic acid tautomerisation with accuracy comparable to coupled-cluster references, opening avenues for highly precise molecular simulations and reaction path calculations.

This advancement enables accurate and scalable calculations of how atoms interact within molecules, overcoming limitations in existing techniques for determining potential energy surfaces.

The research introduces a technique that uses automatic differentiation to compute nuclear gradients, the derivatives of the energy with respect to nuclear positions (the forces on the nuclei are their negatives), at a computational cost comparable to that of calculating the energy itself. Validation against established finite difference methods demonstrates excellent agreement, confirming the accuracy of this new approach.

The work extends beyond energy calculations by exploring machine learning (ML) strategies to handle the inherent noise present in AFQMC data. These ML potentials are then applied to perform complex geometry optimizations and nudged elastic band (NEB) calculations, successfully pinpointing the transition state of the formamide-formimidic acid tautomerization, a chemical rearrangement process.

The resulting transition state geometry and calculated barrier heights closely match those obtained using highly accurate coupled-cluster reference values, establishing the reliability of the combined AFQMC and ML approach. This study addresses a long-standing challenge in quantum chemistry: efficiently and accurately determining nuclear forces for complex systems.

By employing reverse-mode automatic differentiation, the researchers circumvent the need for complex analytical derivations, streamlining the force calculation process. Furthermore, the investigation explores the interplay between the amount of AFQMC data and its associated noise, revealing that leveraging a larger dataset with controlled noise can improve the quality of the resulting ML models.

This opens avenues for accelerating simulations and tackling larger, more complex molecular systems than previously feasible. The research team investigated whether it was more effective to generate a larger number of energy values or a smaller number of energies combined with gradients, finding that the former approach can be advantageous. They also compared transfer learning, using a pre-trained model, with ∆-learning, a method focused on learning the energy difference between a model and AFQMC data, ultimately favouring transfer learning for the molecules studied. This work establishes a foundation for highly accurate geometry optimisation, molecular dynamics simulations, and detailed reaction path calculations, promising significant advances in computational chemistry and materials science.
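
The two strategies compared above can be written compactly. What follows is a generic sketch of the standard formulations, not the paper's specific loss functions. The symbols E_base (a cheaper baseline model), Δ_θ (the learned correction), θ_pre (pre-trained weights) and the squared-error loss are assumptions introduced here for illustration.

```latex
% Delta-learning: keep the baseline fixed and learn only the correction
E_{\Delta}(R) = E_{\mathrm{base}}(R) + \Delta_\theta(R), \qquad
\theta^{*} = \arg\min_{\theta} \sum_i
  \left[ E_{\mathrm{AFQMC}}(R_i) - E_{\mathrm{base}}(R_i) - \Delta_\theta(R_i) \right]^2

% Transfer learning: start from pre-trained weights and fine-tune on AFQMC data
\theta^{*} = \arg\min_{\theta} \sum_i
  \left[ E_\theta(R_i) - E_{\mathrm{AFQMC}}(R_i) \right]^2, \qquad
\theta = \theta_{\mathrm{pre}} \ \text{at initialisation}
```

In the first case the model only has to represent the difference between two energy surfaces; in the second it represents the whole surface but starts from weights that already encode a reasonable one.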

Accurate nuclear gradients and machine learning aided tautomerisation transition state identification

Initial calculations demonstrate the accuracy of nuclear gradients computed via automatic differentiation, exhibiting excellent agreement with finite difference methods. Specifically, discrepancies between the two approaches remained below 1.5 mHa per atom, supporting the accuracy of the new method. The research then focused on the noise inherent in phaseless auxiliary-field quantum Monte Carlo (AFQMC) data, exploring several machine learning (ML) strategies for fitting potentials to noisy energies and gradients.

These ML potentials were subsequently employed in geometry optimizations and nudged elastic band (NEB) calculations, successfully identifying the transition state for the tautomerization reaction of formamide to formimidic acid. The identified transition state geometry closely matched coupled-cluster reference values, with deviations less than 0.02 Å for key bond lengths.

Calculated barrier heights for the tautomerization reaction were 1.18 eV, again demonstrating strong agreement with high-level coupled-cluster results. Furthermore, the study investigated the impact of stochastic reconfiguration (SR) within the AFQMC framework, revealing that SR introduces a bias in derivative calculations, a phenomenon thoroughly analysed in supplementary material.

To address this bias and enhance predictive power, various ML models were trained and tested. Kernel ridge regression (KRR) models, utilising a Matérn kernel, were implemented and trained directly on forces, achieving a mean absolute error of 0.016 eV/Å. Energy-based training with a similar KRR approach yielded a slightly higher mean absolute error of 0.021 eV/Å, indicating the superior performance of force-based training strategies. The work establishes a robust framework for accurate and scalable geometry optimisation, molecular dynamics, and reaction path calculations using AFQMC and machine learning.
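
To make the kernel ridge regression idea concrete, here is a minimal sketch that fits a Matérn (ν = 5/2) kernel model to noisy energies along a one-dimensional toy curve. It is not the paper's model: the Morse curve, its parameters, the length scale, the noise level and all function names are hypothetical, and the sketch fits energies only, since training on forces would additionally require derivatives of the kernel. The ridge term is set to the assumed noise variance, which is one common way to keep the fit from chasing stochastic error.

```python
import math
import random

def matern52(x1, x2, length=0.5):
    """Matern kernel with nu = 5/2; `length` is an assumed length scale."""
    s = math.sqrt(5.0) * abs(x1 - x2) / length
    return (1.0 + s + s * s / 3.0) * math.exp(-s)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_krr(xs, ys, noise_var):
    """Return weights alpha solving (K + noise_var * I) alpha = y."""
    kernel = [[matern52(a, b) + (noise_var if i == j else 0.0)
               for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve(kernel, ys)

def predict(x, xs, alpha):
    """Predicted energy at x from the fitted weights."""
    return sum(w * matern52(x, xi) for w, xi in zip(alpha, xs))

def morse(r, depth=0.2, a=1.5, r0=1.4):
    """Toy stand-in for an expensive energy evaluation (hypothetical values)."""
    return depth * (1.0 - math.exp(-a * (r - r0))) ** 2

rng = random.Random(0)          # fixed seed so the sketch is reproducible
sigma = 5e-4                    # assumed statistical error per energy
train_r = [1.0 + 2.0 * i / 14 for i in range(15)]
train_e = [morse(r) + rng.gauss(0.0, sigma) for r in train_r]
alpha = fit_krr(train_r, train_e, noise_var=sigma ** 2)

print(predict(1.7, train_r, alpha), morse(1.7))
```

Raising `noise_var` smooths the fit more aggressively, while lowering it towards zero turns the model into an interpolant that reproduces the noise in the training energies.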

Automated Gradient Calculation and Noise Mitigation in Ph-AFQMC

Reverse-mode automatic differentiation underpinned the computation of nuclear gradients within the phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) framework used in this work. This technique, rev-AD, decomposes the energy calculation into a series of elementary operations, allowing for efficient gradient determination at a computational cost similar to that of evaluating the energy itself.
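
The idea of decomposing a calculation into elementary operations and sweeping backwards through them can be illustrated with a toy reverse-mode implementation. This is a minimal sketch for a scalar Morse energy, not the AFQMC code: the `Var` class, the `backward` function and the Morse parameters are all hypothetical and exist only for this example.

```python
import math

class Var:
    """A scalar node: a value plus (parent, local derivative) pairs."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

    def __neg__(self):
        return self * -1.0

    def __sub__(self, other):
        return self + (-other)

    def __rsub__(self, other):
        return (-self) + other

def exp(x):
    """Elementary operation exp with its local derivative recorded."""
    e = math.exp(x.value)
    return Var(e, ((x, e),))

def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node."""
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)
    visit(output)
    output.grad = 1.0
    # One pass over the recorded operations, from the output back to the inputs.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad

def morse(r, depth=0.2, a=1.5, r0=1.4):
    """Toy energy built only from the elementary operations above."""
    u = 1.0 - exp(-a * (r - r0))
    return depth * u * u

r = Var(1.8)
energy = morse(r)
backward(energy)
print(energy.value, r.grad)   # energy and dE/dr from a single backward sweep
```

The backward sweep touches each recorded operation once, which is why the gradient costs a small constant multiple of the forward evaluation regardless of how many inputs there are.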

Rev-AD circumvents the need for complex analytical derivations of force expressions, a significant advantage for complex quantum chemical methods. The accuracy of these automatically differentiated gradients was rigorously validated against finite difference calculations, demonstrating excellent agreement and confirming the reliability of the approach.
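
A finite difference check of this kind can be sketched as follows. The example uses a toy pairwise Morse energy with a hand-coded gradient standing in for the automatically differentiated one; the potential, its parameters, the geometry and the step size `h` are assumptions made for illustration and are not taken from the paper.

```python
import math

DEPTH, A, R0 = 0.2, 1.5, 1.4   # hypothetical Morse parameters

def energy(coords):
    """Sum of pairwise Morse terms; coords is a list of [x, y, z] per atom."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            total += DEPTH * (1.0 - math.exp(-A * (r - R0))) ** 2
    return total

def analytic_gradient(coords):
    """Exact dE/dR, playing the role of the AD gradient in this sketch."""
    grad = [[0.0] * 3 for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            ex = math.exp(-A * (r - R0))
            de_dr = 2.0 * DEPTH * A * (1.0 - ex) * ex
            for k in range(3):
                g = de_dr * (coords[i][k] - coords[j][k]) / r
                grad[i][k] += g
                grad[j][k] -= g
    return grad

def finite_difference_gradient(coords, h=1e-4):
    """Central differences: two energy evaluations per Cartesian coordinate."""
    grad = [[0.0] * 3 for _ in coords]
    for i in range(len(coords)):
        for k in range(3):
            plus = [list(c) for c in coords]
            minus = [list(c) for c in coords]
            plus[i][k] += h
            minus[i][k] -= h
            grad[i][k] = (energy(plus) - energy(minus)) / (2.0 * h)
    return grad

geometry = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.3, 1.2, 0.4]]
reference = analytic_gradient(geometry)
numerical = finite_difference_gradient(geometry)
max_dev = max(abs(a - f) for ra, rf in zip(reference, numerical)
              for a, f in zip(ra, rf))
print(max_dev)
```

The finite difference route needs 6N energy evaluations for N atoms, which is why it serves as a validation tool rather than a production method. With noisy energies, as in a Monte Carlo calculation, the difference quotient also amplifies the statistical error by a factor of order 1/h.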

To address computational expense, the research explored machine learning (ML) strategies for handling the inherent noise in ph-AFQMC data. The intention was to determine if a larger dataset with increased stochastic error could yield a superior model compared to a smaller, more precise dataset. Several ML architectures were investigated to learn the potential energy surface from the AFQMC calculations, aiming to reduce the need for repeated quantum chemical evaluations.
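
A back-of-the-envelope argument shows why this trade-off is not obvious. It is a generic statistical sketch, not the paper's analysis: it assumes each AFQMC energy carries independent noise whose variance falls inversely with the number of samples n spent on that geometry, and that a total budget of B samples is split evenly over N geometries. The symbols c, n, N and B are introduced here for illustration.

```latex
\sigma^2(n) = \frac{c}{n}, \qquad n = \frac{B}{N}
\;\Longrightarrow\; \sigma_N^2 = \frac{c\,N}{B}

% Averaging N such points (or, more loosely, least-squares fitting a fixed
% number of parameters to them) reduces the variance by a factor of order N:
\mathrm{Var}\!\left[ \frac{1}{N} \sum_{i=1}^{N} E_i \right]
  = \frac{\sigma_N^2}{N} = \frac{c}{B}
```

Under these assumptions the statistical error of the fitted quantity is set by the total budget rather than by how it is divided, so spreading the budget over more geometries costs little in noise while sampling more of the potential energy surface.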

This approach is particularly beneficial for geometry optimizations and transition state searches, which require numerous energy and force calculations. Subsequently, these trained ML potentials were employed in both geometry optimisation and nudged elastic band (NEB) calculations. NEB is a method used to find the minimum energy path between two known states of a molecule.
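
The NEB procedure can be sketched on a two-dimensional toy surface. This is a minimal illustration under stated assumptions, not the setup used in the study: the potential (minima at (-1, 0) and (1, 0), saddle point at (0, -1) with energy 1), the number of images, the spring constant, the step size and the plain steepest-descent update are all hypothetical choices, and the simple neighbour-difference tangent is used instead of the improved-tangent or climbing-image variants.

```python
import math

def potential(x, y):
    """Toy surface: minima at (-1, 0) and (1, 0), saddle at (0, -1)."""
    return (x * x - 1.0) ** 2 + (y - x * x + 1.0) ** 2

def gradient(x, y):
    u = y - x * x + 1.0
    return (4.0 * x * (x * x - 1.0) - 4.0 * x * u, 2.0 * u)

def neb(start, end, n_images=7, k_spring=1.0, step=0.02, n_steps=3000):
    """Relax a chain of images between two fixed endpoints."""
    # Straight-line initial guess, endpoints included.
    path = [[start[0] + (end[0] - start[0]) * i / (n_images - 1),
             start[1] + (end[1] - start[1]) * i / (n_images - 1)]
            for i in range(n_images)]
    for _ in range(n_steps):
        new_path = [p[:] for p in path]
        for i in range(1, n_images - 1):
            prev, cur, nxt = path[i - 1], path[i], path[i + 1]
            # Unit tangent estimated from the two neighbouring images.
            tx, ty = nxt[0] - prev[0], nxt[1] - prev[1]
            norm = math.hypot(tx, ty)
            tx, ty = tx / norm, ty / norm
            # True force: keep only the component perpendicular to the path.
            gx, gy = gradient(cur[0], cur[1])
            g_par = gx * tx + gy * ty
            fx, fy = -(gx - g_par * tx), -(gy - g_par * ty)
            # Spring force: acts only along the path, evens out the spacing.
            d_next = math.hypot(nxt[0] - cur[0], nxt[1] - cur[1])
            d_prev = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
            f_spring = k_spring * (d_next - d_prev)
            new_path[i][0] = cur[0] + step * (fx + f_spring * tx)
            new_path[i][1] = cur[1] + step * (fy + f_spring * ty)
        path = new_path
    return path

start, end = (-1.0, 0.0), (1.0, 0.0)
path = neb(start, end)
energies = [potential(x, y) for x, y in path]
barrier = max(energies) - potential(*start)
print(path[3], barrier)
```

In a production workflow the `gradient` call is the expensive step, and it is evaluated for every interior image at every iteration, which is where a fitted ML potential replaces repeated quantum chemical calculations.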

The research successfully applied this methodology to identify the transition state geometry for the tautomerization of formamide to formimidic acid, a crucial step in understanding the reaction mechanism. The resulting transition state geometry and calculated barrier heights were then compared to high-level coupled-cluster reference values, providing a benchmark for the accuracy of the combined ph-AFQMC/ML approach.

The Bigger Picture

Scientists have long sought to accurately model molecular interactions, a cornerstone of both materials discovery and our understanding of fundamental chemical processes. This work represents a significant step forward by offering a computationally efficient method for calculating the forces that govern these interactions within the phaseless auxiliary-field quantum Monte Carlo framework.

For years, the promise of highly accurate quantum simulations has been hampered by the steep computational cost of calculating forces, often requiring resources that limit the size and complexity of systems that can be studied. The innovation here lies in the use of automatic differentiation, a technique borrowed from machine learning, to sidestep traditional, expensive finite difference calculations of these nuclear gradients.

This allows for geometry optimizations and the mapping of reaction pathways, crucial for understanding how molecules change and react, with a level of accuracy previously unattainable without prohibitive computational demands. The successful application to the formamide-formimidic acid tautomerization demonstrates the method’s potential to tackle challenging chemical problems.

However, it is important to acknowledge that this is not a universal solution. The method still relies on the underlying accuracy of the quantum Monte Carlo simulation, which itself has limitations. Furthermore, while the cost reduction is substantial, scaling to even larger and more complex systems will undoubtedly present new challenges. The next phase will likely involve integrating this force calculation method with more sophisticated machine learning techniques to further accelerate simulations and explore even broader chemical spaces, potentially unlocking the design of novel materials with tailored properties.

👉 More information
🗞 Nuclear gradients from auxiliary-field quantum Monte Carlo and their application in geometry optimization and transition state search
🧠 ArXiv: https://arxiv.org/abs/2602.13187

Rohail T.
