Scaling Equivariant Models for Accurate Biomolecular Simulations of Realistic Size by Harvard Research Students

In a study published on 20 April 2023, Musaelian, Johansson, and Batzner of the Harvard John A. Paulson School of Engineering and Applied Sciences introduced a new architecture called Allegro. It combines the accuracy, efficiency, and robustness of deep equivariant neural networks and reaches extreme computational scale through innovative model architecture, parallelization, and GPU optimization.

The accurate prediction of the evolution of matter on the atomic scale is a crucial component of modern computational biology, chemistry, and materials engineering. Although quantum mechanics governs atom-electron interactions, many physical and chemical phenomena occur at much larger length and time scales than those of atomic motion.

Bridging these scales requires innovative approaches that capture quantum interactions accurately and parallelizable architectures that can run on exascale computers. However, current computational methods cannot investigate the realistic complexity of physical and chemical systems, and the observable evolution of these systems occurs over timescales beyond the scope of atomistic simulations.

Bridging the Accuracy-Speed Tradeoff of Atomistic Simulations through Allegro Architecture

These limitations motivated the researchers to develop Allegro. The architecture accurately describes dynamics in highly complex structures at quantum fidelity, thereby bridging the accuracy-speed tradeoff of atomistic simulations.

The authors demonstrate the scalability of Allegro by performing stable simulations of protein dynamics for nanoseconds and scaling up to a 44-million-atom structure of a complete, all-atom, explicitly solvated HIV capsid on the Perlmutter supercomputer. They also show excellent strong scaling up to 100 million atoms and 70% weak-scaling efficiency on up to 5120 A100 GPUs. This work has important implications for the field of molecular dynamics simulations and has the potential to facilitate breakthroughs in the study of complex biomolecules.
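The strong- and weak-scaling figures quoted above are parallel efficiencies. The sketch below shows the standard textbook definitions of both; the function names and all timings are hypothetical placeholders chosen for illustration and are not measurements from the paper.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Fixed total problem size: ideally the time per step falls as 1/n.

    t_base: time per step on n_base GPUs; t_n: time per step on n GPUs.
    """
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Fixed work per GPU: ideally the time per step stays constant as GPUs are added."""
    return t_base / t_n


# Hypothetical timings in seconds per MD step (illustrative only, not from the paper).
strong = strong_scaling_efficiency(t_base=8.0, n_base=1, t_n=1.25, n=8)
weak = weak_scaling_efficiency(t_base=2.0, t_n=2.5)
print(f"strong-scaling efficiency: {strong:.0%}")
print(f"weak-scaling efficiency:   {weak:.0%}")
```

Under these definitions, a weak-scaling efficiency of 70% means that a step takes about 1/0.7, or roughly 1.4 times, as long on the largest run as on the baseline run with the same number of atoms per GPU.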

Challenges in bridging the gap between small-scale models and complex materials dynamics

The gap between fundamental questions and the effective modelling of phenomena has been a persistent challenge for decades. Computational modelling falls on two sides of this gap. On one side, high-fidelity but computationally expensive models, such as the electronic structure methods of density functional theory (DFT) and wave-function quantum chemistry, can be used to construct and investigate small models of the essential parts of a system, which are useful in materials science. However, these models are limited in their ability to capture the evolution of structures over relevant time scales.

On the other side, simpler analytical models have been used to describe the dynamics of complex inorganic and biological materials, but they have many documented failures in accurately capturing interatomic interactions.

Molecular Dynamics: A Powerful Tool for Designing and Advancing Novel Molecules and Materials

Molecular dynamics (MD) simulations play a crucial role in computational science by providing insights into the dynamics of molecules and materials at the atomic scale. They offer a level of resolution, understanding, and control often impossible with experiments, making them a powerful tool for designing and advancing novel molecules and materials.

MD involves simulating the time evolution of atoms based on Newton’s equations of motion and obtaining a sequence of many-atom configurations by integrating the forces at each time step.
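As a minimal sketch of this integration loop, the example below uses the velocity Verlet scheme, a common MD integrator, on a single particle in a one-dimensional harmonic well. The harmonic force, the unit mass, and the step size are assumptions chosen for illustration; a production simulation would obtain forces from a force field or a machine learning potential such as Allegro instead.

```python
import math


def harmonic_force(x, k=1.0):
    """Toy force F = -k * x, standing in for a real interatomic force model."""
    return -k * x


def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=1000):
    """Integrate Newton's equations of motion with the velocity Verlet scheme.

    Returns a list of (position, velocity) pairs: the initial state plus
    one entry per time step.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # forces at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy drift over {len(traj) - 1} steps: {abs(e_end - e_start):.2e}")
```

The sequence of configurations in `traj` is what observables are later computed from; here the near-constant total energy serves as a simple sanity check on the integrator.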

Physical observables can then be derived from these configurations. However, the short integration time step required, typically on the order of femtoseconds, is a bottleneck for MD. On the other hand, quantum-mechanical simulations provide a highly accurate description of the electronic structure of molecules and materials.
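To make the time-step bottleneck concrete, the short sketch below counts the integration steps, and therefore force evaluations, needed to cover a given simulated time. The 1 fs and 2 fs step sizes are illustrative assumptions, consistent with "on the order of femtoseconds" but not values taken from the paper.

```python
import math

FS_PER_NS = 1_000_000  # femtoseconds in one nanosecond


def steps_required(simulated_time_fs, timestep_fs):
    """Number of integration steps needed to cover the simulated time."""
    return math.ceil(simulated_time_fs / timestep_fs)


# One nanosecond of dynamics at two assumed step sizes.
steps_1fs = steps_required(1 * FS_PER_NS, timestep_fs=1.0)
steps_2fs = steps_required(1 * FS_PER_NS, timestep_fs=2.0)
print(steps_1fs, steps_2fs)
```

Each of those steps requires a full force evaluation for every atom, which is why the cost of the force model dominates the reachable system size and simulated time.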

State-of-the-art Equivariant Models for Molecular Dynamics Simulations

Advances in equivariant models have significantly improved the accuracy of interatomic interaction models, surpassing previous empirical and machine learning potentials. The authors of this work used leadership-class GPU computing to combine the highest accuracy achievable in interatomic interaction models with extreme scalability.
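For readers unfamiliar with the term, a force model is rotation-equivariant if rotating the input structure rotates the predicted forces in the same way. The sketch below checks this property numerically for a toy pairwise potential U = sum of 1/r over atom pairs. This potential, the rotation helper, and the coordinates are all illustrative assumptions; the example is not the Allegro model, which builds the same property into a neural network.

```python
import math


def rotate_z(v, angle):
    """Rotate a 3D vector about the z axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]


def pair_forces(positions):
    """Forces from the toy potential U = sum_{i<j} 1 / r_ij (illustrative only)."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r3 = math.sqrt(sum(c * c for c in d)) ** 3
            for k in range(3):
                f = d[k] / r3          # F_i = -dU/dr_i = d / r^3 (repulsive)
                forces[i][k] += f
                forces[j][k] -= f      # Newton's third law
    return forces


positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.3, 1.2, 0.5]]
angle = 0.7

# Equivariance: computing forces then rotating must match rotating then computing forces.
forces_then_rotate = [rotate_z(f, angle) for f in pair_forces(positions)]
rotate_then_forces = pair_forces([rotate_z(p, angle) for p in positions])

max_diff = max(
    abs(a - b)
    for fa, fb in zip(forces_then_rotate, rotate_then_forces)
    for a, b in zip(fa, fb)
)
print(f"largest component mismatch: {max_diff:.2e}")
```

The mismatch is at the level of floating-point round-off, which is what equivariance predicts. A model without this property would have to learn the same physics separately for every orientation of the structure.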

This breakthrough establishes a new state-of-the-art for molecular dynamics simulations and enables researchers to simulate previously inaccessible systems. Large molecular dynamics simulations of complex biological systems have been achieved entirely using machine learning interatomic potentials at quantum accuracy for the first time. The Allegro architecture used in this work can simulate the dynamics of any atomistic structure, including polycrystalline and multiphase composites, diffusion in glasses, polymerization, and catalytic reactions.

The rapid adoption of equivariant interatomic potential models in the research community is evidence of the wide impact of this approach. The combined use of PyTorch and the Kokkos performance-portability library allows this state-of-the-art equivariant model architecture to be deployed on a variety of hardware, including CPUs and the NVIDIA, AMD, and Intel GPUs that power leadership-class resources.

The Future of Allegro: Scaling Up for Exascale Simulations

In the future, it is expected that Allegro will be deployed on even larger computing resources, enabling it to achieve greater scalability than demonstrated in this study. The high-capacity equivariant Allegro models used in this work were shown to accurately learn forces across the entire SPICE dataset, which contains over 1 million structures of drug-like molecules and peptides.

This suggests the potential to learn entire sets of inorganic materials and organic molecules with unprecedented accuracy, opening up the possibility of fast exascale simulations of a wide range of materials systems.

Read the full research paper here.