Molecular dynamics simulations are crucial for understanding the behaviour of materials, yet remain computationally demanding. Researchers Nicolaï Gouraud from Qubit Pharmaceuticals, Côme Cattin and Thomas Plé from Sorbonne Université, and colleagues have developed a novel approach, termed DMTS-NC, to significantly accelerate these simulations using neural network potentials. Building on the team's previous investigations, this collaboration between Qubit Pharmaceuticals and Sorbonne Université introduces a distilled multi-time-stepping strategy employing non-conservative forces. The DMTS-NC scheme demonstrates enhanced stability and efficiency compared to conservative methods, offering speedups of 15–30% while requiring no fine-tuning. By enforcing physical priors within the distilled potential, the team maintains accuracy while pushing the boundaries of simulation efficiency, a substantial advance applicable to a wide range of systems.
A new technique for accelerating molecular dynamics simulations promises to speed materials discovery and drug design by enabling more detailed investigations of complex systems. The approach bypasses longstanding computational bottlenecks, offering a significant advantage for modelling atomic behaviour.
Researchers propose the DMTS-NC approach, a distilled multi-time-step (DMTS) strategy utilising non-conservative (NC) forces to accelerate atomistic molecular dynamics simulations with foundation neural network models. Their work builds upon previous research published in J. Phys. Chem. Lett. 2026, 17, 5, 1288–1295.
A dual-level reversible reference system propagator algorithm (RESPA) formalism couples a target accurate conservative potential to a simplified distilled representation optimised for generating non-conservative forces. Despite employing non-conservative forces, the distilled architecture enforces key physical priors, including equivariance under rotation and pairwise cancellation of atomic force contributions.
Distilling neural network forces accelerates molecular dynamics via multi-time stepping
Scientists have developed a distilled multi-time-step scheme with non-conservative forces, denoted DMTS-NC, to accelerate molecular dynamics simulations based on neural network potentials. This approach builds upon previous work utilising the FeNNix-Bio1(M) foundation model and the FeNNol library, aiming to improve simulation speed and robustness without compromising accuracy.
The strategy involves training a small, fast-to-evaluate model on data labelled by the larger FeNNix-Bio1(M) model, rather than directly from Density Functional Theory. This small model is then applied iteratively within an inner loop of a multi-time-step procedure, with corrections applied using the difference between the large and small force models in an external loop.
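The inner/outer structure described above can be sketched as follows. This is a generic RESPA-style multi-time-step integrator, not the authors' exact implementation: the names `mts_step`, `f_small`, and `f_large` are hypothetical, with the cheap distilled model driving the inner loop and the large-minus-small force difference applied as an outer correction kick.

```python
import numpy as np

def mts_step(pos, vel, masses, f_small, f_large, dt_outer, n_inner):
    """One outer step of a RESPA-style multi-time-step integrator (sketch).

    f_small: cheap distilled force model, evaluated every inner step.
    f_large: expensive reference model, evaluated once per outer step.
    The outer kicks use the force difference f_large - f_small, mirroring
    the correction described in the text.
    """
    dt_inner = dt_outer / n_inner
    # Opening outer half-kick with the correction force.
    delta = f_large(pos) - f_small(pos)
    vel = vel + 0.5 * dt_outer * delta / masses[:, None]
    # Inner loop: velocity Verlet with the cheap distilled forces only.
    f = f_small(pos)
    for _ in range(n_inner):
        vel = vel + 0.5 * dt_inner * f / masses[:, None]
        pos = pos + dt_inner * vel
        f = f_small(pos)
        vel = vel + 0.5 * dt_inner * f / masses[:, None]
    # Closing outer half-kick with the updated correction force.
    delta = f_large(pos) - f_small(pos)
    vel = vel + 0.5 * dt_outer * delta / masses[:, None]
    return pos, vel
```

When the two models coincide, the correction kicks vanish and the scheme reduces to plain velocity Verlet at the inner time step, which is the limit in which distillation quality matters least.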
The use of non-conservative forces, bypassing the computationally expensive backpropagation step required for conservative forces, is a key feature of DMTS-NC. Bigi et al. previously demonstrated the potential of non-conservative forces within a multi-time-step scheme, inspiring the current work. Distillation reduces the computational cost of the inner steps by removing the constraint that forces be derived from a potential energy; because differentiation is no longer required, both training and evaluation are accelerated.
The model architecture is a modification of FeNNix-Bio1, designed to directly predict forces instead of energies. While Newton’s third law is not inherently guaranteed, the architecture incorporates physical priors, including equivariance under rotation and cancellation of atomic force components. Initially, each atom is embedded in an N_f-dimensional vector space representing electronic structure and charge information.
This embedding is updated using two message-passing layers of an equivariant transformer, the first focusing on short-range geometry within a 3.5 Å cutoff, and the second incorporating both short-range and medium-range messages up to 7.5 Å. Atomic energies are then obtained using mixture-of-experts multi-layer perceptrons, with routing dependent on the chemical group of each atom.
An explicit screened nuclear repulsion term is also added. The non-conservative model shares a similar embedding architecture, but considers only short-range messages in both layers, focusing on local structure and high frequencies. The description of each atom i includes a scalar embedding x_i ∈ R^(N_f) and a tensorial embedding V_i ∈ R^(n_l × (λ_max+1)²), representing geometric tensors up to order λ_max. n_s scalar and n_l tensor attention heads are used, with vectors q_i^h and k_i^h formed via linear projections of the scalar embedding x_i: q_i^h = W_q^h x_i and k_i^h = W_k^h x_i, where W_q^h and W_k^h are optimised weight matrices.
The neighbour list N(i) contains all atoms j ≠ i within a distance r_ij ≤ 3.5 Å. Scaled dot products c_ij^h are calculated for each head h and neighbour j: c_ij^h = (1/√n_c) Σ_{k=1}^{n_c} q_{i,k}^h k_{j,k}^h. A three-dimensional vector F_ij is then constructed, combining radial basis vectors B^h(r_ij) and the vector part of the tensor embedding V_i^h, with a polynomial cutoff function f_c(r_ij) applied. The final force vector acting on atom i is F_i = Σ_{j∈N(i)} (F_ij − F_ji) + F_rep,i, where F_rep,i = −∇_i E_rep is the short-range screened nuclear repulsion term, using the element-pair-specific NLH parametrization.
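As an illustration of the quantities just defined, the sketch below computes the scaled dot products c_ij^h and the antisymmetrised force sum with NumPy. The embeddings, projection weights, and pairwise contributions are random stand-ins (not the trained model's values), and the cutoff and radial-basis machinery is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_heads, n_c = 4, 2, 8

# Hypothetical scalar embeddings x_i and per-head projection weights.
x = rng.normal(size=(n_atoms, n_heads * n_c))
W_q = rng.normal(size=(n_heads, n_c, x.shape[1]))
W_k = rng.normal(size=(n_heads, n_c, x.shape[1]))

# q_i^h = W_q^h x_i and k_j^h = W_k^h x_j.
q = np.einsum('hck,ik->ihc', W_q, x)
k_proj = np.einsum('hck,ik->ihc', W_k, x)

# Scaled dot products c_ij^h = (1/sqrt(n_c)) * sum_k q_{i,k}^h k_{j,k}^h.
c = np.einsum('ihc,jhc->ijh', q, k_proj) / np.sqrt(n_c)

# Pairwise contributions F_ij (random stand-ins here) combined
# antisymmetrically: F_i = sum_j (F_ij - F_ji).  Each pair's
# contributions to i and j are equal and opposite, so the pair part
# carries no net force on the system.
F_pair = rng.normal(size=(n_atoms, n_atoms, 3))
F = (F_pair - F_pair.transpose(1, 0, 2)).sum(axis=1)
```

The antisymmetrisation is what realises the "cancellation of atomic force components" prior: summing `F` over all atoms gives zero regardless of what the network predicts for each pair.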
Improved accuracy and efficiency through non-conservative neural network architecture
The non-conservative force model achieved a mean absolute error of 1.46 kcal/mol and a root mean square error of 2.33 kcal/mol when fitting force vectors produced by the larger FeNNix-Bio1(M) neural network. These error values represent a substantial improvement over a smaller conservative model, which previously yielded a mean absolute error of 3.44 kcal/mol and a root mean square error of 5.53 kcal/mol.
The distilled model’s performance demonstrates its ability to accurately reproduce forces across a diverse chemical space, utilising a training dataset comprised of small organic molecules and biologically relevant complexes. The non-conservative neural network comprises a single layer of 256 neurons, contrasted with 64 neurons in the conservative model.
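The error metrics quoted above can be computed as in the following sketch. It assumes a simple per-component average over all frames and atoms, which may differ from the paper's exact convention; the function name `force_errors` is illustrative.

```python
import numpy as np

def force_errors(f_pred, f_ref):
    """MAE and RMSE between predicted and reference force arrays
    of shape (n_frames, n_atoms, 3), averaged per component (sketch)."""
    diff = f_pred - f_ref
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff ** 2))
    return mae, rmse
```

Because RMSE squares the residuals, it penalises outlier forces more heavily than MAE, which is why the two figures quoted for each model differ.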
Attention mechanisms within the message-passing phase employ 16 scalar attention heads in FeNNix-Bio1(M) but only 4 in the non-conservative model. Furthermore, the polynomial function used to describe interatomic interactions is of order 8 in FeNNix-Bio1(M) and reduced to order 5 in the non-conservative model. Consequently, the non-conservative model contains a total of 286,736 parameters, a dramatic reduction from the 9,526,855 parameters present in FeNNix-Bio1(M).
Training of the non-conservative force model required 2200 epochs, using 1000 batches of 128 conformations per epoch. The Muon optimizer was employed with an initial learning rate of 1.0 × 10⁻⁵, linearly increased to 5.0 × 10⁻⁴ and then decreased to a final rate of 1.0 × 10⁻⁶ following a cosine one-cycle schedule. The entire training process, conducted on an Nvidia A100 40 GB GPU, took 28 hours and 47 minutes. Hydrogen Mass Repartitioning, which increases the mass of hydrogen atoms by 3.0 Da and redistributes the corresponding mass deficit, further enhances simulation stability and allows for larger time steps.
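The learning-rate schedule just described can be sketched as below. The endpoint values come from the text; the warmup fraction `warmup_frac` is an assumed parameter, since the article does not state at which step the peak occurs.

```python
import math

def one_cycle_lr(step, total_steps, lr_init=1.0e-5, lr_peak=5.0e-4,
                 lr_final=1.0e-6, warmup_frac=0.1):
    """Linear warmup from lr_init to lr_peak, then cosine decay to
    lr_final (sketch of a cosine one-cycle schedule)."""
    warmup_steps = int(warmup_frac * total_steps)
    if step < warmup_steps:
        # Linear ramp up to the peak learning rate.
        return lr_init + (lr_peak - lr_init) * step / warmup_steps
    # Cosine decay from lr_peak down to lr_final over the remainder.
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return lr_final + 0.5 * (lr_peak - lr_final) * (1 + math.cos(math.pi * t))
```

One-cycle schedules of this shape let the optimizer take large steps mid-training while starting and finishing gently, which tends to stabilise force-matching fits.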
Accelerating molecular dynamics via foundation models and controlled imprecision
Scientists are increasingly reliant on molecular dynamics simulations to understand and predict the behaviour of materials, from drug candidates to novel polymers. However, these simulations are computationally expensive, often requiring supercomputers and limiting the size and timescales that can be realistically explored. This new work offers a significant step towards overcoming that bottleneck, not through brute force hardware upgrades, but through clever algorithmic design.
The distilled multi-time-step strategy, leveraging foundation models and deliberately ‘non-conservative’ forces, represents a departure from traditional approaches that prioritise strict physical fidelity. The willingness to trade some theoretical purity for practical gain is particularly notable. For years, the field has been fixated on reproducing quantum mechanical accuracy in every component of a simulation.
This research suggests that a degree of approximation, intelligently implemented and guided by fundamental physical principles like equivariance, can actually improve stability and speed. The reported 15–30% performance increase is substantial, but the real promise lies in the potential for further optimisation. Limitations remain, as the success of this method hinges on the quality of the ‘distilled’ representation, the simplified model used to generate non-conservative forces.
While the authors demonstrate excellent agreement with existing data, the generalizability of this approach to entirely new chemical systems needs further investigation. Moreover, the long-term effects of these approximations on simulation accuracy require careful scrutiny. Future work will likely focus on refining the distillation process, exploring different foundation models, and developing robust validation protocols. Ultimately, this could pave the way for simulations of unprecedented scale and complexity, unlocking new insights into the molecular world.
👉 More information
🗞 Faster Molecular Dynamics with Neural Network Potentials via Distilled Multiple Time-Stepping and Non-Conservative Forces
🧠 ArXiv: https://arxiv.org/abs/2602.14975
