AI Simplifies Molecular Analysis, Bridging Simulations and Real-World Experiments Seamlessly

Scientists have long sought to bridge the gap between molecular simulations and experimental vibrational spectroscopy, a challenge that is particularly acute for complex aqueous systems. Philipp Schienbein from Ruhr-Universität Bochum and colleagues, including collaborators at the Research Center Chemical Sciences and Sustainability, and Research Alliance Ruhr, now present mimyria, a new framework designed to simplify and automate this process. Their work introduces a novel machine-learning target for Raman spectroscopy, the polarizability gradient tensor, which complements existing methods for infrared spectroscopy, and demonstrates that accurate spectra can be generated from surprisingly small training datasets. This advance is a step towards making computationally intensive vibrational spectroscopy more accessible and reliable for studying a wide range of condensed-phase phenomena.

The framework computes infrared and Raman spectra from molecular dynamics trajectories within a unified workflow. The authors introduce the polarizability gradient tensor (PGT) as a novel atom-resolved machine-learning target property for Raman spectroscopy, complementing the established atomic polar tensor (APT) for IR spectroscopy. As a necessary prerequisite, they demonstrate how both PGTs and APTs can be computed accurately from electronic-structure theory, validating them across formally equivalent derivative formulations.

Atomic polar tensor representation via a neural network approach

Scientists increasingly employ machine learning to accelerate ab initio molecular dynamics simulations while maintaining ab initio quality. However, most commonly used machine learning potentials do not automatically provide the electronic response functions required to calculate vibrational spectra.

Consequently, additional machine learning models must be trained or augmented to represent these response properties before infrared or Raman spectra can be generated. These challenges necessitate approaches that provide access to vibrational response functions at atomic resolution without requiring long ab initio trajectories.

Motivated by the ability of the atomic polar tensor to dissect spectra into atomic contributions, researchers previously introduced a machine learning model that directly represents atom-resolved atomic polar tensors, referred to as the atomic polar tensor neural network. Several works have since adopted the capability to represent atomic polar tensors, but in most cases these are obtained indirectly as derivatives of a learned total dipole moment.

In contrast, the central idea of the atomic polar tensor neural network is to learn atomic polar tensors directly. This avoids the need to train a total dipole moment and, with it, the non-unique decomposition of that global object into atomic contributions. Direct derivative learning exploits the fact that these gradients are physically well-defined, gauge- and branch-invariant response quantities, and therefore do not suffer from the multi-valuedness of, for instance, the dipole moment in periodic systems.
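To make the role of the atomic polar tensor concrete, here is a minimal sketch of how per-atom tensors and velocities combine into the time derivative of the total dipole moment, the quantity that enters an infrared spectrum. It assumes the common index convention P[a][b] = d(mu_a)/d(R_b) for each atom; the function names and the point-charge toy data are illustrative and are not taken from mimyria.

```python
def dipole_time_derivative(apts, velocities):
    """Return d(mu)/dt = sum_i P_i . v_i for a single MD frame.

    apts:       one 3x3 tensor per atom, apts[i][a][b] = d mu_a / d R_{i,b}
                (index convention assumed for this sketch)
    velocities: one velocity vector per atom, velocities[i][b]
    """
    mu_dot = [0.0, 0.0, 0.0]
    for tensor, velocity in zip(apts, velocities):
        for a in range(3):
            mu_dot[a] += sum(tensor[a][b] * velocity[b] for b in range(3))
    return mu_dot


def point_charge_apt(charge):
    """Toy APT of a rigid point charge: charge times the identity matrix."""
    return [[charge if a == b else 0.0 for b in range(3)] for a in range(3)]


# Two hypothetical atoms with charges +1 and -1 moving in different directions.
apts = [point_charge_apt(+1.0), point_charge_apt(-1.0)]
velocities = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
mu_dot = dipole_time_derivative(apts, velocities)
```

Because the sum runs over atoms, restricting it to a subset of atoms gives that subset's contribution to the signal, which is what makes the tensor useful for atom-resolved analysis.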

Researchers recently demonstrated that accurate infrared spectra of bulk liquid water can be produced by using training data obtained exclusively from finite gas-phase water clusters. In this setting, a total dipole moment cannot be meaningfully transferred between finite and periodic systems, whereas the atomic polar tensor, as a size-insensitive property, can be converged for the central atoms in a sufficiently large finite cluster and transferred to the periodic bulk environment.

Furthermore, atomic polar tensors have also recently been explored in the context of incorporating long-range electrostatics into machine learning potentials and incorporating external electric fields in machine learning molecular dynamics simulations. In the present work, scientists further demonstrate that closely related ideas can be extended to Raman spectroscopy, introducing the so-called “polarizability gradient tensor” as a machine learning target.
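Reading the name literally, the polarizability gradient tensor is the derivative of the polarizability tensor with respect to atomic positions, in the same way that the atomic polar tensor is the derivative of the dipole moment. The relations below are a sketch under that assumption; the symbols Q, the index order, and the sign conventions are chosen here for illustration and may differ from those used by the authors.

```latex
% Assumed definitions for atom i with Cartesian indices a, b, c.
\begin{align}
P^{(i)}_{ab} &= \frac{\partial \mu_a}{\partial R_{i,b}}, &
\dot{\mu}_a(t) &= \sum_i \sum_b P^{(i)}_{ab}\, v_{i,b}(t), \\
Q^{(i)}_{abc} &= \frac{\partial \alpha_{ab}}{\partial R_{i,c}}, &
\dot{\alpha}_{ab}(t) &= \sum_i \sum_c Q^{(i)}_{abc}\, v_{i,c}(t).
\end{align}
% Equivalent field-derivative forms (F_i is the force on atom i, E the electric field):
\begin{equation}
P^{(i)}_{ab} = \frac{\partial F_{i,b}}{\partial E_a}, \qquad
Q^{(i)}_{abc} = \frac{\partial^2 F_{i,c}}{\partial E_a\, \partial E_b}.
\end{equation}
```

The second pair of expressions follows from exchanging the order of differentiation of the energy with respect to field and position, which is one way such tensors can be checked against an independent derivative route.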

Obtaining statistically converged vibrational spectra typically requires several tens to hundreds of picoseconds of molecular dynamics trajectories. When explicit ab initio reference calculations are employed, such trajectory lengths constitute a substantial computational commitment, and even spectra with deliberately reduced statistical accuracy still rely on simulation times that are costly when using explicit electronic-structure calculations.

Consequently, statistically converged ab initio reference spectra are generally unavailable in practice, and even reference spectra with minimal statistical accuracy are difficult to obtain, in particular for large systems or when computationally demanding electronic-structure methods are required. This limitation becomes especially severe for system sizes beyond a few hundred to a thousand atoms, where explicit ab initio calculations are effectively impractical.

The central question is therefore whether the accuracy of vibrational spectra can be inferred from the machine learning model itself, without computing statistically converged ab initio reference spectra. Researchers present a complete workflow that connects molecular dynamics trajectories to vibrational spectra.

The proposed software framework (“mimyria”) provides the necessary tools to train machine learning models for electronic response functions and to post-process molecular dynamics trajectories to obtain infrared and Raman spectra. The generation of training data, the training of response models, and the subsequent calculation of vibrational spectra are handled within a unified and largely automated workflow, requiring only minimal user intervention.
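The post-processing step can be illustrated with the standard time-correlation route: an infrared line shape is, up to prefactors, the Fourier transform of the autocorrelation function of the dipole time derivative. The following is a minimal pure-Python sketch of that idea, not mimyria's implementation; the function names, the Hann window, and the synthetic single-frequency signal are all assumptions made for the example, and physical prefactors and units are omitted.

```python
import math


def autocorrelation(series, max_lag):
    """Average of x(t0) . x(t0 + lag) over all time origins t0 (vector samples)."""
    n = len(series)
    acf = []
    for lag in range(max_lag):
        total = 0.0
        for t0 in range(n - lag):
            total += sum(a * b for a, b in zip(series[t0], series[t0 + lag]))
        acf.append(total / (n - lag))
    return acf


def cosine_transform(acf, dt, frequencies):
    """Unnormalised line shape: Hann-windowed cosine transform of the ACF."""
    m = len(acf)
    spectrum = []
    for f in frequencies:
        value = 0.0
        for k, c in enumerate(acf):
            window = 0.5 * (1.0 + math.cos(math.pi * k / m))
            value += window * c * math.cos(2.0 * math.pi * f * k * dt)
        spectrum.append(value * dt)
    return spectrum


# Synthetic "dipole derivative": one oscillation at frequency f0 (arbitrary units).
dt = 1.0
f0 = 0.05
signal = [[math.cos(2.0 * math.pi * f0 * step * dt), 0.0, 0.0] for step in range(400)]

acf = autocorrelation(signal, max_lag=200)
frequencies = [0.01 * i for i in range(1, 11)]
spectrum = cosine_transform(acf, dt, frequencies)
peak = frequencies[spectrum.index(max(spectrum))]
```

The averaging over time origins is where trajectory length matters: short trajectories give few origins per lag and therefore noisy correlation functions, which is why converged spectra need long simulations.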

The machine learning models are intentionally designed in a modular fashion, such that they do not interfere with the machine learning potentials used to generate the molecular dynamics trajectories. As a result, models for infrared and Raman spectra can be trained and applied independently. This modularity offers significant practical flexibility, as numerous machine learning potentials have been trained on various different systems in recent years and can now be revisited to generate vibrational spectra without retraining the underlying interaction model.

Moreover, it is not necessary to decide at the outset of a project whether vibrational spectra will be required; the corresponding response models can be trained at a later stage when such analyses become relevant. While electronic response functions can, in principle, also be obtained as higher-order derivatives of the potential energy, this requires the underlying potential to be revisited and revalidated once vibrational spectra become relevant. The researchers demonstrate the applicability of mimyria by computing infrared and Raman spectra of aqueous systems, building on an existing machine learning potential for liquid water.

Machine learning prediction of vibrational spectra using polarizability and polarizability gradient tensors

Mimyria, a new modular and automated framework, generates infrared and Raman spectra from molecular dynamics trajectories with high efficiency. The work introduces the polarizability gradient tensor as a novel machine-learning target for Raman spectroscopy, complementing the established atomic polar tensor used for infrared spectroscopy.

Accurate calculation of both polarizability gradient tensors and atomic polar tensors is demonstrated using electronic-structure theory, with validation performed across equivalent derivative formulations to benchmark numerical consistency. Machine learning models were then employed as efficient surrogates to represent the validated atomic polar tensors and polarizability gradient tensors on aqueous benchmark systems.

Spectral convergence was achieved with surprisingly small training sets, indicating data efficiency in the approach. Spectral agreement improved more rapidly than the root-mean-square error, highlighting the effectiveness of the methodology in capturing essential spectral features. The research connects model-level errors to observable-level accuracy, providing practical guidelines and early-stopping criteria for achieving sufficient spectral fidelity.
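One way to picture an observable-level stopping rule is to compare spectra obtained from models trained on successively larger datasets and stop once the spectrum no longer changes. The sketch below is purely hypothetical: the distance measure, the tolerance, and the toy spectra are invented for illustration and are not the criteria proposed by the authors.

```python
def normalise(spectrum):
    """Scale a spectrum to unit area so that only its shape is compared."""
    area = sum(abs(value) for value in spectrum)
    return [value / area for value in spectrum]


def spectral_distance(a, b):
    """Half the L1 distance between area-normalised spectra (0 = identical shape)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(normalise(a), normalise(b)))


def smallest_converged_size(spectra_by_size, tolerance):
    """Smallest training-set size whose spectrum matches the next larger one."""
    sizes = sorted(spectra_by_size)
    for smaller, larger in zip(sizes, sizes[1:]):
        if spectral_distance(spectra_by_size[smaller], spectra_by_size[larger]) < tolerance:
            return smaller
    return None


# Hypothetical spectra on a shared frequency grid, keyed by training-set size.
toy_spectra = {
    10: [1.0, 4.0, 1.0, 0.0],
    20: [1.0, 5.0, 2.0, 0.0],
    40: [1.0, 5.0, 2.1, 0.0],
    80: [1.0, 5.0, 2.1, 0.0],
}
converged_at = smallest_converged_size(toy_spectra, tolerance=0.02)
```

A rule of this kind works on the observable directly, so it can signal convergence even while a tensor-level error such as the root-mean-square error is still decreasing.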

By integrating response-tensor learning, automated training, and spectral-domain validation, mimyria enables quantitatively reliable vibrational spectroscopy. Computing all atomic polar tensors and polarizability gradient tensors for a given configuration requires a total of 13 single-point calculations, some of which are the same calculations already needed for the atomic polar tensors alone.
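A count of 13 is what one gets from a simple finite-difference stencil in the electric field, if the tensors are taken as first and second field derivatives of the atomic forces: one zero-field point, six points at plus and minus a step along each axis, and six points along the pairwise diagonals. Whether the authors use exactly this stencil is an assumption; the sketch below only shows that such a scheme exists and that the seven axial and zero-field points serve both derivative orders. All names and the toy force function are hypothetical.

```python
from itertools import combinations

ZERO = (0.0, 0.0, 0.0)


def displaced(step, axes):
    """Field vector with `step` along every axis listed in `axes`, zero elsewhere."""
    return tuple(step if axis in axes else 0.0 for axis in range(3))


def field_points(h):
    """Zero field, +/-h along each axis, and +/-h along each pairwise diagonal."""
    points = [ZERO]
    for a in range(3):
        points += [displaced(+h, (a,)), displaced(-h, (a,))]
    for pair in combinations(range(3), 2):
        points += [displaced(+h, pair), displaced(-h, pair)]
    return points


def first_derivative(f, h, a):
    """Central difference dF/dE_a from the two axial points."""
    return (f[displaced(+h, (a,))] - f[displaced(-h, (a,))]) / (2.0 * h)


def second_derivative(f, h, a, b):
    """d2F/dE_a dE_b from the zero-field, axial and diagonal points."""
    if a == b:
        return (f[displaced(+h, (a,))] - 2.0 * f[ZERO] + f[displaced(-h, (a,))]) / h**2
    axial = sum(f[displaced(s, (c,))] for c in (a, b) for s in (+h, -h))
    diagonal = f[displaced(+h, (a, b))] + f[displaced(-h, (a, b))]
    return (diagonal - axial + 2.0 * f[ZERO]) / (2.0 * h**2)


def toy_force(e):
    """Hypothetical force component, quadratic in the field (arbitrary units)."""
    ex, ey, ez = e
    return 0.3 + 1.5 * ex - 0.7 * ey + 0.2 * ez + ex * ex + 0.4 * ex * ey - 0.6 * ey * ez + 0.25 * ez * ez


h = 0.01
points = field_points(h)
samples = {e: toy_force(e) for e in points}  # one "single-point calculation" per field
```

In a real workflow each entry of `samples` would hold the full force array of one electronic-structure calculation, and the same formulas would be applied to every atom and force component.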

The automated training procedure uses a graph neural network built within the e3nn framework, with node features that represent the chemical species and spherical-harmonic expansions for edge construction. The APTNN model uses (20, 10, 6, 5) channels for l = 0 to 3, while the PGTNN model employs (40, 20, 13, 10) channels for the same values of l.
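The channel counts translate directly into feature sizes, because each channel of angular momentum l carries 2l + 1 components. The short sketch below computes that size and writes the channels in the irreps string notation used by e3nn. The parity labels follow the natural parity (-1)^l, which is an assumption here since the article does not state the parities used; the helper names are illustrative.

```python
def feature_dimension(channels):
    """Total number of components: each l-channel contributes 2l + 1 values."""
    return sum(mul * (2 * l + 1) for l, mul in enumerate(channels))


def irreps_string(channels):
    """Channels in e3nn-style notation, assuming natural parity (-1)^l."""
    return "+".join(
        f"{mul}x{l}{'e' if l % 2 == 0 else 'o'}" for l, mul in enumerate(channels)
    )


aptnn_channels = (20, 10, 6, 5)     # l = 0, 1, 2, 3 as quoted for the APTNN
pgtnn_channels = (40, 20, 13, 10)   # l = 0, 1, 2, 3 as quoted for the PGTNN

aptnn_dim = feature_dimension(aptnn_channels)
pgtnn_dim = feature_dimension(pgtnn_channels)
aptnn_irreps = irreps_string(aptnn_channels)
```

Under these assumptions the PGTNN hidden features are roughly twice as large as those of the APTNN, consistent with the doubling of the channel counts.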

The simulation protocol comprised a 200 picosecond NVT simulation, from which 80 independent configurations were sampled to start 80 separate 20 picosecond NVE simulations. With 200 training configurations and 100 test configurations available, convergence was typically reached with fewer samples than initially anticipated, demonstrating the efficiency of the automated training process.

Predicting vibrational spectra using machine learning and polarizability gradients

Scientists have developed mimyria, a new automated framework for performing vibrational spectroscopy on condensed-phase systems with improved efficiency and reliability. This framework integrates electronic-structure calculations, machine-learning models, and spectral analysis into a unified workflow, addressing a longstanding limitation in the field.

A key innovation is the introduction of the polarizability gradient tensor as a machine-learning target for Raman spectroscopy, complementing existing methods for infrared spectroscopy. Mimyria employs machine learning to predict vibrational spectra from molecular dynamics simulations, significantly reducing the computational cost compared to traditional methods.

Validation against high-accuracy calculations demonstrates that accurate spectra can be obtained with relatively small training datasets, and the framework includes practical guidelines for determining when sufficient spectral fidelity has been achieved. The methodology was successfully applied to aqueous sulfate ions, accurately reproducing both infrared and Raman spectra, including polarization-dependent Raman responses and atom-resolved spectral signals, in agreement with experimental data.
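Polarization-dependent Raman responses are conventionally discussed in terms of the isotropic and anisotropic parts of the polarizability tensor. The sketch below shows that textbook decomposition for a single symmetric 3x3 tensor; it is not taken from mimyria, and the function names and the toy tensor are illustrative.

```python
def isotropic_part(alpha):
    """Mean polarizability: one third of the trace."""
    return (alpha[0][0] + alpha[1][1] + alpha[2][2]) / 3.0


def anisotropic_part(alpha):
    """Traceless remainder after removing the isotropic part from the diagonal."""
    mean = isotropic_part(alpha)
    return [[alpha[a][b] - (mean if a == b else 0.0) for b in range(3)] for a in range(3)]


def anisotropy_squared(alpha):
    """Rotational invariant gamma^2 of a symmetric polarizability tensor."""
    diagonal = 0.5 * (
        (alpha[0][0] - alpha[1][1]) ** 2
        + (alpha[1][1] - alpha[2][2]) ** 2
        + (alpha[2][2] - alpha[0][0]) ** 2
    )
    off_diagonal = 3.0 * (alpha[0][1] ** 2 + alpha[1][2] ** 2 + alpha[0][2] ** 2)
    return diagonal + off_diagonal


# Hypothetical symmetric polarizability tensor in arbitrary units.
alpha = [[1.0, 0.5, 0.0],
         [0.5, 2.0, 0.0],
         [0.0, 0.0, 3.0]]

mean = isotropic_part(alpha)
beta = anisotropic_part(alpha)
gamma_sq = anisotropy_squared(alpha)
```

In a trajectory-based calculation the same split would be applied frame by frame, with the isotropic and anisotropic time series then correlated separately to obtain the corresponding spectra.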

The authors acknowledge that the electronic-structure calculations required for the polarizability gradient tensor are computationally more expensive than those for infrared spectra, increasing the overall cost by approximately a factor of two. However, they note that some calculations can be shared between the two approaches, which mitigates this additional expense. Future research could focus on further optimizing the computational efficiency of the electronic-structure calculations or on applying mimyria to more complex systems, potentially broadening its utility across diverse scientific disciplines.

👉 More information
🗞 Mimyria: Machine learned vibrational spectroscopy for aqueous systems made simple
🧠 ArXiv: https://arxiv.org/abs/2602.06760

Rohail T.
