Machine learning (ML) has become a crucial tool in quantum chemistry (QC), enabling fast and accurate calculation of properties such as excitation energies. Its adoption has been propelled by advances in computer hardware and by the development of supervised and unsupervised learning approaches. A recent study introduces optimized multifidelity machine learning (oMFML), a method that combines Δ-ML-like submodels trained at different fidelities and achieves lower prediction error than the standard multifidelity scheme. Challenges remain, however, including the need for large and costly training sets and for more expressive molecular descriptors.
What is the Role of Machine Learning in Quantum Chemistry?
Machine learning (ML) has become an essential tool in quantum chemistry (QC), providing fast and accurate calculations of properties of interest such as excitation energies. Its use in QC has been accelerated by improvements in computer hardware and the development of both supervised and unsupervised learning approaches. These approaches have been applied to material design and discovery, excitation energies, potential energy surfaces, and even the prediction of chemical reactions.
The core principle of these ML techniques is to reproduce an implicit mapping between the geometry of the molecules and some property of interest. These properties could include atomization or excitation energies, or even complete potential energy surfaces. The aim is to target these quantities at some level of theory relevant to the area of application.
The general ML-QC pipeline for such applications begins with the generation of raw data. This data consists of the Cartesian geometries of the molecules of interest and the property to be predicted, computed with QC methods at a specific level of theory. The Cartesian coordinates are then transformed into an input feature format, known as a representation or molecular descriptor, that the ML model can map to the property of interest.
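The pipeline above can be sketched in a few lines. This is a minimal illustration only: the inverse-distance descriptor, the Gaussian kernel ridge regression model, the function names, and the synthetic "property" are all assumptions made for the example, not choices stated in the article, and no real QC data is involved.

```python
import numpy as np

def inverse_distance_descriptor(coords):
    """Map Cartesian coordinates (n_atoms, 3) to inverse pairwise distances."""
    i, j = np.triu_indices(len(coords), k=1)
    dists = np.linalg.norm(coords[i] - coords[j], axis=1)
    return 1.0 / dists

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def fit_krr(X, y, sigma=1.0, reg=1e-6):
    """Solve (K + reg * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)

def predict_krr(X_train, alpha, X_new, sigma=1.0):
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Step 1: raw data. Random distortions of a hypothetical 3-atom geometry.
rng = np.random.default_rng(0)
base = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
geometries = base + 0.1 * rng.normal(size=(60, 3, 3))

# Step 2: Cartesian coordinates -> descriptor vectors.
X = np.array([inverse_distance_descriptor(g) for g in geometries])

# Synthetic stand-in for a QC property (NOT a real quantum chemical value).
y = X.sum(axis=1)

# Step 3: train on 50 geometries, predict the remaining 10.
X_train, y_train = X[:50], y[:50]
X_test, y_test = X[50:], y[50:]
alpha = fit_krr(X_train, y_train)
y_pred = predict_krr(X_train, alpha, X_test)
mae = np.mean(np.abs(y_pred - y_test))
print(f"test MAE on synthetic data: {mae:.4f}")
```

In a real application, the synthetic target would be replaced by energies computed at the chosen level of theory, and the descriptor and model would be chosen to suit the molecules studied.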
How Does Multifidelity Machine Learning Improve Quantum Chemistry Calculations?
High accuracy in prediction using an ML model often demands a large and costly training set. Various procedures have been proposed to reduce this cost, including Δ-ML, hierarchical ML, and multifidelity machine learning (MFML). MFML combines Δ-ML-like submodels trained at different fidelities according to a fixed scheme derived from the sparse grid combination technique.
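To make the idea of a fixed combination scheme concrete, the sketch below generates coefficients following the two-dimensional sparse grid combination technique, in which submodels on one diagonal enter with +1 and those on the next lower diagonal with -1. The article does not spell out the scheme, so the indexing is an assumption of this example: one axis is the fidelity `f` (1 = cheapest), the other is `eta`, with a submodel trained on `2**eta` samples. The function names are hypothetical.

```python
def mfml_coefficients(n_fidelities, eta_top):
    """Return {(fidelity, eta): coefficient} for a combination-technique scheme.

    Assumed indexing: fidelity runs from 1 (cheapest) to n_fidelities (target),
    and the submodel at (f, eta) is trained on 2**eta samples.
    """
    level = n_fidelities + eta_top
    coeffs = {}
    for f in range(1, n_fidelities + 1):
        # Submodels on the diagonal f + eta = level are added.
        coeffs[(f, level - f)] = +1
        # Submodels on the diagonal f + eta = level - 1 are subtracted.
        if f < n_fidelities:
            coeffs[(f, level - 1 - f)] = -1
    return coeffs

def combine(predictions, coeffs):
    """Combine submodel predictions {(f, eta): value} with the given coefficients."""
    return sum(c * predictions[key] for key, c in coeffs.items())

# Three fidelities, with 2**2 = 4 training samples at the target fidelity.
coeffs = mfml_coefficients(n_fidelities=3, eta_top=2)
for key in sorted(coeffs):
    print(key, coeffs[key])
```

Under this indexing, cheaper fidelities are paired with larger training sets, which is what lets the expensive target fidelity get by with few samples. The coefficients always sum to one, so if every submodel returned the same value, the combination would return that value unchanged.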
In a recent study, an optimization procedure was implemented to combine the multifidelity submodels in a flexible scheme rather than a fixed one, resulting in optimized MFML (oMFML) with superior prediction capabilities. The optimization of the combination was carried out on a holdout validation set of the property of interest.
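One simple way to picture such an optimization is to treat the combination coefficients as free parameters and fit them on the validation set. The sketch below does this with ordinary least squares; that choice, the function name, and the synthetic submodel predictions are assumptions made for illustration, since the article only states that the optimization uses a holdout validation set.

```python
import numpy as np

def optimize_coefficients(P_val, y_val):
    """Least-squares coefficients beta minimizing ||P_val @ beta - y_val||.

    P_val has one column per submodel, holding that submodel's predictions
    on the validation set; y_val holds the reference values.
    """
    beta, *_ = np.linalg.lstsq(P_val, y_val, rcond=None)
    return beta

# Synthetic validation data (placeholders, not results from the study).
rng = np.random.default_rng(1)
n_val, n_sub = 200, 5
y_val = rng.normal(size=n_val)
scales = rng.uniform(0.7, 1.3, size=n_sub)  # submodel-specific bias
P_val = y_val[:, None] * scales + 0.2 * rng.normal(size=(n_val, n_sub))

# A fixed +1/-1 scheme versus coefficients fitted on the validation set.
beta_fixed = np.array([1.0, -1.0, 1.0, -1.0, 1.0])
beta_opt = optimize_coefficients(P_val, y_val)

mse_fixed = np.mean((P_val @ beta_fixed - y_val) ** 2)
mse_opt = np.mean((P_val @ beta_opt - y_val) ** 2)
print(f"validation MSE, fixed scheme:  {mse_fixed:.4f}")
print(f"validation MSE, fitted scheme: {mse_opt:.4f}")
```

By construction, the fitted coefficients cannot do worse than the fixed ones on the validation set itself; whether the gain carries over to unseen molecules has to be checked on a separate test set.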
The oMFML method was benchmarked on the prediction of atomization energies for the QM7b dataset and on the prediction of excitation energies for three molecules of increasing size. The results indicate that oMFML is a strong methodological improvement over MFML and yields lower prediction error. Even with poor data distributions and no clear hierarchy among the fidelities, oMFML remained advantageous for the prediction of quantum chemical properties.
What are the Key Concepts in Machine Learning for Quantum Chemistry?
One of the key concepts in ML for QC is the use of representations, or molecular descriptors. These are input feature formats that the ML models can map to the property of interest, and much recent work has been dedicated to their development. Among them are molecule-wise descriptors, which encode the entire molecule at once. Examples include inverse distance representations and extensions of them such as the Coulomb Matrix (CM) and Bag of Bonds.
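As an illustration of a molecule-wise descriptor, the sketch below builds a Coulomb Matrix using its standard definition: diagonal entries 0.5 * Z_i^2.4 and off-diagonal entries Z_i * Z_j / |R_i - R_j|, where Z are nuclear charges and R are atomic positions. The function name and the two-atom example geometry are assumptions made for the example.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Coulomb Matrix for nuclear charges Z (n,) and positions R (n, 3).

    Positions are conventionally given in atomic units (bohr).
    """
    Z = np.asarray(charges, dtype=float)
    R = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=2)
    np.fill_diagonal(dist, 1.0)  # placeholder to avoid division by zero
    cm = np.outer(Z, Z) / dist   # off-diagonal: Z_i * Z_j / |R_i - R_j|
    np.fill_diagonal(cm, 0.5 * Z ** 2.4)  # diagonal: 0.5 * Z_i ** 2.4
    return cm

# Two hydrogen atoms placed one bohr apart (an illustrative geometry).
cm_h2 = coulomb_matrix([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
print(cm_h2)
```

The matrix is symmetric and depends only on interatomic distances, so it does not change when the molecule is translated or rotated. It does depend on the ordering of the atoms, which is commonly handled by sorting rows and columns or by using variants such as Bag of Bonds.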
Another key concept is the use of Δ-ML-like submodels in MFML. These submodels are combined according to a fixed scheme derived from the sparse grid combination technique. This allows for a more efficient use of the training set and reduces the cost of achieving high accuracy in prediction.
The third key concept is the use of hyperparameter optimization in oMFML. This process is carried out on a holdout validation set of the property of interest and allows for the combination of multifidelity models in a flexible scheme. This results in superior prediction capabilities and lower error of prediction.
Who are the Key Players in Machine Learning for Quantum Chemistry?
The study on optimized multifidelity machine learning for quantum chemistry was conducted by Vivin Vinod, Ulrich Kleinekathöfer, and Peter Zaspel. Vinod and Zaspel are from the School of Mathematics and Natural Science at the University of Wuppertal in Germany. Kleinekathöfer is from the School of Science at Constructor University in Bremen, Germany.
Their contribution is the optimization procedure that replaces the fixed MFML combination scheme with a flexible one. Their results show that the resulting oMFML is a strong methodological improvement over MFML, with lower prediction error.
What is the Future of Machine Learning in Quantum Chemistry?
Methods such as oMFML point to a promising direction for ML in QC. Because the combination of multifidelity models is optimized rather than fixed, the approach could lead to further improvements in prediction accuracy.
However, challenges remain, such as the need for large and costly training sets to achieve high accuracy in prediction. Further research is needed to develop methods that can reduce this cost and make ML models more efficient.
In addition, while the use of representations or molecular descriptors has been successful in mapping the geometry of molecules to properties of interest, more work is needed to develop representations that can capture more complex molecular properties. This could lead to the development of ML models that can predict a wider range of quantum chemical properties.
Publication details: “Optimized Multifidelity Machine Learning for Quantum Chemistry”
Publication Date: 2024-02-26
Authors: Vivin Vinod, Ulrich Kleinekathöfer, and Peter Zaspel
Source: Machine Learning: Science and Technology
DOI: https://doi.org/10.1088/2632-2153/ad2cef
