Quantum Computing and Protein Structure Prediction: Could It Outperform Deep Learning Methods?

IBM Quantum and the Center for Computational Life Sciences are exploring the use of quantum computers for protein structure prediction, a challenging problem in biomedical research. Despite advancements in deep learning methods like AlphaFold2, predicting protein structures remains difficult. The researchers propose a framework for selecting protein structure prediction problems that could benefit from quantum computing. They demonstrated the concept by accurately predicting the structure of a Zika virus protein component using quantum hardware. This work could enhance our understanding of diseases at the molecular level and aid drug development.

Quantum Computing and Protein Structure Prediction

Protein structure prediction (PSP) is a complex problem in biomedical research. Despite recent advancements in deep learning methods such as AlphaFold2, the challenge remains. With the rapid evolution of quantum computing, researchers are exploring whether quantum computers can offer meaningful benefits for approaching this problem. However, identifying specific problem instances amenable to quantum advantage and estimating the quantum resources required are equally challenging tasks.

The Protein Folding Problem

Proteins, the orchestrators of life at the molecular level, adopt three-dimensional conformations that are determined by their primary amino acid sequence. How a sequence arrives at its folded structure is known as the “protein folding problem”, and it is central to all life and to many diseases. Predicting the optimal structure of a protein, without necessarily reproducing the path that leads to it, is arguably more important and more attainable. This is known as protein structure prediction (PSP).

Traditionally, protein structures have been determined through laborious wet lab experiments involving genetic modifications, protein isolation, and purification. Techniques like X-ray crystallography, NMR, and cryo-EM have been instrumental in solving protein structures, revolutionizing our understanding of diseases. However, these methods are time-consuming, expensive, and not without limitations. Recognizing the need for alternatives, researchers turned to machine learning, exemplified by AlphaFold2, RoseTTAFold, and I-TASSER, which leverage experimentally determined structures.

Quantum Computing: A Brief Introduction

Quantum computing is a new model of high-performance computing in which the traditional foundation of computing, binary logic, is replaced by the principles of quantum mechanics. Its power comes from quantum mechanical effects such as superposition, entanglement, interference (which arises because probability amplitudes, unlike probabilities, can be negative or complex), and probabilistic measurement. These phenomena sometimes allow a quantum algorithm to map the degrees of freedom of a target quantum system naturally onto those of the quantum hardware and simulate it efficiently.
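To make the role of interference concrete, here is a minimal sketch in plain Python, not taken from the research described here. It simulates a single qubit by hand: one Hadamard gate creates an equal superposition, and a second one makes the amplitudes for the state |1> cancel. The helper names `apply` and `probabilities` are illustrative assumptions, not part of any quantum library.

```python
import math

# Hadamard gate as a 2x2 matrix (a standard single-qubit gate).
S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]

def apply(gate, state):
    """Multiply a 2x2 gate by a 2-component amplitude vector."""
    return [sum(gate[r][c] * state[c] for c in range(2)) for r in range(2)]

def probabilities(state):
    """Born rule: each outcome's probability is its amplitude's squared magnitude."""
    return [abs(a) ** 2 for a in state]

zero = [1.0, 0.0]                   # the basis state |0>
superposed = apply(H, zero)         # equal superposition of |0> and |1>
interfered = apply(H, superposed)   # the two contributions to |1> cancel

print("after one Hadamard: ", probabilities(superposed))
print("after two Hadamards:", probabilities(interfered))
```

After one gate both outcomes are equally likely; after the second, the positive and negative contributions to |1> cancel and the qubit returns to |0> with near certainty. A classical coin flipped twice cannot do this, because ordinary probabilities never cancel.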

Quantum Search and Optimization Algorithms

Finding the global minimum energy in the folding funnel can be understood as a search problem in which the database entries are conformational energies. The funnel shape indicates that, despite the overall “easy-to-follow” macro structure, the ruggedness of the funnel walls and bottom makes conformation prediction hard. This ruggedness of the search or optimization landscape is what may make the problem amenable to quantum advantage.
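As a rough illustration of why the search framing matters, the sketch below compares query counts for exhaustive classical search with the iteration count of Grover's algorithm, the textbook quantum search routine. It rests on two assumptions that go beyond the article: that the conformational energies form an unstructured list with a single target entry, and that Grover-style search is the relevant algorithm. The function names are hypothetical.

```python
import math

def classical_queries(n_conformations):
    """Worst-case lookups for exhaustive search over an unstructured list."""
    return n_conformations

def grover_iterations(n_conformations):
    """Approximate optimal Grover iteration count for one marked entry:
    floor(pi / 4 * sqrt(N))."""
    return math.floor(math.pi / 4 * math.sqrt(n_conformations))

for n in (10**3, 10**6, 10**9):
    print(f"N = {n:>13,}: classical up to {classical_queries(n):,} lookups, "
          f"Grover about {grover_iterations(n):,} iterations")
```

The speedup in this idealized setting is quadratic, not exponential, and it ignores hardware noise and the cost of evaluating each energy on a quantum device. That is one reason estimating the quantum resources required is itself a hard question.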

What Makes Protein Structure Prediction Hard?

Regardless of the physics-based computational method used, sequence length quickly becomes a major limitation. As a protein sequence becomes longer, the search space (the number of possible conformations) increases exponentially, and the run time required for an exhaustive search grows exponentially with it. Most non-molecular-dynamics (non-MD) PSP methods (ab initio or free modeling methods) have generally been limited to structures of only a few dozen amino acids in length.
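The exponential growth can be sketched with a back-of-the-envelope calculation. The numbers below are assumptions chosen purely for illustration: each residue is given 3 discrete conformational states, and a hypothetical computer is taken to evaluate one billion conformations per second. Neither figure comes from the research described here.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def conformation_count(n_residues, states_per_residue=3):
    """Size of the search space if every residue independently
    takes one of `states_per_residue` discrete states (an assumption)."""
    return states_per_residue ** n_residues

def exhaustive_search_years(n_residues, states_per_residue=3, evals_per_second=1e9):
    """Time to enumerate every conformation at an assumed evaluation rate."""
    seconds = conformation_count(n_residues, states_per_residue) / evals_per_second
    return seconds / SECONDS_PER_YEAR

for n in (10, 20, 50, 100):
    print(f"{n:>3} residues: {conformation_count(n):.3e} conformations, "
          f"about {exhaustive_search_years(n):.3e} years to enumerate")
```

Under these assumptions a 20-residue chain is enumerated in seconds, while a 100-residue chain would take vastly longer than the age of the universe, which is why exhaustive physics-based methods stall at a few dozen residues.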

One of the main advantages of template-based deep learning methods like AlphaFold2 and RoseTTAFold is the sheer size of the structures they can model. They are not limited to a few dozen residues, as the physics-based methods are. Both AlphaFold2 and RoseTTAFold can readily produce models up to a couple of thousand residues in length, in part because the database of experimentally determined structures they draw on (PDB: https://www.rcsb.org/) spans this size range.

Despite this impressive capability, success still depends on how accurate the models are. Specifically, the models are known to produce discrepancies when dealing with proteins that have (a) mutated sequences or (b) intrinsically disordered regions.

The Future of Structural Biology

The future of structural biology depends profoundly on developing and scaling up efficient physics-based PSP methods. Quantum computing’s potential to simulate nature’s fundamental mechanics may help overcome the historical barriers. Quantum and classical approaches need not be mutually exclusive; they could be synergistic, with quantum PSP methods complementing the strengths of classical template-based methods, and vice versa. Multidisciplinary collaboration among biophysics, chemistry, computer science (both classical and quantum), and structural biology can help unravel the remaining mysteries. For the first time in decades, elucidating how and why proteins fold appears within reach, a milestone that could deeply advance our comprehension of the subtle intricacies enabling life.