Quantum Computing and AI Revolutionise Life Sciences, Tackling Biological Complexity

Life sciences are increasingly using artificial intelligence (AI) and quantum computing to understand the complexity of living organisms. Researchers Alexey Pyrkov, Alex Aliper, Dmitry Bezrukov, Dmitriy Podolskiy, Feng Ren, and Alex Zhavoronkov are developing a theoretical framework to integrate quantum computing into the study of biological complexity. They believe that quantum computing, with its unprecedented computational power, could be the next significant leap in technology for life sciences.

Companies like IBM and D-Wave are already releasing quantum processors, indicating the commercial potential of this field. The researchers anticipate that the synergy of AI, quantum computing, and complex systems physics could lead to significant advancements in life sciences.

Quantum Computing and AI: A New Era for Life Sciences

The field of life sciences has made significant strides in understanding living organisms at various levels, such as genes, cells, molecules, tissues, and pathways. Now, the focus is shifting towards integrating these components to comprehend their collective behavior. This shift necessitates a general conceptual framework for understanding complexity in life sciences, a transition being facilitated by large-scale data collection, unprecedented computational power, and new analytical tools. In recent years, life sciences have been revolutionized by AI methods, and quantum computing is touted as the next significant technological leap. This article provides a theoretical framework to orient researchers around key concepts of how quantum computing can be integrated into the study of the hierarchical complexity of living organisms and discusses recent advances in quantum computing for life sciences.

Understanding Complexity in Life Sciences

Complexity is a characteristic of very diverse real-world systems, such as ecosystems, economies, traffic systems, and the Internet. Biological systems display a level of hierarchical complexity that is unique and unmatched even by the most complex inanimate objects. Living systems share certain key features, including networks of interacting elements, a hierarchical modular structure, nonlinear dynamics, and emergent properties that cannot be fully explained by examining individual components. There are many frameworks and methods for modeling complex systems, depending on the specific questions being asked and the nature of the system being studied. However, none of them captures the complexity of living organisms in a fully consistent way, and many directions for their development are currently being explored.

Paradigms of Modeling in Life Sciences

In the scientific community, and in life sciences in particular, there are two distinct paradigms of modeling: first-principles theory modeling and data-driven modeling. Until recently, life sciences were driven mostly by the first-principles paradigm. It consists of observing a biological phenomenon, suggesting hypotheses as to why it appears and how it works, constructing experiments to verify the hypotheses, and developing approximate mathematical models from first principles to explain it.

Nevertheless, it is often possible to observe only a fraction of the relevant biology, and because our understanding of biology is incomplete at every level, from omics to entire organisms, only a small portion of the underlying phenomena can be modeled accurately. Making the simulation of phenomena computationally tractable requires additional assumptions and approximations, which result in a further loss of biological and physical accuracy.
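
To make the first-principles paradigm concrete, the sketch below integrates the classic Michaelis-Menten rate law for enzyme kinetics, a textbook example of a model derived from mechanistic assumptions; the parameter values are illustrative choices, not figures from the article.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Michaelis-Menten enzyme kinetics: a first-principles model derived
# from mass-action assumptions. Parameter values are illustrative.
V_MAX = 1.0  # maximal reaction rate (arbitrary units)
K_M = 0.5    # Michaelis constant (substrate level at half V_MAX)

def substrate_decay(t, s):
    # d[S]/dt = -V_max * [S] / (K_m + [S])
    return -V_MAX * s / (K_M + s)

solution = solve_ivp(substrate_decay, t_span=(0.0, 10.0), y0=[2.0],
                     t_eval=np.linspace(0.0, 10.0, 50))
print(solution.y[0][-1])  # substrate concentration remaining at t = 10
```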

In contrast, a data-driven modeling paradigm has emerged in recent years that relies solely on data and advanced computing capabilities. It posits that the data contain all the biology and physics driving a particular process, so models for that process can be built even without knowledge of the governing biological and physical laws. This paradigm is currently gaining significant attention because of the availability of unprecedented amounts of data, openly accessible state-of-the-art machine learning and data analytics libraries, and high-performance computational resources.
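
By contrast, a data-driven model learns the same kind of input-output relationship directly from measurements, with no rate law supplied. The toy sketch below fits a random forest to synthetic "observations"; the data-generating function and the model choice are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Data-driven modeling: learn the substrate-to-rate relationship purely
# from (synthetic) measurements, with no knowledge of the rate law.
rng = np.random.default_rng(0)
substrate = rng.uniform(0.0, 5.0, size=(200, 1))
rate = 1.0 * substrate[:, 0] / (0.5 + substrate[:, 0])  # hidden "truth"
rate += rng.normal(scale=0.02, size=200)                 # measurement noise

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(substrate, rate)
print(model.predict([[1.0]]))  # predicted rate at substrate level 1.0
```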

Challenges in Modeling Complexity in Life Sciences

First, effective modeling tools must capture the scaling behavior of living systems, from individual cells to entire organisms. Distinct solvers are frequently required at each scale, and exchanging information between them is critical for capturing self-consistent transitions between scales. This is referred to as the scaling complexity problem.
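
As a rough illustration of such scale coupling, the sketch below lets a coarse-grained solver periodically query a fine-scale (here, a mock stochastic) model for an effective parameter. Real multiscale schemes are far richer; every quantity here is an assumption of the sketch.

```python
import numpy as np

# Minimal multiscale-coupling sketch: a fine-scale stochastic model
# periodically re-estimates an effective rate that a coarse ODE solver
# then uses for its next stretch of time steps.
rng = np.random.default_rng(2)

def fine_scale_rate(n_samples=1000):
    # Stand-in for a microscopic simulation: noisy estimate of a rate.
    return np.mean(rng.normal(loc=0.8, scale=0.1, size=n_samples))

x, dt = 1.0, 0.01
for step in range(500):
    if step % 100 == 0:       # exchange information between scales
        k_eff = fine_scale_rate()
    x += -k_eff * x * dt      # coarse-scale explicit Euler update
print(x)  # coarse state after 5 time units, driven by fine-scale rates
```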

A balance between interpretability, transfer between scales, precision, and speed can be achieved by combining first-principles modeling (quantum and classical) with data-driven modeling, which could pave the way for significant breakthroughs in various applications. Incorporating essential physical and biological features, such as symmetry, conservation, stability, and resilience, into data-driven models can greatly enhance their predictive capabilities.
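
One minimal way to picture such a hybrid is a loss function that trades off agreement with data against a soft penalty for violating a known conservation law, as in the illustrative sketch below (the data, the exponential form, and the conserved total are all assumptions of the sketch, not the authors' construction).

```python
import numpy as np
from scipy.optimize import minimize

# Hybrid modeling sketch: fit two decay curves to noisy data while
# softly enforcing the physical constraint that A(t) + B(t) is conserved.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 5.0, 40)
a_obs = 2.0 * np.exp(-0.8 * t) + rng.normal(scale=0.05, size=t.size)
b_obs = 2.0 - 2.0 * np.exp(-0.8 * t) + rng.normal(scale=0.05, size=t.size)
TOTAL = 2.0  # known conserved quantity (an assumption of the sketch)

def loss(params, weight=10.0):
    a0, ka, b0, kb = params
    a_pred = a0 * np.exp(-ka * t)
    b_pred = TOTAL - b0 * np.exp(-kb * t)
    data_term = np.mean((a_pred - a_obs) ** 2 + (b_pred - b_obs) ** 2)
    # Soft penalty keeping the learned curves consistent with conservation.
    conservation_term = np.mean((a_pred + b_pred - TOTAL) ** 2)
    return data_term + weight * conservation_term

result = minimize(loss, x0=[1.0, 1.0, 1.0, 1.0], method="Nelder-Mead")
print(result.x)  # fitted parameters honoring both data and conservation
```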

Second, there is algorithmic complexity: many traditional techniques in life sciences have superpolynomial complexity at each scale. This is because biological systems are often highly complex and involve many interacting components, which can result in exponential or super-exponential growth in the number of possible states the system can take on. For example, modeling the behavior of a network of neurons in the brain can require simulating the activity of thousands or even millions of individual cells, each with its own complex set of inputs and outputs. Other examples of problems with superpolynomial algorithmic complexity in life sciences include genome assembly, multiple sequence alignment, and phylogenetic tree reconstruction. These problems require sophisticated algorithms that can handle large and complex state spaces and huge data sets, and they often involve trade-offs between computational efficiency and accuracy. As a result, many modeling methods in life sciences rely on computational techniques such as machine learning, neural networks, and other forms of artificial intelligence that can efficiently process large amounts of data and identify patterns and relationships that would be difficult to detect using traditional methods.
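
For a concrete sense of this growth, the snippet below counts the number of distinct unrooted binary phylogenetic trees on n labeled taxa, which follows the standard (2n - 5)!! formula:

```python
# Number of distinct unrooted binary phylogenetic trees on n labeled
# taxa grows as (2n - 5)!!, illustrating the superpolynomial search
# spaces behind phylogenetic tree reconstruction.
def num_unrooted_trees(n):
    count = 1
    for k in range(3, 2 * n - 4, 2):  # product of odds 3, 5, ..., 2n - 5
        count *= k
    return count

for taxa in (5, 10, 20, 50):
    print(taxa, num_unrooted_trees(taxa))
# 10 taxa already allow ~2 million trees; 50 taxa exceed 10^74.
```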

The third aspect of complexity in life sciences is data complexity: the problem of big data in some cases and small data in others, both of which limit the effectiveness of modeling. In this context, the emergence of high-throughput genomic techniques has transformed biology, particularly genomics and proteomics, into an information science. For instance, a single sequenced human genome amounts to approximately 140 gigabytes. Moreover, as reported in 2021, the 1000 Genomes Project, which sequences and documents human genetic variation, contributed twice as much raw data to GenBank in its first six months as all sequences accumulated over the previous 30 years. The UK Biobank has recently made available full genome sequences from all 500,000 British volunteers in its database. The Human Genome Project generated a massive amount of genomic data (an estimated 40 exabytes will be needed to store the genome-sequence data generated worldwide by 2025), and the International Cancer Genome Consortium (ICGC) has produced large-scale genomic data sets for various types of cancer. Similarly, imaging technologies like magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT) produce large volumes of data that can be used to study the structure and function of tissues and organs. These data sets are often very large and complex, presenting challenges for analysis and interpretation.

On the other hand, due to factors such as limited availability or high costs of data collection, strict experimental designs, or restrictions on data sharing over privacy concerns, many datasets in life sciences are small relative to the number of parameters needed to describe them. Small data appear ubiquitously in the early stages of research, in personalized medicine, in rare-disease research, and in some clinical trials, and new data-processing methods that generalize better from small data are required. This can make it difficult to model and predict the behavior of biological systems using traditional statistical methods.
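
A back-of-envelope calculation illustrates the big-data side described above; the coverage depth and per-base storage cost below are rough assumptions, which happen to land near the ~140 GB per-genome figure quoted earlier.

```python
# Back-of-envelope storage estimate for raw sequencing data.
# All figures are rough assumptions for illustration.
GENOME_BP = 3.2e9     # human genome length in base pairs (approximate)
COVERAGE = 30         # typical whole-genome sequencing depth
BYTES_PER_BASE = 1.5  # rough FASTQ cost per base incl. quality scores

bytes_per_genome = GENOME_BP * COVERAGE * BYTES_PER_BASE
print(f"~{bytes_per_genome / 1e9:.0f} GB per genome")  # ~144 GB
print(f"~{500_000 * bytes_per_genome / 1e15:.0f} PB for 500,000 genomes")
```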

Quantum Computing and AI: A Synergy for Life Sciences

Machine learning (ML) and AI are currently revolutionizing many areas of life sciences, from bioinformatics to drug discovery and clinical research. However, as with any data-driven approach, these methods face challenges in generalization, in processing small data, and in the huge resources required for training. In living systems, where the computational domain consists of a set of nested hierarchical models, AI methods work well on each hierarchical scale but cannot capture the whole picture. Combining data-driven AI approaches with quantum computing and quantum machine learning, and with methods from the physics of complex systems such as introducing an order parameter, may pave the way for addressing the problem of complexity in life sciences.
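
For a flavor of what quantum machine learning looks like in practice, here is a toy variational circuit written with the PennyLane library; the encoding, ansatz, and observable are arbitrary choices for illustration, not the authors' method.

```python
import pennylane as qml
from pennylane import numpy as np

# Minimal variational quantum circuit of the kind used in quantum
# machine learning: inputs are angle-encoded, a trainable layer is
# optimized classically, and an expectation value is the prediction.
dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(features, weights):
    qml.RY(features[0], wires=0)      # angle-encode the input features
    qml.RY(features[1], wires=1)
    qml.RX(weights[0], wires=0)       # trainable variational layer
    qml.RX(weights[1], wires=1)
    qml.CNOT(wires=[0, 1])            # entangle the two qubits
    return qml.expval(qml.PauliZ(1))  # scalar output used as prediction

features = np.array([0.3, 1.1])
weights = np.array([0.1, 0.2], requires_grad=True)
print(circuit(features, weights))
```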

With recent technological advancements, quantum computing is no longer confined to academic research; it has opened doors for commercial opportunities and can provide possibilities for developing a framework that captures the complexity of life sciences. Although its current scale may not match classical technologies, there is immense anticipation for the future potential of the field. IBM recently released a 1,121-qubit superconducting quantum processor with qubits arranged in a honeycomb pattern.

D-Wave has designed its 5,000-qubit quantum annealer for commercial use. A neutral-atom quantum computer with 280 physical qubits, 99.5% two-qubit gate fidelities, arbitrary connectivity, fully programmable single-qubit rotations, and mid-circuit readout was also recently realized. From an algorithmic perspective, it has been pointed out that current quantum algorithms show promise in drug discovery and biology. Specifically, these algorithms can handle ground-state calculations and linear systems at a faster rate, enabling more precise predictions of drug-receptor interactions and protein folding. Additionally, advances in quantum machine learning techniques can enhance classical AI techniques for generative chemistry. This indicates significant potential for synergy among AI, quantum computing, and the physics of complex systems, with each approach benefiting the others and driving future progress in these areas.
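
To illustrate the ground-state workflow mentioned above, here is a toy variational quantum eigensolver (VQE) sketch, again using PennyLane, applied to an arbitrary two-qubit Hamiltonian rather than a real molecule.

```python
import pennylane as qml
from pennylane import numpy as np

# Toy VQE: minimize the expected energy of an illustrative two-qubit
# Hamiltonian by tuning a small parameterized circuit classically.
H = qml.Hamiltonian([1.0, 0.5],
                    [qml.PauliZ(0) @ qml.PauliZ(1), qml.PauliX(0)])
dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def energy(params):
    qml.RY(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(H)

opt = qml.GradientDescentOptimizer(stepsize=0.2)
params = np.array([0.1, 0.1], requires_grad=True)
for _ in range(100):
    params = opt.step(energy, params)
# Optimized energy estimate; the exact ground-state energy of this H
# is -sqrt(1.25), approximately -1.118.
print(energy(params))
```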

Read more from the publication in Computational Molecular Science.