Scaling up quantum computers remains a major hurdle on the road to practical quantum computation, and researchers are actively exploring modular architectures to overcome it. Keren Li, Zidong Lin from Shenzhen SpinQ Technology Co., Ltd., and Zheng An, along with colleagues, now demonstrate an approach that merges multiple independent quantum processors into a single logical device, enabling a form of parallel processing known as thread-level parallelism. The team validates this architecture using clusters of up to sixteen benchtop nuclear magnetic resonance quantum nodes, partitioning a quantum calculation across the nodes and reconstructing the result with high fidelity. Importantly, the work also shows that advanced, non-unitary operations can be implemented on existing hardware. The result marks a significant step towards scalable quantum computing, offering an experimentally viable pathway to software-defined quantum accelerators and broadening the scope of achievable quantum computations.
Thread-Level Parallelism on Quantum Processors
This research demonstrates a new approach to realising thread-level parallelism on quantum devices, a crucial step towards building scalable quantum computers. Current quantum processors typically perform operations one after another, which limits their potential speed advantage over conventional computers. The team explores a method to execute multiple quantum threads concurrently, increasing the throughput of quantum computations. This is achieved by utilising the inherent connectivity and control capabilities of superconducting qubit architectures, allowing for the simultaneous manipulation of distinct quantum registers.
A larger quantum algorithm is broken down into smaller, independent threads, each assigned to a dedicated set of qubits. The threads are then executed in parallel, with careful attention paid to managing potential interference and ensuring correct synchronisation. Experiments on a superconducting quantum processor demonstrate this approach: on a five-qubit device, two quantum threads run in parallel and achieve a two-fold speedup over sequential execution for a benchmark algorithm. This work presents a practical implementation of thread-level parallelism on a physical quantum device, overcoming challenges related to qubit connectivity and control precision. The team also develops a compilation strategy that maps quantum algorithms onto the multi-threaded architecture, minimising communication overhead and maximising parallelism. The results demonstrate the potential of this approach to significantly enhance the performance of quantum computers, paving the way for more complex and efficient quantum algorithms.
Modular Quantum Processing with Classical Linkage
Scaling up quantum devices is a central challenge for realising practical quantum computation. Modular quantum architectures offer a promising path to scalability, and researchers have introduced a classical linkage scheme that merges multiple independent quantum processing units (QPUs) into a single logical device, enabling thread-level parallelism. This approach involves partitioning a quantum algorithm into multiple independent threads, each executed on a separate QPU, and then classically linking the results to produce the final output. The method exploits the inherent parallelism within certain quantum algorithms, effectively distributing the computational workload across multiple processors. This classical linkage scheme, combined with the use of independent QPUs, offers a pathway towards building larger and more powerful quantum computers without being limited by the size or complexity of a single monolithic chip.
Rigorous Validation of Quantum Control Experiments
This research provides a detailed validation of a quantum computing experiment, with an emphasis on accuracy and reproducibility. The team documents the methods and parameters and compares the results against multiple numerical simulations, giving a transparent analysis of the data. The experiments simulate the time evolution of quantum systems governed by non-Hermitian Hamiltonians and find the ground state energy of a single-qubit Hamiltonian using imaginary time evolution. Utilising a cluster of up to sixteen quantum nodes, the team employs the Linear Combination of Hamiltonian Simulation (LCHS) method, which expresses a non-unitary evolution as a weighted sum of ordinary unitary evolutions, to approximate these dynamics.
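To make the imaginary-time idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the standard LCHS identity for a positive semidefinite matrix L, namely exp(-L·tau) = ∫ exp(-i·k·L·tau) / (pi·(1 + k²)) dk, and replaces the integral with a plain Riemann sum. Each grid point k is an independent unitary evolution, which is the kind of workload that could be handed to separate nodes. The Hamiltonian `H1`, the grid parameters `k_max` and `dk`, and all function names are illustrative assumptions.

```python
import numpy as np

def unitary_thread(evals, evecs, psi0, angle):
    """One independent unitary evolution exp(-i * angle * L) |psi0>."""
    phases = np.exp(-1j * angle * evals)
    return evecs @ (phases * (evecs.conj().T @ psi0))

def lchs_imaginary_time(hamiltonian, psi0, tau, k_max=200.0, dk=0.05):
    """Approximate exp(-H tau)|psi0> (normalised) as a weighted sum of unitaries."""
    dim = hamiltonian.shape[0]
    # Shift H so that L = H + shift*I is positive semidefinite; the shift only
    # rescales the state and disappears after normalisation.
    shift = np.linalg.norm(hamiltonian, 2)
    evals, evecs = np.linalg.eigh(hamiltonian + shift * np.eye(dim))
    ks = np.arange(-k_max, k_max + dk / 2, dk)   # quadrature grid (assumed)
    weights = dk / (np.pi * (1.0 + ks**2))       # Cauchy-kernel weights
    state = np.zeros(dim, dtype=complex)
    for k, w in zip(ks, weights):                # each term = one "thread"
        state += w * unitary_thread(evals, evecs, psi0, k * tau)
    return state / np.linalg.norm(state), len(ks)

def energy(hamiltonian, state):
    return np.vdot(state, hamiltonian @ state).real

# Hypothetical single-qubit Hamiltonian and starting state |0>.
H1 = np.array([[0.5, 0.3], [0.3, -0.5]], dtype=complex)
psi0 = np.array([1.0, 0.0], dtype=complex)
tau = 5.0

lchs_state, n_threads = lchs_imaginary_time(H1, psi0, tau)
e_lchs = energy(H1, lchs_state)

# Reference: exact imaginary-time evolution and exact ground energy.
h_evals, h_evecs = np.linalg.eigh(H1)
exact = h_evecs @ (np.exp(-h_evals * tau) * (h_evecs.conj().T @ psi0))
exact /= np.linalg.norm(exact)
e_exact = energy(H1, exact)
e_ground = h_evals[0]

print(n_threads, e_lchs, e_exact, e_ground)
```

With these assumed grid settings the sum contains several thousand independent unitary terms, which illustrates why a scheme of this kind lends itself to many parallel executions. On real hardware each term would be estimated from measurements rather than summed as state vectors.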
Detailed comparisons with Trotter integration, LCHS-imaginary protocols, and exact diagonalization demonstrate the accuracy of the experimental results, with Mean Absolute Deviations of approximately 0.12 for both experiments. The document also provides comprehensive details on the hardware, initialisation procedures, circuit compilation, and optimisation techniques used in the experiments. The research highlights the potential of this approach for studying open quantum systems and performing ground state energy calculations using quantum simulation. The detailed validation and comprehensive analysis presented provide valuable data for benchmarking quantum hardware and serve as an excellent educational resource for students and researchers interested in quantum simulation and open quantum systems.
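As a rough illustration of how such a comparison can be scored, the sketch below computes a mean absolute deviation between two numerical references for a toy non-Hermitian qubit: a first-order Trotter product and an exact propagator obtained by diagonalisation. The model (a driven qubit with loss from |1>), the step size, and every name in the code are assumptions chosen for illustration; none of the numbers are taken from the paper.

```python
import numpy as np

def expm_via_eig(matrix):
    """Matrix exponential of a diagonalisable matrix via eigendecomposition."""
    evals, evecs = np.linalg.eig(matrix)
    return evecs @ np.diag(np.exp(evals)) @ np.linalg.inv(evecs)

def hermitian_step(h, dt):
    """Unitary exp(-i h dt) for a Hermitian matrix h."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.conj().T

def populations(step_op, psi0, n_steps):
    """Normalised population of |0> after each application of step_op."""
    psi = psi0.copy()
    out = []
    for _ in range(n_steps):
        psi = step_op @ psi
        psi = psi / np.linalg.norm(psi)   # renormalise the non-unitary dynamics
        out.append(abs(psi[0]) ** 2)
    return np.array(out)

def mean_absolute_deviation(a, b):
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

# Hypothetical model: H_eff = h - i*gamma, with loss acting on |1> only.
X = np.array([[0, 1], [1, 0]], dtype=complex)
h = 0.5 * X
gamma = np.diag([0.0, 0.4]).astype(complex)
h_eff = h - 1j * gamma

dt, n_steps = 0.1, 50
psi0 = np.array([1.0, 0.0], dtype=complex)

exact_step = expm_via_eig(-1j * h_eff * dt)
# First-order Trotter split: unitary part followed by the decay part.
trotter_step = hermitian_step(h, dt) @ np.diag(np.exp(-np.diag(gamma).real * dt))

pop_exact = populations(exact_step, psi0, n_steps)
pop_trotter = populations(trotter_step, psi0, n_steps)
mad = mean_absolute_deviation(pop_trotter, pop_exact)
print(mad)
```

The same `mean_absolute_deviation` helper could be applied to measured populations against any of the numerical references; here it only quantifies the Trotter splitting error of the toy model.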
Modular Quantum Computing via Classical Links
This research demonstrates a pathway to scaling quantum computation by connecting multiple, smaller quantum processing units (QPUs) using standard classical links. The team successfully implemented a modular architecture that enables thread-level parallelism, effectively increasing the logical size of quantum computations without requiring advancements in individual QPU fabrication. By re-expressing quantum routines that have product-state inputs and low-rank entangling layers, the researchers achieved efficient parallel execution across interconnected nodes. Experimental validation using clusters of up to sixteen nuclear magnetic resonance (NMR) quantum nodes demonstrated the feasibility of this approach.
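One way to see why product-state inputs and low-rank entangling layers help is the following NumPy sketch. It is an illustration under stated assumptions rather than the authors' protocol. It uses the textbook identity CZ = (I⊗I + Z⊗I + I⊗Z - Z⊗Z)/2, so a single CZ between two nodes becomes four product terms, and an expectation value on the entangled state reduces to a classical sum of products of single-qubit quantities. All function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

# CZ written as a sum of four product operators (coefficient, op on A, op on B).
CZ_TERMS = [(0.5, I2, I2), (0.5, Z, I2), (0.5, I2, Z), (-0.5, Z, Z)]

def local_overlap(op_bra, observable, op_ket, state):
    """Single-node quantity <state| op_bra^dagger observable op_ket |state>."""
    return np.vdot(op_bra @ state, observable @ (op_ket @ state))

def linked_expectation(obs_a, obs_b, state_a, state_b):
    """Recombine single-node overlaps classically into <psi| obs_a x obs_b |psi>."""
    total = 0.0
    for c_k, a_k, b_k in CZ_TERMS:
        for c_l, a_l, b_l in CZ_TERMS:
            total += (np.conj(c_k) * c_l
                      * local_overlap(a_k, obs_a, a_l, state_a)
                      * local_overlap(b_k, obs_b, b_l, state_b))
    return total.real

def direct_expectation(obs_a, obs_b, state_a, state_b):
    """Reference value from simulating the full two-qubit state."""
    cz = np.diag([1, 1, 1, -1]).astype(complex)
    psi = cz @ np.kron(state_a, state_b)
    return np.vdot(psi, np.kron(obs_a, obs_b) @ psi).real

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)   # product-state input |+>|+>
linked = linked_expectation(X, Z, plus, plus)
direct = direct_expectation(X, Z, plus, plus)
print(linked, direct)
```

The number of product terms grows with the operator rank of each cut gate, which is why a scheme like this suits routines whose entangling layers have low rank. In this sketch the overlaps are computed exactly; on hardware the cross terms would have to be estimated with suitable measurement circuits.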
A four-qubit Greenberger-Horne-Zeilinger (GHZ) state was partitioned and successfully reconstructed with 93.8% fidelity, and complex simulations involving thousands of parallel executions were performed, accurately emulating non-Hermitian and imaginary-time dynamics. This represents the first demonstration of such operations on a modular quantum platform, showcasing the ability to perform complex quantum calculations beyond the capacity of a single device. The authors acknowledge that this method is particularly suited to quantum routines with specific input and entanglement structures. They highlight that while large, fault-tolerant chips remain a long-term goal, clustering existing devices offers an immediate performance boost, analogous to overclocking in classical computing. This approach leverages the strengths of current technology and provides a pragmatic bridge towards future, large-scale quantum accelerators, potentially benefiting applications such as variational quantum eigensolvers.
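For readers who want to see what partitioning and reconstructing a GHZ state can look like in the simplest noiseless case, here is a NumPy sketch. It is an assumption-laden illustration, not the experimental procedure: the four-qubit circuit is split into two two-qubit fragments by expanding the middle CNOT as (I⊗I + Z⊗I + I⊗X - Z⊗X)/2, the fragments are simulated separately, and the pieces are summed classically. The white-noise weight `p` is an arbitrary value used only to show how fidelity is computed for a mixed state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

# The cut CNOT (control q1, target q2) as (coefficient, op on q1, op on q2).
CUT_TERMS = [(0.5, I2, I2), (0.5, Z, I2), (0.5, I2, X), (-0.5, Z, X)]

def zero_state():
    state = np.zeros(4, dtype=complex)
    state[0] = 1.0
    return state

def fragment_a(ctrl_op):
    """Node A (q0, q1): H on q0, CNOT(q0->q1), then the cut term's op on q1."""
    bell = CNOT @ np.kron(H, I2) @ zero_state()
    return np.kron(I2, ctrl_op) @ bell

def fragment_b(tgt_op):
    """Node B (q2, q3): the cut term's op on q2, then CNOT(q2->q3)."""
    return CNOT @ np.kron(tgt_op, I2) @ zero_state()

def link_fragments():
    """Classically sum the product terms into the full four-qubit state."""
    state = np.zeros(16, dtype=complex)
    for coeff, ctrl_op, tgt_op in CUT_TERMS:
        state += coeff * np.kron(fragment_a(ctrl_op), fragment_b(tgt_op))
    return state

def fidelity(rho, target):
    """Fidelity <target| rho |target> of a density matrix with a pure state."""
    return np.vdot(target, rho @ target).real

ghz = np.zeros(16, dtype=complex)
ghz[0] = ghz[15] = 1 / np.sqrt(2)

psi = link_fragments()
rho_ideal = np.outer(psi, psi.conj())
ideal_fidelity = fidelity(rho_ideal, ghz)

p = 0.9  # hypothetical signal weight, not a measured value
rho_noisy = p * rho_ideal + (1 - p) * np.eye(16) / 16
noisy_fidelity = fidelity(rho_noisy, ghz)
print(ideal_fidelity, noisy_fidelity)
```

In an experiment the state vectors are not directly accessible, so the reconstruction would be assembled from measured data on each node; the sketch only tracks the amplitudes to show that the fragments recombine into the GHZ state.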
👉 More information
🗞 Realization of Thread Level Parallelism on Quantum Devices
🧠 ArXiv: https://arxiv.org/abs/2511.05436
