Scientists at Delft University of Technology have unveiled a new reinforcement learning approach to optimise circuit routing for distributed quantum computing, a paradigm in which qubits are spread across multiple interconnected quantum processor modules. Joost Van Veen and colleagues have developed a method that minimises circuit execution time through efficient qubit placement and the use of remote state generation. The approach delivers a relative reduction of up to 35% in modelled execution time compared with existing methods, a significant step towards practical, scalable distributed quantum computers.
Reinforcement learning optimises qubit routing for a sharp speedup in distributed quantum computing
A 35% reduction in modelled execution time for quantum circuits marks a substantial improvement over previous methods in quantum computation. Traditionally, scaling quantum processors has followed a monolithic approach, attempting to integrate ever more qubits onto a single chip; this faces fundamental limitations in fabrication, control complexity, and signal integrity. Distributed quantum computing (DQC) offers a viable alternative, partitioning the quantum processor into smaller, manageable modules interconnected by a quantum network. In such an architecture, the challenge of quantum circuit compilation shifts from intra-module qubit placement and routing to inter-module qubit allocation and communication, and the new method directly addresses this challenge. Training reinforcement learning agents for distributed quantum systems previously demanded prohibitive computational resources, making practical optimisation infeasible; this performance leap crosses a key threshold, opening avenues for real-world implementation. The new agent, developed by researchers at Delft University of Technology, employs a refined approach to qubit placement and routing across interconnected quantum processor modules, effectively managing the complexities introduced by distribution.
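The shift from intra-module to inter-module allocation can be made concrete with a toy model. The sketch below is illustrative only, not the authors' code: the module names, qubit assignment, and circuit are made up. It counts how many two-qubit gates in a given placement cross a module boundary, since each such gate would need remote entanglement over the quantum network:

```python
# Hypothetical sketch: two modules, each housing a disjoint set of qubits.
# A two-qubit gate whose operands sit in different modules requires remote
# state generation, which dominates execution time in a DQC architecture.
modules = {
    "A": {0, 1, 2},
    "B": {3, 4, 5},
}
module_of = {q: m for m, qubits in modules.items() for q in qubits}

def count_remote_gates(circuit):
    """Count two-qubit gates that cross a module boundary under this placement."""
    return sum(1 for (q1, q2) in circuit if module_of[q1] != module_of[q2])

circuit = [(0, 1), (1, 3), (4, 5), (2, 5)]  # two-qubit gates as qubit pairs
print(count_remote_gates(circuit))  # → 2 gates need inter-module communication
```

A compiler (or a learning agent) that re-places qubits to shrink this count is, in miniature, solving the allocation problem the article describes.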
Optimising the complex task of managing qubit communication and operations paves the way for more efficient and scalable distributed quantum computers. The agent achieved its 35% reduction in modelled execution time with 30% less wall-clock training time than baseline methods, an efficiency gain that matters because the computational cost of training reinforcement learning agents can be substantial. A new action-space formulation allows exploration of more efficient qubit placement and routing strategies: rather than considering all possible qubit assignments, the agent focuses on promising configurations. At the same time, action-masking strategies filter out invalid or unproductive actions during training, preventing the agent from wasting time on non-physical or detrimental routing paths. The agent also uses efficient Q-value approximation to accelerate learning; Q-values represent the expected cumulative reward for taking a specific action in a given state, and accurate estimation of these values is vital for effective reinforcement learning. Benchmarks were conducted on randomly generated quantum circuits of varying complexity, from simple circuits with a few gates to more intricate ones with dozens, run on hardware graphs with differing connectivity that represent the physical layout of qubits and their connections. These hardware graphs simulated various architectures, including linear chains, 2D grids, and more complex topologies, demonstrating the robustness of the approach across different quantum processor designs.
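Action masking is a standard reinforcement learning technique, and the paper's exact formulation is not reproduced here. The minimal NumPy sketch below shows the general idea: invalid actions are assigned a Q-value of negative infinity before the greedy step, so the policy can never select them:

```python
import numpy as np

def masked_greedy_action(q_values, action_mask):
    """Pick the highest-Q action among those the mask allows.

    q_values    : array of estimated Q-values, one per candidate action.
    action_mask : boolean array; True marks a physically valid action
                  (e.g. a SWAP between adjacent qubits, or remote state
                  generation over an existing inter-module link).
    """
    masked = np.where(action_mask, q_values, -np.inf)  # invalid actions can never win
    return int(np.argmax(masked))

# Toy example: four candidate routing actions, one of them non-physical.
q = np.array([0.2, 0.9, 0.5, 0.1])
mask = np.array([True, False, True, True])  # action 1 is invalid here
print(masked_greedy_action(q, mask))  # prints 2, the best valid action
```

Masking prunes the search at every step rather than penalising bad moves after the fact, which is one reason it can cut wall-clock training time.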
Reinforcement learning optimises qubit routing and reduces execution time in distributed quantum computing
Scaling quantum computers demands increasingly clever ways to manage information flow, and this research offers a promising step forward. The inherent challenge is that qubits are fragile and susceptible to decoherence, the loss of quantum information, and every operation introduces some error. Efficient routing minimises the number of operations required to execute a circuit, reducing the overall error rate and improving the reliability of the computation. Real quantum devices introduce noise, which could limit the benefits of optimised routing, yet this work establishes a strong foundation for managing complexity in future quantum systems; the impact of noise will require further investigation, and potentially the incorporation of noise-aware routing strategies into the reinforcement learning framework. By refining how the agent explores potential routing options and restricting it to feasible actions, the researchers have laid groundwork for more complex algorithms capable of handling larger quantum circuits and real-world hardware limitations. The current work focuses on execution time, but future research could extend the objective to other important metrics, such as energy consumption and qubit coherence time. However, reliance on numerical comparisons alone leaves a vital question unanswered: how well will this approach translate to actual quantum hardware, where the effects of noise and decoherence rapidly degrade qubit states? Further validation on physical quantum processors is essential to confirm the practical benefits of this reinforcement learning approach and to assess its resilience to real-world imperfections. The 35% improvement, while significant in simulation, must be demonstrated on a tangible quantum system to fully realise its potential.
This research demonstrated a 35% reduction in modelled execution time for quantum circuits using a refined reinforcement learning agent. This matters because building larger quantum computers is proving difficult, and distributing qubits across multiple smaller processors offers a potential solution. The agent improves the efficiency of routing information between these processors, optimising how quantum operations are performed across the system. Researchers achieved this by improving how the agent explores routing options and ensuring it considers only feasible actions, building a foundation for managing more complex quantum circuits.
👉 More information
🗞 Rethinking How to Act: Action-Space Engineering for Reinforcement Learning-Based Circuit Routing in Distributed Quantum Systems
🧠 arXiv: https://arxiv.org/abs/2605.02389
