Reinforcement Learning Optimizes Entangling Gate Sequences for Parameterized Quantum Circuits, Reducing Gate Count

The challenge of building practical quantum computers centres on overcoming the inherent noise present in current devices, particularly within the operations that link quantum bits together. Tom R. Rieckmann, Stefan Scheel, and A. Douglas K. Plato, all from the Institute for Physics at the University of Rostock, now present a method to significantly improve the performance of quantum circuits by intelligently optimising the sequence of these linking operations. Their work demonstrates a reinforcement learning algorithm that designs more efficient circuits for preparing quantum states, achieving higher fidelity with fewer gates than standard approaches. This advancement addresses a critical limitation in quantum computing, allowing researchers to make the most of existing hardware and paving the way for more complex and reliable quantum computations.

Reinforcement Learning Optimizes Variational Quantum Circuits

This study presents a reinforcement learning algorithm that optimizes quantum circuits for state preparation, addressing limitations imposed by noise in current quantum computing devices. Researchers focused on minimizing the number of entangling gates, a primary source of error, while keeping circuit depth low enough to suit systems with limited coherence times. This work extends previous approaches by incorporating general single-qubit operations, consistently achieving higher state preparation fidelities than standard hardware-efficient designs with the same number of gates. The team engineered a system where a reinforcement learning agent learns to optimize the sequence of entangling gates within a parameterized universal gate set.

This agent was trained to maximize fidelity when preparing a target quantum state, starting from a defined initial state. Crucially, the algorithm accounts for the specific connectivity architecture of the qubits, mirroring the physical constraints of real quantum hardware. Experiments employed publicly available quantum computers from IBM, specifically the ibm_manila and ibm_quito systems, using their documented qubit connectivity and associated gate errors. To quantify performance, the researchers used fidelity, calculated as the trace of the product of the density matrix of the prepared state and that of the target state, which measures how closely the two states match. Comparison with standard hardware-efficient designs demonstrated the effectiveness of the approach in minimizing gate count and maximizing fidelity on realistic quantum hardware, paving the way for improved performance in variational quantum algorithms. The method is designed to be deployable on any gate-based quantum computing system, offering a versatile solution for noise mitigation and performance enhancement.
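The fidelity measure described above can be illustrated with a minimal NumPy sketch. It assumes the target state is pure, in which case the trace formula Tr(ρσ) reduces to the overlap ⟨ψ|ρ|ψ⟩; the function name, the Bell-state target, and the depolarizing strength `p` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def state_fidelity(rho: np.ndarray, target: np.ndarray) -> float:
    """Fidelity Tr(rho * sigma) with sigma = |psi><psi| for a pure target |psi>."""
    target = target / np.linalg.norm(target)   # make sure the target is normalized
    sigma = np.outer(target, target.conj())    # density matrix of the target state
    return float(np.real(np.trace(rho @ sigma)))

# Illustrative two-qubit example: a Bell state as the target.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_ideal = np.outer(bell, bell.conj())        # perfectly prepared state
rho_mixed = np.eye(4, dtype=complex) / 4       # maximally mixed state

p = 0.1                                        # hypothetical depolarizing strength
rho_noisy = (1 - p) * rho_ideal + p * rho_mixed

f_ideal = state_fidelity(rho_ideal, bell)
f_mixed = state_fidelity(rho_mixed, bell)
f_noisy = state_fidelity(rho_noisy, bell)
```

A perfectly prepared state gives a fidelity of 1, while the maximally mixed two-qubit state gives 1/4. For a mixed target state the simple trace formula no longer applies and the more general Uhlmann fidelity would be needed.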

Reinforcement Learning Optimizes Noisy Quantum Circuits

Scientists have achieved a breakthrough in optimizing quantum circuits for state preparation, demonstrating a reinforcement learning algorithm that significantly improves fidelity while minimizing the number of entangling gates. The work addresses a critical limitation of current quantum computing devices, which are susceptible to noise originating from entangling gates, and offers a pathway to more reliable computations on noisy intermediate-scale quantum (NISQ) systems. Researchers focused on optimizing the sequence of entangling gates within parameterized quantum circuits, allowing them to restrict the total number of gates required for a given operation while respecting the connectivity architecture of the qubits. The team developed a reinforcement learning agent capable of optimizing entangling gate sequences for a parameterized universal gate set, applying it specifically to the task of quantum state preparation.
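To make the connectivity constraint concrete, the sketch below enumerates the CNOT gates an agent could choose from on a given device. The coupling maps are assumptions based on the layouts commonly documented for these five-qubit devices (a line for ibm_manila, a T shape for ibm_quito), and treating each CNOT direction as a separate action is an illustrative choice rather than the paper's stated action space.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]

# Assumed undirected coupling maps; check the device documentation before relying on them.
COUPLING_MAPS = {
    "ibm_manila": [(0, 1), (1, 2), (2, 3), (3, 4)],   # linear chain
    "ibm_quito": [(0, 1), (1, 2), (1, 3), (3, 4)],    # T-shaped layout
}

def allowed_cnots(edges: Sequence[Edge]) -> List[Edge]:
    """List every (control, target) pair permitted by the coupling map."""
    actions = []
    for a, b in edges:
        actions.append((a, b))   # CNOT with control a, target b
        actions.append((b, a))   # CNOT with control b, target a
    return actions

def is_valid_sequence(sequence: Sequence[Edge], edges: Sequence[Edge]) -> bool:
    """Check that every CNOT in a proposed sequence acts on connected qubits."""
    allowed = set(allowed_cnots(edges))
    return all(gate in allowed for gate in sequence)

manila_actions = allowed_cnots(COUPLING_MAPS["ibm_manila"])
candidate = [(0, 1), (1, 3), (3, 4)]
valid_on_quito = is_valid_sequence(candidate, COUPLING_MAPS["ibm_quito"])
valid_on_manila = is_valid_sequence(candidate, COUPLING_MAPS["ibm_manila"])
```

Under these assumed layouts the same candidate sequence is valid on one device but not on the other, which is why the gate sequence has to be tailored to each processor.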

Results demonstrate that this approach consistently reaches higher state preparation fidelities compared to hardware-efficient ansatzes, even when using the same number of CNOT gates. This improvement is particularly significant because entangling gates are the dominant source of errors in most experimental systems, and minimizing their use directly translates to increased computational reliability. The algorithm was tested and validated using publicly available quantum computers from IBM, demonstrating its adaptability to real-world hardware constraints. Experiments revealed that the reinforcement learning approach effectively navigates the complexities of qubit connectivity, tailoring the gate sequence to the specific architecture of the quantum processor.

By incorporating arbitrary parameterized single-qubit gates, the team created a gate set closely aligned with the native gate sets employed on many experimental systems, such as those developed by IBM. This alignment allows for a more effective optimization process, leading to substantial improvements in state preparation fidelity. The team quantified accuracy using fidelity, a measure of how closely the prepared state matches the target state, and demonstrated a clear advantage over traditional hardware-efficient ansatzes. This breakthrough delivers a promising new method for enhancing the performance of quantum computations on NISQ devices, paving the way for more complex and reliable quantum algorithms.
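A small state-vector sketch can show how a CNOT sequence combined with parameterized single-qubit gates prepares a state. The circuit layout used here (one rotation layer, then a rotation on both qubits after each CNOT), the helper names, and the U3-style parameterization are assumptions for illustration and are not claimed to match the paper's implementation.

```python
import numpy as np

def u3(theta, phi, lam):
    """General single-qubit rotation in the common U3 parameterization."""
    return np.array([
        [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
        [np.exp(1j * phi) * np.sin(theta / 2),
         np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
    ], dtype=complex)

def apply_1q(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit (qubit 0 is the leftmost tensor axis)."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    return np.moveaxis(psi, 0, qubit).reshape(-1)

def apply_cnot(state, control, target, n):
    """Flip the target qubit on the part of the state where control is 1."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    t_axis = target - 1 if target > control else target  # axis shifts after slicing
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t_axis).copy()
    return psi.reshape(-1)

def prepare_state(params, cnots, n):
    """Hypothetical layout: U3 on every qubit, then CNOT + U3 on both qubits."""
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0                                   # start in |00...0>
    k = 0
    for q in range(n):
        state = apply_1q(state, u3(*params[k]), q, n)
        k += 1
    for c, t in cnots:
        state = apply_cnot(state, c, t, n)
        for q in (c, t):
            state = apply_1q(state, u3(*params[k]), q, n)
            k += 1
    return state

# Usage: one CNOT suffices for a Bell state when the first rotation acts as a Hadamard.
params = np.zeros((4, 3))
params[0] = (np.pi / 2, 0.0, np.pi)                  # U3(pi/2, 0, pi) equals H
state = prepare_state(params, cnots=[(0, 1)], n=2)
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
fid = abs(np.vdot(bell, state)) ** 2
```

In a full workflow the rotation angles would be tuned by a classical optimizer for each candidate CNOT sequence, and the resulting fidelity could serve as the reward signal for the agent; here the angles are set by hand to keep the example short.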

Optimised Gate Sequences Boost Quantum Fidelity

This research demonstrates a reinforcement learning algorithm capable of optimizing the sequence of entangling gates used in quantum state preparation, achieving improved fidelity compared to standard hardware-efficient approaches. By intelligently selecting gate sequences, the algorithm reduces the number of gates required while respecting the connectivity limitations of quantum hardware. The team successfully applied this method to simulations of existing IBM quantum devices, consistently reaching higher state preparation fidelities with fewer gates than traditional layered circuit designs. The findings reveal that the optimised gate sequences are particularly beneficial on devices with higher noise levels, where minimising the number of gates becomes crucial for maintaining fidelity.
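The trade-off between expressivity and noise can be illustrated with a toy model. Everything in this sketch is hypothetical: the saturating curve for the noiseless fidelity, its constants `a` and `b`, and the assumption that each CNOT multiplies the fidelity by a fixed factor (1 - eps) are illustrative stand-ins, not results or noise models from the paper.

```python
import numpy as np

def toy_noisy_fidelity(n_cnots, eps, a=0.9, b=0.4):
    """Toy model: more CNOTs improve the ideal fit but accumulate gate error."""
    ideal = 1.0 - a * np.exp(-b * n_cnots)     # hypothetical noiseless fidelity
    return ideal * (1.0 - eps) ** n_cnots      # hypothetical per-gate error penalty

def best_gate_count(eps, max_cnots=30):
    """Gate count that maximizes the toy fidelity for a given error rate."""
    counts = np.arange(max_cnots + 1)
    return int(counts[np.argmax(toy_noisy_fidelity(counts, eps))])

low_noise_optimum = best_gate_count(eps=0.01)
high_noise_optimum = best_gate_count(eps=0.03)
noiseless_optimum = best_gate_count(eps=0.0)
```

In this toy picture the optimal number of CNOT gates shrinks as the per-gate error grows, which is consistent with the qualitative point that short, well-chosen gate sequences matter most on noisier devices.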

Analysis across different devices, including ibm_manila and ibm_quito, shows that the algorithm adapts to varying noise characteristics, achieving peak fidelities with approximately 9 to 11 CNOT gates. The authors acknowledge discrepancies between simulated and real device performance, attributing these differences to inaccuracies in the noise parameters used in the simulations, and note that further refinement of these parameters is needed for accurate modelling. Future work will focus on addressing the limitations of the simulations by incorporating more realistic noise models and validating the algorithm’s performance on actual quantum hardware. The team also plans to investigate the impact of different qubit connectivity constraints, and explore the potential for extending this approach to more complex quantum algorithms.

👉 More information
🗞 Gate Sequence Optimization for Parameterized Quantum Circuits using Reinforcement Learning
🧠 ArXiv: https://arxiv.org/abs/2511.08096

Rohail T.

I am a quantum scientist exploring the frontiers of physics and technology. My work focuses on uncovering how quantum mechanics, computing, and emerging technologies are transforming our understanding of reality. I share research-driven insights that make complex ideas in quantum science clear, engaging, and relevant to the modern world.
