CERN Team Develops Quantum Reinforcement Learning for Improved Particle Accelerator Efficiency

A team of researchers from CERN, the University of Oviedo, and the Politehnica University of Bucharest has developed a hybrid actor-critic algorithm for quantum reinforcement learning at CERN beam lines. The research builds on free energy-based reinforcement learning with clamped quantum Boltzmann machines, which improves learning efficiency over classical Q-learning. The approach has been extended to multi-dimensional continuous state-action space environments, which is particularly beneficial for particle accelerator systems, where control parameters and observables are continuous. The team also developed a hybrid actor-critic scheme for continuous state-action spaces based on the Deep Deterministic Policy Gradient algorithm, combining a classical actor network with a QBM-based critic.

Quantum Reinforcement Learning at CERN

A team of researchers, consisting of Michael Schenk, Elías F. Combarro, Michele Grossi, Verena Kain, Kevin Shing Bruce Li, Mircea-Marian Popa, and Sofia Vallecorsa, has developed a hybrid actor-critic algorithm for quantum reinforcement learning at CERN beam lines. The team is affiliated with the European Organisation for Nuclear Research (CERN), the University of Oviedo, and the Politehnica University of Bucharest. The research focuses on free energy-based reinforcement learning (FERL) with clamped quantum Boltzmann machines (QBMs), which has been shown to improve learning efficiency significantly compared with classical Q-learning.
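To make the FERL idea concrete, the sketch below approximates Q(s, a) by the negative free energy of a Boltzmann machine whose visible units are clamped to the state-action pair. It uses a classical restricted Boltzmann machine, whose free energy has a closed form, as a simplified stand-in for the clamped quantum Boltzmann machine that the study samples with (simulated) quantum annealing; the class name, layer sizes, and initialisation are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


class ClampedRBMCritic:
    """Classical stand-in for the clamped (quantum) Boltzmann machine critic.

    In free energy-based RL, Q(s, a) is approximated by the negative free
    energy of a Boltzmann machine whose visible units are clamped to the
    state-action pair.  A classical RBM is used here purely for illustration.
    """

    def __init__(self, n_visible, n_hidden, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = 0.1 * rng.standard_normal((n_visible, n_hidden))  # couplings
        self.a = np.zeros(n_visible)                               # visible biases
        self.b = np.zeros(n_hidden)                                # hidden biases

    def q_value(self, state, action):
        """Q(s, a) ~ -F(v) with the visible layer clamped to v = (state, action)."""
        v = np.concatenate([state, action])
        pre_activation = self.b + v @ self.W
        # Closed-form RBM free energy: F(v) = -a.v - sum_j softplus(b_j + (vW)_j)
        free_energy = -v @ self.a - np.sum(np.logaddexp(0.0, pre_activation))
        return -free_energy


# Hypothetical usage with a toy two-dimensional state and one-dimensional action:
critic = ClampedRBMCritic(n_visible=3, n_hidden=8)
print(critic.q_value(np.array([0.2, -0.5]), np.array([0.1])))
```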

Quantum Reinforcement Learning in Real-World Applications

The FERL approach has been extended to multi-dimensional continuous state-action space environments, opening the door to a broader range of real-world applications. This is particularly important for particle accelerator systems, where control parameters and observables are usually continuous variables. The research also assesses the impact of experience replay on sample efficiency.
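As a point of reference for that experience-replay comparison, a minimal replay buffer looks like the following; the capacity and batch size are arbitrary illustrative choices, not values from the study.

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal experience-replay buffer of the kind whose effect on sample
    efficiency the study assesses (illustrative sketch only)."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniformly sample past transitions to break temporal correlations and
        # reuse each (expensive) interaction with the machine more than once.
        batch = random.sample(self.buffer, min(batch_size, len(self.buffer)))
        return list(zip(*batch))  # tuples of states, actions, rewards, ...
```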

Hybrid Actor-Critic Scheme for Quantum Reinforcement Learning

The researchers developed a hybrid actor-critic scheme for continuous state-action spaces based on the Deep Deterministic Policy Gradient (DDPG) algorithm, combining a classical actor network with a QBM-based critic. The results, obtained both with simulated quantum annealing and on D-Wave quantum annealing hardware, are discussed and compared with classical reinforcement learning methods.
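The sketch below illustrates the structural idea of such a hybrid scheme under simplifying assumptions: a deterministic classical actor proposes a continuous action, and its parameters are nudged to increase the critic's Q-value at the current state, in the spirit of the DDPG actor update. It reuses the ClampedRBMCritic stand-in sketched above, replaces the neural-network actor with a linear policy, and estimates the gradient by finite differences for brevity; none of these choices reflect the paper's actual implementation.

```python
import numpy as np


class LinearActor:
    """Deterministic classical actor pi(s) -> a (a linear policy standing in
    for the neural-network actor of the DDPG-based hybrid scheme)."""

    def __init__(self, state_dim, action_dim, lr=1e-3, rng=None):
        rng = rng or np.random.default_rng(1)
        self.W = 0.1 * rng.standard_normal((state_dim, action_dim))
        self.lr = lr

    def act(self, state):
        return np.tanh(state @ self.W)  # bounded continuous action

    def update(self, state, critic, eps=1e-3):
        # DDPG-style actor update: move parameters to increase the critic's
        # Q(s, pi(s)), here with a finite-difference gradient estimate
        # instead of backpropagation.
        grad = np.zeros_like(self.W)
        base = critic.q_value(state, self.act(state))
        for idx in np.ndindex(self.W.shape):
            self.W[idx] += eps
            grad[idx] = (critic.q_value(state, self.act(state)) - base) / eps
            self.W[idx] -= eps
        self.W += self.lr * grad  # gradient ascent on Q


# One illustrative interaction and update step:
state_dim, action_dim = 2, 1
actor = LinearActor(state_dim, action_dim)
critic = ClampedRBMCritic(n_visible=state_dim + action_dim, n_hidden=8)

state = np.array([0.3, -0.1])
# Exploration noise, as in DDPG; the resulting transition would normally be
# stored in the replay buffer sketched earlier.
noisy_action = actor.act(state) + 0.1 * np.random.default_rng(2).standard_normal(action_dim)

print("Q before actor update:", critic.q_value(state, actor.act(state)))
actor.update(state, critic)
print("Q after actor update: ", critic.q_value(state, actor.act(state)))
```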

Quantum Reinforcement Learning at CERN Beam Lines

The environments used throughout the research represent existing particle accelerator beam lines at CERN. The hybrid actor-critic agent is evaluated on the actual electron beam line of the Advanced Plasma Wakefield Experiment (AWAKE). This research aims to boost machine flexibility, availability, and beam reproducibility at CERN’s accelerator complex.
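An evaluation loop of the kind used to score a trained agent on such an environment might look as follows; the Gym-style reset()/step() interface and the episode count are assumptions made for illustration, not the actual AWAKE interface.

```python
import numpy as np


def evaluate(env, actor, n_episodes=10):
    """Roll out the trained actor greedily (no exploration noise) and report
    the average episode return.  `env` is assumed to expose a Gym-style
    reset()/step() interface wrapping a beam-line simulation or the machine
    itself; this interface is a hypothetical placeholder."""
    returns = []
    for _ in range(n_episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            action = actor.act(np.asarray(state))
            state, reward, done, _ = env.step(action)
            total += reward
        returns.append(total)
    return float(np.mean(returns))
```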

Conclusion and Future Directions

The research aims to remove the limitation to discrete state-action space environments by extending FERL to multi-dimensional continuous state-action spaces, opening the door to a broader range of real-world applications, particularly particle accelerator systems. The second objective is to compare classical RL algorithms with their quantum or hybrid counterparts in terms of both sample efficiency and the number of parameters required to model the Q-function. The hybrid scheme is validated and compared to its classical counterpart on a ten-dimensional environment of the electron beam line at the CERN AWAKE facility, both in simulation and on the real machine.
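For the parameter-count comparison mentioned above, the helpers below contrast the number of trainable parameters in an RBM-style critic with those of a dense feed-forward Q-network; the layer sizes in the usage example are hypothetical and not taken from the paper.

```python
def rbm_parameter_count(n_visible, n_hidden):
    """Couplings plus visible and hidden biases of a restricted Boltzmann machine."""
    return n_visible * n_hidden + n_visible + n_hidden


def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected Q-network, e.g. [11, 64, 64, 1]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


# Hypothetical ten-dimensional state plus one-dimensional action (11 visible units):
print(rbm_parameter_count(n_visible=11, n_hidden=16))  # 203
print(mlp_parameter_count([11, 64, 64, 1]))            # 4993
```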

The article titled “Hybrid actor-critic algorithm for quantum reinforcement learning at CERN beam lines” was published in the journal Quantum Science and Technology on February 5, 2024. The authors of the study are Michael Schenk, Elías F. Combarro, Michele Grossi, Verena Kain, Kevin Shing Bruce Li, Mircea-Marian Popa, and Sofia Vallecorsa. The DOI reference for the article is https://doi.org/10.1088/2058-9565/ad261b.