Scientists are using the Lights Out problem, a computational puzzle implemented on both 2D grids and Möbius ladder graphs, to benchmark the capabilities of current quantum hardware. Maksims Dimitrijevs and Abuzer Yakaryilmaz, from the Center for Quantum Computer Science at the University of Latvia, working in collaboration with Maria Palchiha from Riga Purvciems Secondary School, have evaluated Grover’s search algorithm using nine and sixteen qubits on publicly available IBM and IQM devices. Their experiments demonstrate measurable improvements between IBM’s Heron r1 and Heron r2 generations of quantum processors, signifying progress in quantum computing hardware between 2023 and 2024. This research is significant because it provides a practical assessment of real quantum device performance, revealing insights into device limitations, the impact of calibration, and the variability even within devices of the same manufacturing revision, ultimately guiding the optimisation of quantum circuits and hardware development.
Can quantum computers reliably solve problems beyond the reach of conventional machines? Experiments with a puzzle-like challenge called ‘Lights Out’ demonstrate measurable progress on actual quantum hardware. These tests reveal that newer devices are not always superior and that careful calibration is essential for achieving dependable results. Scientists are increasingly turning to benchmark problems to assess the capabilities of nascent quantum computers.
Despite advances pushing qubit counts beyond 100, current hardware remains constrained by noise and limited connectivity, defining the NISQ (Noisy Intermediate-Scale Quantum) era. Now, research focuses on rigorously testing quantum algorithms on available devices, moving beyond theoretical potential to practical performance. This work employs Grover’s Search, a quantum algorithm offering a quadratic speedup for unstructured search, applied to the classic Lights Out puzzle.
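To make the quadratic speedup concrete, the sketch below uses the standard textbook formula for Grover's search on an ideal, noise-free machine: with M marked items among N, the success probability after k iterations is sin²((2k+1)θ), where θ = arcsin(√(M/N)). The function names and the example numbers are illustrative assumptions and are not taken from the paper.

```python
import math

def grover_success_probability(n_items, n_marked, iterations):
    """Ideal (noise-free) probability of measuring a marked item."""
    theta = math.asin(math.sqrt(n_marked / n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2

def optimal_iterations(n_items, n_marked):
    """Iteration count that brings (2k+1)*theta closest to pi/2."""
    theta = math.asin(math.sqrt(n_marked / n_items))
    return max(0, round(math.pi / (4 * theta) - 0.5))

if __name__ == "__main__":
    # Illustrative search space: 64 candidates, 2 of them marked.
    k = optimal_iterations(64, 2)
    print(k, grover_success_probability(64, 2, k))
```

The iteration count grows roughly as √(N/M), whereas a classical unstructured search needs on the order of N/M checks. On noisy hardware each additional iteration also adds circuit depth, so the ideal probability is an upper bound and not a prediction.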
Investigations utilise publicly accessible quantum hardware from both IBM and IQM, with access to approximately 10 minutes of monthly runtime from IBM and one minute from IQM, to evaluate the performance of these systems. Initial experiments centre on designing problem instances of the Lights Out game on both a two-dimensional grid and more complex Möbius ladder graphs.
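For readers unfamiliar with the puzzle, here is a minimal classical sketch of Lights Out on a Möbius ladder. It assumes the usual rule that pressing a lamp toggles that lamp and its graph neighbours, and it models the ladder as a cycle plus "rungs" joining opposite vertices. The starting configuration and all function names are hypothetical. The brute-force search is a classical reference for what Grover's oracle has to recognise, not the authors' circuit.

```python
from itertools import product

def mobius_ladder_neighbours(n):
    """Neighbours on a Möbius ladder with n (even) vertices:
    an n-cycle plus rungs joining opposite vertices."""
    half = n // 2
    return {v: {(v - 1) % n, (v + 1) % n, (v + half) % n} for v in range(n)}

def press(state, v, neighbours):
    """Toggle lamp v and every lamp adjacent to it."""
    new = list(state)
    for u in {v} | neighbours[v]:
        new[u] ^= 1
    return tuple(new)

def solve(initial, neighbours):
    """Try every press pattern; keep those that switch all lamps off."""
    solutions = []
    for presses in product((0, 1), repeat=len(initial)):
        state = tuple(initial)
        for v, pressed in enumerate(presses):
            if pressed:
                state = press(state, v, neighbours)
        if not any(state):
            solutions.append(presses)
    return solutions

if __name__ == "__main__":
    graph = mobius_ladder_neighbours(6)
    # Hypothetical starting position: all lamps off, then lamp 0 pressed.
    start = press((0,) * 6, 0, graph)
    for pattern in solve(start, graph):
        print(pattern)
```

The brute-force loop checks all 2^n press patterns. That is the unstructured search space over which Grover's algorithm offers its quadratic advantage.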
These instances translate into quantum circuits requiring either 9 or 16 qubits, carefully balanced in terms of circuit depth and the number of two-qubit operations. Beyond these primary benchmarks, smaller Grover Search circuits serve as diagnostic tools to interpret results affected by hardware limitations. The circuits were run on a range of devices, including IBM’s Heron r1 and r2 processors and IQM’s Emerald, Garnet, and Sirius quantum processing units.
Observations reveal improvements in IBM hardware between the 2023 and 2024 generations, demonstrating tangible progress in quantum computer engineering. Results from IQM devices consistently produced output distributions approaching uniformity, prompting further investigation into the underlying causes of these limitations. Additional diagnostic tests, including a small Grover SAT baseline, helped researchers better understand device-specific behaviours and identify potential bottlenecks.
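One way to quantify how close an output distribution is to uniform is the total variation distance between the measured frequencies and the flat distribution. The sketch below is an assumption about how such a check could be done. The article does not state which metric the authors used, and the function name and the counts in the example are hypothetical placeholders, not measured results.

```python
from itertools import product

def distance_from_uniform(counts, n_bits):
    """Total variation distance between measured bitstring frequencies
    and the uniform distribution over all 2**n_bits outcomes."""
    shots = sum(counts.values())
    uniform = 1 / 2 ** n_bits
    total = 0.0
    for bits in product("01", repeat=n_bits):
        key = "".join(bits)
        # Bitstrings that were never observed count as frequency 0.
        total += abs(counts.get(key, 0) / shots - uniform)
    return total / 2

if __name__ == "__main__":
    # Placeholder counts for a two-bit measurement, not real device data.
    print(distance_from_uniform({"00": 30, "01": 20, "10": 25, "11": 25}, 2))
```

A value near 0 means the output is nearly indistinguishable from random guessing, while a value near 1 means the shots are concentrated on a few bitstrings. With a finite number of shots, even a perfectly uniform source gives a small non-zero distance, so the result should be read against the shot count.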
The study highlights a critical point: newer hardware does not automatically equate to superior performance. Significant variations were observed between quantum processing units (QPUs) of the same manufacturing revision, and calibration quality emerged as a key determinant of overall device performance. These findings suggest that selecting the optimal quantum computer for a given task requires careful consideration of calibration data, rather than simply choosing the latest model. The research demonstrates that these Grover’s Search instances are well-suited for benchmarking current hardware, and the Möbius Ladder instance, in particular, may prove valuable for evaluating near-future quantum computers.
Lights Out circuits reveal generational improvements and device-to-device variability in quantum processors
Experiments utilising the Lights Out problem on both 2D grids and Möbius ladder graphs revealed performance distinctions between quantum hardware generations. Circuits employing nine and sixteen qubits were implemented on devices from IBM and IQM, demonstrating improvements in IBM hardware between the Heron r1 and Heron r2 generations during the 2023-2024 period.
On IQM devices, the Lights Out circuits yielded output distributions approximating uniformity, prompting supplementary diagnostic tests. These tests involved a small Grover SAT baseline, which showed that the IQM Garnet device exhibited greater reliability compared to other IQM devices tested within the study. Observations indicated that even devices sharing the same manufacturing revision could display considerable performance variations.
For instance, one Heron r2 device occasionally performed worse than its r1 counterpart, while another Heron r2 consistently outperformed both. Calibration quality emerged as a significant factor influencing device performance, suggesting that device selection should prioritise calibration standards. Analysis of IQM devices showed that their architecture resulted in more efficient circuits following transpilation, in contrast to those produced for IBM devices.
The 2D grid and Möbius Ladder instances, despite having similar depths and two-qubit operation counts, differed in qubit numbers, highlighting that performance isn’t solely determined by gate error rates or qubit count. Since the Möbius Ladder instance approached solvability on current hardware, it presents a promising benchmark for near-future quantum computers.
Grover’s Search instances proved suitable for benchmarking current hardware, demonstrating discernible performance differences between available quantum computers. The publicly available code, experimental results, and device calibration data, accessible at https://github.com/infenrio/lights_out_quantum, enable further investigation and reproducibility of these findings.
Lights Out and Grover’s search implementations across IBM and IQM quantum processors
Circuits implementing the Lights Out puzzle and Grover’s search were executed on publicly available quantum hardware from both IBM and IQM. Initially, circuits designed to solve a 2×2 Lights Out grid were prepared and tested on three IBM quantum processing units: ibm_marrakesh, ibm_fez, and ibm_torino. These circuits underwent transpilation, a process where the abstract quantum circuit is translated into the native operations of each specific quantum computer, using Qiskit’s transpiler with optimisation levels of 0, 1, 2, and 3.
A fifth execution mode involved direct circuit implementation via the IBM Quantum Composer interface, allowing for a comparison of results obtained through different methods. The experiments then expanded to three IQM devices (IQM Emerald, IQM Garnet, and IQM Sirius), again employing Qiskit and the same range of optimisation levels during transpilation.
Given the near-uniform output distributions observed on IQM hardware, supplementary baseline Grover SAT circuits and a two-qubit verification circuit were introduced to aid in interpreting the Lights Out results. For IBM devices, the complexity increased with the implementation of a Lights Out circuit representing a 6-lamp grid on a Möbius ladder, a non-planar graph.
Circuit optimisation proved necessary, with multiple iterations undertaken to simplify the design before execution. Once a viable circuit was achieved, it was launched on the three IBM devices using the Sampler method with optimisation level 3, and also with the Estimator method employing resilience levels of 0, 1, and 2. Meticulous records were maintained, documenting calibration data for each device, the specific circuits resulting from transpilation, and the observed outcomes of each experiment. Differences in hardware calibration can affect transpilation choices, so the team also ran 5-day consistency tests on the IBM devices.
Grover’s algorithm performance is linked to device calibration and architecture
Once a purely theoretical exercise, testing quantum algorithms on actual hardware is becoming a matter of practical refinement. Recent work using the Lights Out puzzle and Grover’s search algorithm reveals that progress is being made, but it is uneven and heavily dependent on the specifics of the machinery. Applying this search to both simple grid layouts and more complex Möbius ladder graphs, researchers demonstrated performance gains between successive generations of quantum processors.
These improvements, while welcome, are tempered by the observation that newer hardware does not automatically equate to better results. The choice of device and its calibration appear to have a disproportionate effect on outcomes. Beyond the expected challenges of maintaining quantum coherence, the study highlights a frustrating reality for those working with near-term quantum computers: variability between devices, even those built with similar technology, is considerable.
For instance, one IQM device consistently outperformed others, suggesting that manufacturing consistency remains a significant hurdle. The Lights Out problem, while seemingly trivial, serves as a useful test case because its solution space is well-defined. By comparing results on different platforms, scientists can begin to pinpoint the types of errors that are most problematic for Grover’s algorithm.
This algorithm is a building block for many other quantum applications, so understanding these limitations is vital. The path forward isn’t simply about building bigger quantum computers. Instead, attention must turn to better characterisation of existing hardware and the development of error mitigation techniques tailored to specific devices. Further research could explore how different problem encodings affect performance, or investigate whether hybrid quantum-classical algorithms can compensate for hardware deficiencies. In the end, the value of these benchmark experiments lies not in achieving a quantum speedup for Lights Out, but in building a more detailed understanding of what it will take to make quantum computation a practical reality.
👉 More information
🗞 Benchmarking the Lights Out Problem on Real Quantum Hardware
🧠 ArXiv: https://arxiv.org/abs/2602.16014
