Scientists are increasingly focused on building resilient high-performance computing (HPC) systems capable of tackling complex quantum calculations. Qiang Guan from Kent State University, Qinglei Cao from Saint Louis University, and Xiaoyi Lu from the University of Florida, alongside Siyuan Niu et al., present a novel architectural foundation for checkpointing and restoration in quantum HPC systems. Their research treats checkpointing not as a state storage problem, but as one of control flow and state management, utilising dynamic circuit technology to enable restartable and resilient execution. The approach aligns well with iterative algorithms such as eigensolvers and time-stepping methods, offering a pathway towards more robust and efficient quantum simulations.
Algorithmic state capture enables resilient quantum high performance computing
Researchers have developed a novel checkpointing and restoration framework for quantum high performance computing (HPC) systems, addressing a fundamental limitation in scaling and robustness. Unlike classical HPC, quantum programs cannot be checkpointed through simple memory snapshots due to the no-cloning theorem and the collapse of quantum states upon measurement.
This work redefines checkpointing not as preserving quantum states, but as capturing and restoring algorithmic and control-flow state. The approach leverages the emerging capabilities of dynamic quantum circuits, enabling mid-circuit measurements, classical feedforward, and conditional execution to facilitate resilient quantum computation.
This innovative design allows for the correct restoration of quantum workflows following interruption or failure, aligning particularly well with iterative algorithms commonly used in quantum simulation and scientific computing. By exploiting dynamic circuits, the research captures sufficient program state through structured measurements at defined boundaries, converting quantum information into classical representations.
Restoration is then achieved by controlled re-execution of circuits, guided by the recorded classical state and parameters. The framework supports multiple checkpoint classes, including those that capture measurement outcomes, algorithmic metadata, and even error syndrome histories for fault-tolerant settings.
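The restart behaviour described above can be illustrated with a minimal sketch of a restartable iterative loop. Everything in it is an assumption made for illustration, not the authors' implementation: `LoopCheckpoint`, `run_iteration`, and `run` are hypothetical names, and `run_iteration` is a classical stand-in for building, executing, and measuring a circuit. The seeded random generator only makes the mock reproducible; measurement outcomes on real hardware would not repeat bit for bit on re-execution.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LoopCheckpoint:
    iteration: int                 # next iteration to execute
    params: list                   # classical parameters fed to the circuit
    seed: int                      # base seed so the mock replay is reproducible
    outcomes: list = field(default_factory=list)  # recorded measurement results


def run_iteration(params, rng):
    # Hypothetical stand-in for "build circuit, execute, measure".
    return sum(p * p for p in params) + rng.gauss(0.0, 0.01)


def run(total_iters, ckpt=None, fail_at=None):
    """Run or resume the loop; returns (latest checkpoint, finished flag)."""
    if ckpt is None:
        ckpt = LoopCheckpoint(iteration=0, params=[1.0, -0.5], seed=1234)
    for it in range(ckpt.iteration, total_iters):
        if fail_at is not None and it == fail_at:
            return ckpt, False  # simulated failure: only the checkpoint survives
        rng = random.Random(ckpt.seed + it)
        energy = run_iteration(ckpt.params, rng)
        new_params = [0.8 * p for p in ckpt.params]  # toy classical update
        # The checkpoint is replaced only at the iteration boundary.
        ckpt = LoopCheckpoint(it + 1, new_params, ckpt.seed, ckpt.outcomes + [energy])
    return ckpt, True
```

Calling `run(5, fail_at=3)` and then passing the returned checkpoint back into `run(5, ckpt=...)` resumes from iteration 3 using only the recorded classical state; no quantum state is copied at any point.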
The proposed architecture features a quantum HPC runtime and control layer that orchestrates checkpoint creation, failure detection, and restoration, integrating seamlessly with existing HPC systems. This runtime identifies safe checkpoint boundaries based on algorithmic structure and execution progress, triggering checkpoints at iteration boundaries or circuit layers.
Quantum programs are structured into execution regions separated by these checkpoints, allowing for conditional replay of execution paths using dynamic circuit control. The same mechanism covers several use cases: storing final measurement outcomes before qubits are reset and reused, recording probabilistic measurement branches so that a state-preparation path can be reconstructed, and preserving the measurement results that drive variational algorithms such as the Feedback-based Algorithm for Quantum Optimization (FALQON).
Quantum workflow checkpointing via mid-circuit measurement and runtime orchestration
Dynamic circuit technology underpins a novel checkpointing and restoration methodology for quantum high-performance computing. This research redefines checkpointing not as state preservation, but as a problem of capturing control flow and algorithmic state within quantum workflows. Exploiting mid-circuit measurements, the system converts selected quantum information into classical representations at defined program boundaries, enabling the recording of sufficient program state for correct restoration following interruption or failure.
This approach aligns particularly well with iterative algorithms commonly used in simulation and scientific computing, such as eigensolvers and time-stepping methods. The study introduces a layered architecture integrating a quantum HPC runtime and control layer responsible for orchestrating checkpoint creation, failure detection, and restoration.
This runtime incorporates a checkpoint manager that identifies safe boundaries based on algorithmic structure, execution progress, and system policies, triggering checkpoints at iteration boundaries or circuit layer boundaries. Restoration is achieved by re-instantiating quantum circuits and rehydrating parameters, with dynamic circuit control conditionally replaying execution paths based on recorded classical state.
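How a checkpoint manager might decide when to trigger can be sketched as a small policy check. This is a hedged illustration under stated assumptions: `Policy`, `Progress`, and `should_checkpoint` are hypothetical names, and the "every N iterations or layers" rule is one simple example of a system policy, not the rule used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    every_n_iterations: int = 1   # 0 disables iteration-boundary checkpoints
    every_n_layers: int = 0       # 0 disables layer-boundary checkpoints


@dataclass(frozen=True)
class Progress:
    boundary: str    # "iteration", "layer", or "none" (inside an execution region)
    iteration: int   # completed iterations so far
    layer: int       # completed circuit layers in the current iteration


def should_checkpoint(progress, policy):
    """Return True only at a safe boundary that the policy selects."""
    if progress.boundary == "iteration":
        n = policy.every_n_iterations
        return n > 0 and progress.iteration % n == 0
    if progress.boundary == "layer":
        n = policy.every_n_layers
        return n > 0 and progress.layer % n == 0
    return False  # never checkpoint in the middle of an execution region
```

Keeping the boundary test separate from the policy means the same manager can serve coarse iteration-level checkpoints and finer layer-level ones without touching the quantum program itself.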
This framework supports multiple checkpoint classes, including ‘classicalized’ checkpoints that capture measurement outcomes and metadata like iteration counters, and ‘algorithmic’ checkpoints aligned with natural phase boundaries in iterative quantum algorithms. Furthermore, the design extends to ‘logical’ checkpoints for fault-tolerant settings, incorporating error syndrome histories and decoder state to support logical-level restoration. A comparative analysis, detailed in Table I of the paper, highlights the fundamental differences in checkpointing features between classical HPC and the proposed quantum-HPC system, demonstrating the incompatibility of traditional methods with quantum execution due to the no-cloning theorem and measurement-induced collapse.
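The three checkpoint classes can be pictured as plain classical records. The sketch below is an assumption made for illustration: the class names mirror the article's terminology, but the field names and the `restore_steps` helper are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ClassicalizedCheckpoint:
    measurement_outcomes: dict   # e.g. bitstring -> count
    iteration: int


@dataclass
class AlgorithmicCheckpoint:
    phase: str                   # natural phase boundary of the algorithm
    iteration: int
    parameters: list             # classical parameters to rehydrate
    measurement_outcomes: dict = field(default_factory=dict)


@dataclass
class LogicalCheckpoint:
    syndrome_history: list       # recorded error-syndrome measurements
    decoder_state: dict          # classical decoder bookkeeping
    logical_cycle: int


def restore_steps(ckpt):
    """List the classical actions a runtime would take for each class."""
    if isinstance(ckpt, LogicalCheckpoint):
        return ["reload decoder state", "reload syndrome history",
                f"resume at logical cycle {ckpt.logical_cycle}"]
    if isinstance(ckpt, AlgorithmicCheckpoint):
        return ["re-instantiate circuit", "rehydrate parameters",
                f"resume phase '{ckpt.phase}' at iteration {ckpt.iteration}"]
    if isinstance(ckpt, ClassicalizedCheckpoint):
        return ["reload measurement outcomes",
                f"resume at iteration {ckpt.iteration}"]
    raise TypeError(f"unknown checkpoint type: {type(ckpt).__name__}")
```

None of the records holds a quantum state; each contains only classical data, which is what keeps the scheme consistent with the no-cloning theorem.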
Algorithmic and control-flow state capture via dynamic quantum checkpointing
This research redefines checkpointing for high-performance computing systems by focusing on capturing and restoring algorithmic and control-flow state using dynamic quantum circuits rather than preserving quantum states. The work demonstrates a feasible approach to enable restartable and resilient quantum workflows while maintaining compatibility with existing HPC runtimes and quantum mechanical constraints.
Classicalized checkpoints store final measurement outcomes before qubit reset and reuse, facilitating restart on reduced qubit layouts. In dynamic state preparation, checkpoints align with probabilistic measurement branches, storing outcomes to reconstruct the preparation path. For variational algorithms such as the Feedback-based Algorithm for Quantum Optimisation, algorithmic checkpoints capture measurement results, preserving adaptive ansatz construction.
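For the dynamic state preparation case, the key point is that feedforward decisions are a deterministic function of the recorded outcomes, so the path taken can be rebuilt from classical data alone. The following is a toy sketch under an assumed rule (an ancilla outcome of 1 triggers an X correction on the neighbouring qubit); `preparation_path` is a hypothetical helper, and the rule is illustrative rather than taken from the paper.

```python
def preparation_path(outcomes):
    """Rebuild the sequence of operations implied by recorded outcomes.

    Toy rule (assumption): measuring ancilla k with result 1 triggers
    an X correction on qubit k + 1; result 0 triggers nothing.
    """
    path = []
    for ancilla, bit in enumerate(outcomes):
        if bit not in (0, 1):
            raise ValueError(f"outcome for ancilla {ancilla} must be 0 or 1")
        path.append(("measure", ancilla, bit))
        if bit == 1:
            path.append(("x", ancilla + 1))  # classical feedforward correction
    return path


# A checkpoint only needs to store the outcome list, here [1, 0, 1].
recorded = [1, 0, 1]
path = preparation_path(recorded)
```

Because the function is pure, replaying it after a restart yields the same path every time, which is what allows the runtime to reason about which branch was taken without measuring again.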
Logical checkpoints, designed for fault-tolerant settings, store syndrome measurements and decoder states, enabling restoration of error tracking and decoding status. All checkpoint data are stored and managed within the classical HPC layer, leveraging existing checkpoint storage infrastructure. Stored data encompasses measurement outcomes, variational parameters, iteration counters, control flow decisions, random seeds, and hardware calibration metadata.
From the perspective of the HPC system, these checkpoints are structured metadata objects, allowing storage using conventional parallel file systems or burst buffers. This design preserves compatibility with existing HPC schedulers and resource managers, enabling quantum workloads to participate in standard resilience and preemption mechanisms.
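Because the checkpoints are ordinary structured metadata, persisting them needs nothing quantum-specific. The sketch below writes a checkpoint as JSON with a checksum and an atomic rename, using only the Python standard library. The function names, the JSON layout, and the example field names are assumptions for illustration; the article does not specify a storage format.

```python
import hashlib
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state as JSON with a checksum, replacing path atomically."""
    payload = json.dumps(state, sort_keys=True)
    record = {"sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
              "state": state}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(record, f, sort_keys=True)
        f.flush()
        os.fsync(f.fileno())      # make sure the bytes reach storage
    os.replace(tmp, path)         # readers never see a half-written file


def load_checkpoint(path):
    """Read a checkpoint and verify its checksum before returning it."""
    with open(path, encoding="utf-8") as f:
        record = json.load(f)
    payload = json.dumps(record["state"], sort_keys=True)
    if hashlib.sha256(payload.encode("utf-8")).hexdigest() != record["sha256"]:
        raise ValueError("checkpoint checksum mismatch")
    return record["state"]


# Example state using the kinds of fields the article lists (names are illustrative).
example_state = {
    "iteration": 7,
    "variational_parameters": [0.12, -0.5],
    "measurement_outcomes": {"00": 510, "11": 514},
    "control_flow_decisions": ["branch_a", "retry"],
    "random_seed": 1234,
    "calibration": {"backend": "example-backend"},
}
```

Writing to a temporary file and renaming it is a common way to ensure that a failure during the write leaves the previous checkpoint intact.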
The proposed architecture is feasible in the near term due to advances in dynamic quantum circuit execution and the maturity of classical HPC runtime infrastructure. Dynamic circuit capabilities, including mid-circuit measurement, conditional branching, and classical feedforward, are already supported on emerging quantum hardware platforms and exposed through compiler stacks.
The framework is particularly well-suited to iterative and staged quantum algorithms, such as variational eigensolvers and time-stepping methods, where checkpoint boundaries align with algorithmic phases. While checkpointing introduces measurement overhead and partial loss of coherence, this overhead is predictable, controllable, and amortized over long-running executions.
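The amortisation claim can be made concrete with a simple cost model. The model and its symbols are assumptions introduced here for illustration, not figures from the paper: suppose each of $N$ iterations takes time $t$, and a checkpoint costing time $c$ is taken every $k$ iterations.

```latex
T_{\text{total}} \approx N\,t + \frac{N}{k}\,c,
\qquad
\text{relative overhead} = \frac{T_{\text{total}} - N\,t}{N\,t} \approx \frac{c}{k\,t}.
```

Under this model the overhead is independent of the run length $N$ and shrinks as the checkpoint interval $k$ grows, at the price of more re-executed work after a failure. The model ignores any effect of the extra measurements on result quality.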
Algorithmic state capture enables resilient quantum computation
Checkpointing and restoration for quantum high-performance computing are redefined through a focus on capturing algorithmic and control-flow state using dynamic quantum circuits rather than preserving quantum states directly. This approach leverages mid-circuit measurements, classical feedforward, and conditional execution to enable restartable and resilient quantum workflows.
The design is compatible with existing high-performance computing runtimes and quantum mechanical constraints, offering a viable path towards fault tolerance. This architecture particularly benefits iterative and staged quantum algorithms, aligning checkpoint boundaries with algorithmic phases to provide predictable overheads.
Initial implementations can utilise classicalized and algorithmic checkpoints on near-term hardware, with the potential for later expansion to include logical checkpoints as fault-tolerant systems develop. This staged feasibility allows for meaningful validation and performance characterisation at each stage of hardware advancement.
The authors acknowledge limitations inherent in current quantum hardware, but demonstrate a practical near-term solution with a clear progression towards logical-level checkpointing in future systems. Further research may focus on optimising the performance of these dynamic circuits and exploring their application to a wider range of quantum algorithms.
👉 More information
🗞 Architectural Foundations for Checkpointing and Restoration in Quantum HPC Systems
🧠 ArXiv: https://arxiv.org/abs/2602.09325
