Flamenco System Enables Low-Latency Multiprogramming Quantum Computing for Scalable Devices

Running multiple quantum programs simultaneously and efficiently is becoming increasingly critical as quantum computers grow in complexity. Yilun Zhao, Yu Chen, and Kaiyan Chang, from the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences, alongside colleagues Li, Li, and Han, have addressed this issue with a novel system architecture. Their research introduces FLAMENCO, a compilation system designed to enable low-latency multiprogramming without time-consuming online compilation. Each quantum program is compiled independently and offline into versions bound to specific qubit regions, and the system selects among those versions dynamically at runtime. By streamlining execution and mitigating interference between qubits, FLAMENCO improves both speed and reliability, paving the way for practical applications such as repeatedly invoked Quantum Neural Network (QNN) services.

The research focuses on the significant delays currently experienced when switching between quantum programs, a critical bottleneck for practical quantum computation. The proposed architecture integrates a novel quantum control plane with a hardware-aware scheduler, aiming to reduce context-switching overhead to under 100 microseconds. It pursues this target through optimised qubit mapping, pre-compilation of frequently used quantum kernels, and a dedicated hardware interface for rapid control signal generation. The system employs a multi-level scheduling strategy that prioritises tasks by urgency and resource requirements and dynamically allocates qubits to minimise data transfer times.

A key component is the ‘quantum virtual machine’, which abstracts the underlying hardware complexities and provides a consistent programming interface. Simulations indicate that this architecture can support concurrent execution of multiple quantum programs with a latency comparable to single-program execution, a substantial improvement over existing approaches. Performance evaluations point to a potential speedup of up to 3x for multiprogrammed workloads. The authors also detail the design of a custom hardware accelerator, incorporating field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs), for fast control signal generation and qubit readout. This hardware co-design is crucial for achieving the targeted latency reduction and scalability. The architecture additionally incorporates error mitigation techniques to improve the reliability of multiprogrammed execution, addressing the inherent sensitivity of quantum systems to noise. Through modelling and simulation, the researchers demonstrate the feasibility and potential benefits of their proposed system.

Fidelity-Aware Compilation for Multiprogramming Quantum Computers

To address the challenges of scaling multiprogramming quantum computing (MPQC), the researchers developed FLAMENCO, a fidelity-aware multi-version compilation system designed to enable independent offline compilation and low-latency execution of multiple quantum programs. The research tackles the bottleneck of online compilation, which currently dominates runtime and hinders real-world applications such as repeatedly invoked Quantum Neural Network (QNN) services. FLAMENCO abstracts quantum devices into compute units, which significantly reduces the complexity of region allocation and streamlines resource management. The core of FLAMENCO is its multi-version compilation strategy: the system transforms each quantum program into a diverse set of executables, each tailored to a specific qubit region.
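The multi-version idea can be illustrated with a small Python sketch. It is not the authors' compiler: the device layout (`COMPUTE_UNITS`), the per-qubit fidelities (`QUBIT_FIDELITY`), the contiguous-window choice of regions, and the product-of-fidelities estimate are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical device: 8 qubits grouped into four 2-qubit compute units.
COMPUTE_UNITS: Dict[str, FrozenSet[int]] = {
    "U0": frozenset({0, 1}),
    "U1": frozenset({2, 3}),
    "U2": frozenset({4, 5}),
    "U3": frozenset({6, 7}),
}

# Hypothetical per-qubit gate fidelities (illustrative numbers, not measured data).
QUBIT_FIDELITY: Dict[int, float] = {
    0: 0.999, 1: 0.998, 2: 0.995, 3: 0.997,
    4: 0.999, 5: 0.996, 6: 0.990, 7: 0.998,
}


@dataclass(frozen=True)
class Executable:
    program: str
    units: Tuple[str, ...]   # compute units this version is bound to
    est_fidelity: float      # offline fidelity estimate for this version


def compile_versions(program: str, units_needed: int,
                     gates_per_qubit: int) -> List[Executable]:
    """Produce one executable per candidate region, best estimate first.

    Assumption: regions are contiguous windows over the sorted unit names,
    and fidelity is a simple product of per-qubit gate fidelities.
    """
    names = sorted(COMPUTE_UNITS)
    versions: List[Executable] = []
    for start in range(len(names) - units_needed + 1):
        region = tuple(names[start:start + units_needed])
        qubits = set().union(*(COMPUTE_UNITS[u] for u in region))
        fidelity = 1.0
        for q in qubits:
            fidelity *= QUBIT_FIDELITY[q] ** gates_per_qubit
        versions.append(Executable(program, region, fidelity))
    return sorted(versions, key=lambda v: v.est_fidelity, reverse=True)


if __name__ == "__main__":
    for version in compile_versions("qnn", units_needed=2, gates_per_qubit=10):
        print(version.units, round(version.est_fidelity, 4))
```

Because every version is produced ahead of time, the runtime only has to pick one of them rather than recompile, which is the property the article attributes to FLAMENCO.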

This technique circumvents the limitations of non-portable quantum executables by allowing qubit resources to be selected dynamically at runtime. Crucially, the compiler evaluates the fidelity of each executable and passes those metrics to the runtime, which uses them to steer orchestration towards high-fidelity results and so improve the reliability of co-executed programs. At runtime, a streamlined orchestrator leverages these post-compilation fidelity metrics to avoid conflicts and minimise crosstalk between concurrently running programs. The team designed a heuristic method for fidelity-aware orchestration that mitigates fidelity loss and outperforms existing online compilation schemes.
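The article does not spell out the heuristic, so the following Python sketch shows only one plausible shape for fidelity-aware selection: a greedy pass that gives each program its highest-fidelity precompiled version whose compute units are still free. The function name `orchestrate`, the ordering rule (fewest options first), and the sample numbers are assumptions, and crosstalk between neighbouring regions is deliberately left out.

```python
from typing import Dict, List, Optional, Set, Tuple

# A precompiled version: (compute units it is bound to, estimated fidelity).
Version = Tuple[Tuple[str, ...], float]


def orchestrate(requests: Dict[str, List[Version]]) -> Dict[str, Optional[Version]]:
    """Greedy, conflict-free selection of one version per program.

    Assumed heuristic: programs with the fewest versions are placed first,
    and each takes its best-fidelity version that does not overlap units
    already in use. A program that cannot be placed maps to None.
    """
    busy: Set[str] = set()
    placement: Dict[str, Optional[Version]] = {}
    for program in sorted(requests, key=lambda p: len(requests[p])):
        choice: Optional[Version] = None
        for units, fidelity in sorted(requests[program],
                                      key=lambda v: v[1], reverse=True):
            if busy.isdisjoint(units):
                choice = (units, fidelity)
                break
        if choice is not None:
            busy.update(choice[0])
        placement[program] = choice
    return placement


if __name__ == "__main__":
    demo = {
        "A": [(("U0", "U1"), 0.90), (("U1", "U2"), 0.85), (("U2", "U3"), 0.80)],
        "B": [(("U0",), 0.95), (("U3",), 0.93)],
    }
    print(orchestrate(demo))
```

In the demo, program B has fewer options and is placed first on U0, so program A falls back from its best version on (U0, U1) to its second-best version on (U1, U2). The selection is a lookup over precompiled versions, with no compilation step at runtime.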

Comprehensive evaluations, using both noisy simulators and real-world quantum machines, demonstrated a runtime speedup exceeding 5x and improvements in execution fidelity of over 10%. Further investigation involved ablation studies to validate the effectiveness of the fidelity-aware strategy, alongside scalability evaluations under varying orchestration strategies. Scientists also performed sensitivity analyses, examining the impact of compute unit size and assessing system robustness to hardware parameter variations and crosstalk, revealing insights for enhancing scalability and ensuring the reliable operation of FLAMENCO in complex quantum computing environments.

FLAMENCO Boosts Quantum Multiprogramming Performance Fivefold

Scientists have achieved a breakthrough in multiprogramming quantum computing with the development of FLAMENCO, a fidelity-aware multi-version compilation system. This work addresses the critical need for improved device utilization and throughput as quantum systems grow in complexity. FLAMENCO abstracts quantum devices into compute units, significantly reducing the complexity of region allocation and enabling independent, offline compilation of programs, resulting in a greater than 5x runtime speedup compared to state-of-the-art MPQC baselines. FLAMENCO generates diverse executable versions for each program, each tailored to a specific qubit region, allowing for dynamic region selection during runtime and overcoming the limitations of non-portable quantum executables.

At runtime, a streamlined orchestrator leverages post-compilation fidelity metrics to proactively avoid conflicts and mitigate crosstalk between programs. This achieves reliable co-execution without computationally expensive online co-optimization and yields a fidelity improvement exceeding 10%. The research team systematically evaluated FLAMENCO’s performance using both noisy simulators and real-world quantum machines, focusing on latency, fidelity, and utilization metrics. Measurements confirm that the system maintains high utilization even as concurrency increases, paving the way for more efficient use of quantum resources. Ablation studies validated the effectiveness of the fidelity-aware strategy, while sensitivity analyses explored the impact of compute unit size and hardware parameter variations. Further investigation into system robustness revealed FLAMENCO’s resilience to crosstalk, a significant source of error in quantum systems. The result is an architecture for future low-latency MPQC that supports the repeated invocation of Quantum Neural Network (QNN) services and other practical, real-world workloads, a significant step towards scalable and efficient quantum computing.

FLAMENCO unlocks fast, fidelity-aware quantum co-execution

FLAMENCO represents a significant advance in multiprogramming quantum computing by decoupling compilation from execution. This system achieves low-latency, high-fidelity co-execution through offline, fidelity-aware compilation that generates multiple executable versions for each program, bound to distinct qubit regions. By abstracting the device into compute units, FLAMENCO drastically reduces the complexity of region allocation and enables dynamic selection at runtime, overcoming limitations of quantum program portability. Evaluations demonstrate that FLAMENCO eliminates the overhead associated with online compilation, resulting in over a fivefold speedup in runtime and improved execution fidelity, even as concurrency increases. The research also highlights a trade-off between utilization and fidelity, demonstrating that prioritizing high utilization can negatively impact performance due to inter-program crosstalk, a factor FLAMENCO actively mitigates. Future work could explore integrating FLAMENCO’s orchestration strategy with existing crosstalk metrics to further refine executable selection and enhance performance.
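The trade-off between utilization and fidelity can be made concrete with a toy Python model. It is a sketch under stated assumptions, not the paper's crosstalk model: compute units sit on a hypothetical linear chain (`UNIT_ORDER`), and every boundary between units owned by different programs multiplies each affected program's fidelity by an assumed constant (`CROSSTALK_PENALTY`).

```python
from typing import Dict, List, Tuple

UNIT_ORDER = ["U0", "U1", "U2", "U3"]  # hypothetical linear chain of compute units
CROSSTALK_PENALTY = 0.97               # assumed loss per adjacent foreign unit


def evaluate(placement: Dict[str, List[str]],
             base_fidelity: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return (utilization, per-program fidelity after the toy crosstalk penalty)."""
    owner = {unit: prog for prog, units in placement.items() for unit in units}
    utilization = len(owner) / len(UNIT_ORDER)
    fidelities: Dict[str, float] = {}
    for prog, units in placement.items():
        fidelity = base_fidelity[prog]
        for unit in units:
            i = UNIT_ORDER.index(unit)
            for j in (i - 1, i + 1):
                if 0 <= j < len(UNIT_ORDER):
                    neighbour = owner.get(UNIT_ORDER[j])
                    # Penalize only neighbours that belong to a different program.
                    if neighbour is not None and neighbour != prog:
                        fidelity *= CROSSTALK_PENALTY
        fidelities[prog] = fidelity
    return utilization, fidelities


if __name__ == "__main__":
    base = {"A": 0.90, "B": 0.90}
    dense = {"A": ["U0", "U1"], "B": ["U2", "U3"]}  # every unit in use
    sparse = {"A": ["U0", "U1"], "B": ["U3"]}       # U2 left idle as a buffer
    print("dense ", evaluate(dense, base))
    print("sparse", evaluate(sparse, base))
```

In this toy setting the dense placement uses every unit but both programs pay the penalty at the U1/U2 boundary, while the sparse placement leaves one unit idle and keeps both programs at their base fidelity.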

👉 More information
🗞 A System Architecture for Low Latency Multiprogramming Quantum Computing
🧠 ArXiv: https://arxiv.org/abs/2601.01158

Rohail T.