CUDA Quantum Unveils Boosted Quantum Supercomputing Capabilities, Promises 4x Speedup


CUDA Quantum, an open-source programming model, has introduced new capabilities for quantum-accelerated supercomputing. The model allows quantum computing workloads to run on heterogeneous computing architectures such as quantum processing units (QPUs), GPUs, and CPUs. A new feature enables programming multi-QPU platforms alongside multiple GPUs. CUDA Quantum uses the Message Passing Interface (MPI), a communication protocol for parallel programming, to distribute and accelerate these workloads. NVIDIA’s Hopper H100 GPUs, which offer 80 GB of memory each, are used for exact state vector simulation, and the software also supports parallelization across multiple QPUs, significantly reducing runtime.

CUDA Quantum: Enhancing Quantum Accelerated Supercomputing

CUDA Quantum, an open-source programming model, is designed for building quantum-classical applications. These applications are expected to run on heterogeneous computing architectures such as quantum processing units (QPUs), GPUs, and CPUs working in tandem to solve real-world problems. CUDA Quantum provides the tools to program these architectures together and to accelerate such applications.
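
To make the programming model concrete, here is a minimal sketch of a CUDA Quantum program in Python. The GHZ-style kernel and shot count are illustrative rather than taken from the article; the same code runs on a CPU simulator, a GPU-accelerated simulator, or QPU hardware depending on the selected target.

```python
# Minimal CUDA Quantum sketch (Python): build a small kernel and sample it.
import cudaq

kernel = cudaq.make_kernel()
qubits = kernel.qalloc(3)

# Prepare a 3-qubit GHZ-style state: H on the first qubit, then a CNOT chain.
kernel.h(qubits[0])
kernel.cx(qubits[0], qubits[1])
kernel.cx(qubits[1], qubits[2])
kernel.mz(qubits)

# Executes on whichever target is currently selected (simulator or QPU).
counts = cudaq.sample(kernel, shots_count=1000)
counts.dump()
```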

Multiple GPUs mode is one endpoint with extended memory for circuit simulation

Scaling Quantum Applications with Multiple QPUs and GPUs

The ability to target multiple QPUs and GPUs is crucial for scaling quantum applications. Distributing workloads over multiple compute endpoints can achieve significant speedups when these workloads can be parallelized. A new feature in CUDA Quantum enables programming multi-QPU platforms and multiple GPUs seamlessly.
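
As a hedged sketch, selecting the multi-QPU platform and checking how many virtual QPUs it exposes looks roughly like this; the target name comes from the release, while the Python calls follow the cudaq API and should be treated as illustrative.

```python
# Select the multi-QPU platform; with GPU-backed simulation, each virtual QPU
# typically maps to one available GPU.
import cudaq

cudaq.set_target("nvidia-mqpu")
target = cudaq.get_target()
print("Virtual QPUs available:", target.num_qpus())
```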

Much of the acceleration in CUDA Quantum is achieved using the Message Passing Interface (MPI), a communication protocol used for parallel programming. It is particularly useful for solving problems that require large amounts of computation, such as weather forecasting and fluid and molecular dynamics simulations. Now, CUDA Quantum can be integrated with any MPI implementation using an MPI plugin, allowing customers to easily use CUDA Quantum with the MPI setup they already have.
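
A minimal sketch of what an MPI-aware CUDA Quantum script can look like, assuming the cudaq.mpi helper module and a standard mpiexec launcher (both hedged; adapt the launch line to your MPI setup):

```python
# Hedged sketch: cudaq.mpi wraps whichever MPI implementation CUDA Quantum is
# configured with, so ranks can coordinate without extra dependencies.
# Launch with, for example:  mpiexec -np 4 python3 program.py
import cudaq

cudaq.mpi.initialize()
print(f"Rank {cudaq.mpi.rank()} of {cudaq.mpi.num_ranks()} is up")

# ... build kernels and split the workload across ranks here ...

cudaq.mpi.finalize()
```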

By combining the multiple GPU and multiple QPU approaches, the problem can be run in parallel to achieve a 2x speedup, while each endpoint has an extended memory of 160 GB

Circuit Simulation Scaling with Multi-GPUs

Circuit simulation of n qubits is limited by the memory required to store the 2^n-dimensional state vector. NVIDIA Hopper H100 GPUs offer 80 GB of memory each, which the CUDA Quantum targets use to perform exact state vector simulation beyond the limits of what is feasible on current QPU hardware.
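
A quick back-of-the-envelope calculation shows why the 80 GB per GPU becomes the bottleneck. It assumes double-precision complex amplitudes at 16 bytes each; single precision halves these numbers.

```python
# State-vector memory grows as 2^n amplitudes; assume 16 bytes per amplitude
# (double-precision complex). Single precision would halve these figures.
def state_vector_gib(num_qubits: int, bytes_per_amplitude: int = 16) -> float:
    return (2 ** num_qubits) * bytes_per_amplitude / 2 ** 30

for n in (30, 32, 33, 34):
    print(f"{n} qubits -> {state_vector_gib(n):8,.0f} GiB")

# 32 qubits already needs ~64 GiB, and 33 qubits (~128 GiB) no longer fits on
# a single 80 GB H100 -- the point where pooling GPU memory becomes necessary.
```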

The nvidia-mgpu target pools the memory of multiple GPUs in a node and multiple nodes in a cluster to enable scaling and remove the single GPU memory bottleneck. This software tool in CUDA Quantum works the same for a single node as it does for tens or even thousands of nodes. Qubit counts are only limited by the GPU resources available.
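
A hedged sketch of what that looks like in practice: the nvidia-mgpu target name is from the release, while the circuit, qubit count, and launch command are illustrative assumptions. The same script is launched with one MPI rank per GPU and otherwise left unchanged as it scales out.

```python
# Hedged sketch: nvidia-mgpu pools GPU memory across MPI ranks, so a state
# vector too large for one 80 GB GPU can still be simulated exactly.
# Launch with one rank per GPU, e.g.:  mpiexec -np 8 python3 large_sim.py
import cudaq

cudaq.set_target("nvidia-mgpu")

kernel = cudaq.make_kernel()
qubits = kernel.qalloc(34)  # illustrative: far beyond a single 80 GB GPU
kernel.h(qubits[0])
for i in range(33):
    kernel.cx(qubits[i], qubits[i + 1])
kernel.mz(qubits)

counts = cudaq.sample(kernel)
counts.dump()
```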

Parallelization with Multi-QPUs

The multi-QPU mode enables programming future workflows in which parallelization can reduce runtime by a factor of the available compute resources. For example, in a circuit-cutting protocol a single cut requires running multiple subcircuits whose results are stitched back together in post-processing; these subcircuits can be executed in parallel, drastically reducing runtime.
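
A hedged sketch of that parallel pattern: the subcircuit kernels below are placeholders for real circuit fragments, and the asynchronous sampling calls assume the cudaq Python API under the nvidia-mqpu target. Each fragment is submitted to its own virtual QPU and the results are gathered for classical post-processing.

```python
# Hedged sketch: run independent subcircuits (placeholders for circuit-cutting
# fragments) asynchronously, one per virtual QPU, then collect the results.
import cudaq

cudaq.set_target("nvidia-mqpu")

def make_subcircuit(num_qubits: int):
    kernel = cudaq.make_kernel()
    qubits = kernel.qalloc(num_qubits)
    kernel.h(qubits[0])
    for i in range(num_qubits - 1):
        kernel.cx(qubits[i], qubits[i + 1])
    kernel.mz(qubits)
    return kernel

subcircuits = [make_subcircuit(4) for _ in range(4)]

# One asynchronous job per virtual QPU; the fragment results would then be
# stitched back together in classical post-processing.
futures = [cudaq.sample_async(k, qpu_id=i) for i, k in enumerate(subcircuits)]
results = [f.get() for f in futures]
```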

Another common workflow that’s “embarrassingly parallel” is the computation of the expectation value of a Hamiltonian with many terms. With multi-QPU mode, multiple endpoints can be defined, where each endpoint can simulate an independent part of the problem.
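
A hedged sketch of this workflow: the Hamiltonian, ansatz, and parameter value are illustrative, and the execution flag follows the cudaq Python API for splitting the term-wise work across the available virtual QPUs.

```python
# Hedged sketch: distribute the term-wise expectation-value work of a
# multi-term Hamiltonian across virtual QPUs via the observe execution flag.
import cudaq
from cudaq import spin

cudaq.set_target("nvidia-mqpu")

# Small illustrative Hamiltonian with several non-commuting terms.
hamiltonian = (5.907 - 2.1433 * spin.x(0) * spin.x(1)
               - 2.1433 * spin.y(0) * spin.y(1)
               + 0.21829 * spin.z(0) - 6.125 * spin.z(1))

# Simple two-qubit ansatz with one variational parameter.
kernel, theta = cudaq.make_kernel(float)
qubits = kernel.qalloc(2)
kernel.x(qubits[0])
kernel.ry(theta, qubits[1])
kernel.cx(qubits[1], qubits[0])

result = cudaq.observe(kernel, hamiltonian, 0.59,
                       execution=cudaq.parallel.thread)
print("Energy:", result.expectation())
```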

Combining Multi-QPU and Multi-GPU Workloads

With CUDA Quantum 0.6, it is now possible to combine the scale of quantum circuit simulation from the nvidia-mgpu target with the parallelization of the nvidia-mqpu target, enabling large-scale simulations to run in parallel.

Developers can now experiment to find the balance between the problem size and the number of parallel endpoints, utilizing GPUs to their maximum potential.

MPI Plugin Integration

CUDA Quantum uses MPI (Message Passing Interface) and is built with Open MPI, an open-source implementation of the MPI protocol. Now, it’s easier than ever to integrate CUDA Quantum with any MPI implementation through an MPI plugin interface. A one-time activation script is needed to create a dynamic library for the implementation. CUDA Quantum 0.6 includes plugin implementations that are compatible with Open MPI and MPICH. Contributions for additional MPI implementations are welcome and should be easy to add.
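
To illustrate, here is a hedged sketch of the earlier term-distribution workflow spanning MPI ranks (and therefore nodes) instead of threads within a single process; the execution flag and cudaq.mpi calls follow the documented Python API, but treat the details as assumptions.

```python
# Hedged sketch: with an MPI-enabled setup, Hamiltonian-term distribution can
# span ranks across nodes rather than threads in a single process.
# Launch with, for example:  mpiexec -np 2 python3 observe_mpi.py
import cudaq
from cudaq import spin

cudaq.set_target("nvidia-mqpu")
cudaq.mpi.initialize()

hamiltonian = 5.907 + 0.21829 * spin.z(0) - 6.125 * spin.z(1)

kernel = cudaq.make_kernel()
qubits = kernel.qalloc(2)
kernel.x(qubits[0])

result = cudaq.observe(kernel, hamiltonian, execution=cudaq.parallel.mpi)
if cudaq.mpi.rank() == 0:
    print("Energy:", result.expectation())

cudaq.mpi.finalize()
```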

To learn more about CUDA Quantum, visit NVIDIA/cuda-quantum to view the complete CUDA Quantum 0.6 release log. The CUDA Quantum Getting Started guide walks you through the setup steps with Python and C++ examples. For advanced use cases for quantum-classical applications, see the CUDA Quantum Tutorials. Finally, explore the code, report issues, and make feature suggestions in the CUDA Quantum open-source repository.