NVIDIA’s CUDA-Q, an open-source programming model for quantum-accelerated supercomputing applications, has introduced new features that significantly improve simulation performance, letting users push the limits of what can be simulated on classical supercomputers. Performance was tested on 24- and 28-qubit Variational Quantum Eigensolver (VQE) problems. On the 24-qubit problem, version 0.7.1 of the software development kit (SDK) is up to 7 times faster than the previous version.
NVIDIA CUDA-Q: Enhancing Quantum Application Performance
NVIDIA CUDA-Q, previously known as NVIDIA CUDA Quantum, is an open-source programming model designed to facilitate the development of quantum-accelerated supercomputing applications. These applications leverage the computational capabilities of CPUs, GPUs, and Quantum Processing Units (QPUs). However, creating such applications is a complex task that requires a user-friendly coding environment and robust quantum simulation capabilities to efficiently evaluate and improve the performance of new algorithms.
CUDA-Q has introduced several new features that significantly enhance performance, allowing users to push the boundaries of what can be simulated on classical supercomputers. This article examines those performance gains for quantum simulation and briefly explains the changes behind them.
Performance Improvement in CUDA-Q
The primary quantum task in a Variational Quantum Eigensolver (VQE) application is computing expectation values. CUDA-Q simplifies this process with the observe function. The three most recent CUDA-Q releases were benchmarked on 24- and 28-qubit VQE problems that determine the ground-state energy of two small molecules (C2H2 and C2H4). The experiments used the standard UCCSD ansatz and were written in Python.
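To make "computing expectation values" concrete, here is a minimal pure-Python sketch of the quantity a VQE loop evaluates repeatedly. It does not use the CUDA-Q API. The one-qubit ansatz, the toy Hamiltonian 0.5 Z + 0.3 X, and the names `ansatz_state`, `expectation`, `energy`, and `scan_minimum` are assumptions chosen for illustration, far smaller than the 24- and 28-qubit UCCSD problems benchmarked here.

```python
import math

# Pauli matrices as nested lists (row-major).
PAULI = {
    "X": [[0, 1], [1, 0]],
    "Z": [[1, 0], [0, -1]],
}

# Hypothetical toy Hamiltonian: H = 0.5 * Z + 0.3 * X.
TOY_HAMILTONIAN = [("Z", 0.5), ("X", 0.3)]


def ansatz_state(theta):
    """One-parameter trial state Ry(theta)|0> = [cos(theta/2), sin(theta/2)]."""
    return [math.cos(theta / 2), math.sin(theta / 2)]


def expectation(state, matrix):
    """Return <state|matrix|state> for a real-valued state vector."""
    applied = [sum(matrix[r][c] * state[c] for c in range(2)) for r in range(2)]
    return sum(state[r] * applied[r] for r in range(2))


def energy(theta, hamiltonian):
    """Sum the weighted expectation value of every Pauli term."""
    state = ansatz_state(theta)
    return sum(coeff * expectation(state, PAULI[label])
               for label, coeff in hamiltonian)


def scan_minimum(hamiltonian, steps=360):
    """Coarse grid search standing in for a classical optimizer."""
    thetas = (2 * math.pi * k / steps for k in range(steps))
    return min(energy(theta, hamiltonian) for theta in thetas)
```

Each call to `energy` plays the role of one expectation-value evaluation. A real VQE run makes many such calls while an optimizer updates the parameters, which is why per-call overhead matters for total runtime.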
Three state vector simulator backends were tested for each version (v0.6, v0.7, v0.7.1): nvidia (single precision), nvidia-fp64 (double precision), and nvidia-mgpu (nvidia-fp64 with gate fusion). Gate fusion is an optimization technique where consecutive quantum gates are combined or merged into a single gate to reduce the overall computational cost and improve circuit efficiency. The number of gates combined (gate fusion level) can significantly affect simulation performance and needs to be optimized for every application.
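The idea behind gate fusion can be sketched in a few lines of pure Python. This is an illustration of the general technique on a single qubit, not the CUDA-Q implementation. The helper names (`matmul`, `apply`, `fuse`, `run_sequential`) are hypothetical.

```python
import math
from functools import reduce

INV_SQRT2 = 1 / math.sqrt(2)
H = [[INV_SQRT2, INV_SQRT2], [INV_SQRT2, -INV_SQRT2]]  # Hadamard
Z = [[1, 0], [0, -1]]                                    # Pauli-Z
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(gate, state):
    """Apply a 2x2 gate to a single-qubit state vector."""
    return [sum(gate[i][k] * state[k] for k in range(2)) for i in range(2)]


def fuse(gates):
    """Combine gates, given in application order, into one matrix."""
    # Applying g1 and then g2 corresponds to the matrix product g2 * g1.
    return reduce(lambda acc, gate: matmul(gate, acc), gates, IDENTITY)


def run_sequential(gates, state):
    """Apply each gate in turn: one pass over the state per gate."""
    for gate in gates:
        state = apply(gate, state)
    return state
```

With `gates = [H, Z, H]`, `apply(fuse(gates), state)` gives the same result as `run_sequential(gates, state)` but touches the state vector once instead of three times. On a large state vector, each pass over the state is the expensive part, while fusing more gates produces a larger fused matrix that costs more to build and apply, which is the trade-off the fusion level controls.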
Performance Comparison of Different CUDA-Q Versions
The three releases of the NVIDIA CUDA-Q software development kit (SDK) were compared on the same problems. The SDK is used to develop and run quantum computing applications on NVIDIA GPUs. The latest release, v0.7.1, is significantly faster than its predecessors: up to 7 times faster than the previous version on the 24-qubit problem, and up to 4.7 times faster on the 28-qubit problem. The gains come from improvements to the compiler, the runtime system, and the libraries.
Accelerating Code with CUDA-Q
CUDA-Q v0.7 includes several enhancements that improve compilation and reduce the time needed to make successive observe calls. First, the just-in-time (JIT) compilation path now compiles the kernel more efficiently. Previously, this step scaled quadratically with the number of gates in the circuit; it now scales linearly.
Second, improvements to the hashing for JIT change-detection checks reduce the time required to check if any code needs to be recompiled due to environment changes. This virtually eliminates the time required for these checks for each observe call.
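Hash-based change detection in general can be sketched as follows. This is a generic illustration of the technique, not CUDA-Q's internal mechanism. The `CompileCache` class, its `get` method, and the idea of fingerprinting a source string together with an environment dictionary are all assumptions made for the example.

```python
import hashlib


class CompileCache:
    """Recompile only when the source or the environment actually changes."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn  # stand-in for an expensive JIT step
        self._cache = {}
        self.compile_count = 0

    @staticmethod
    def _fingerprint(source, environment):
        """Hash the source plus a sorted view of the environment settings."""
        digest = hashlib.sha256()
        digest.update(source.encode("utf-8"))
        for key in sorted(environment):
            digest.update(f"\0{key}={environment[key]}".encode("utf-8"))
        return digest.hexdigest()

    def get(self, source, environment):
        """Return the cached artifact, compiling only on a fingerprint miss."""
        key = self._fingerprint(source, environment)
        if key not in self._cache:
            self._cache[key] = self._compile_fn(source)
            self.compile_count += 1
        return self._cache[key]
```

Repeated calls with an unchanged source and environment cost one hash computation and a dictionary lookup, so the check stays cheap even when it runs on every call.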
Finally, v0.6 performed all log processing on every call, regardless of the specified log level. Starting with v0.7, only the processing required for the specified log level is performed.
In addition to gate fusion, v0.7.1 introduced automatic Hamiltonian batching, which further reduces the runtime of observe calls by enabling batched Hamiltonian evaluations on a single GPU.
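The log-level change described above follows a common pattern: skip the work of building a message unless the message will actually be emitted. The sketch below shows that pattern with Python's standard logging module. It is an analogy rather than CUDA-Q's logging code, and `ExpensiveSummary` and `run_step` are hypothetical names.

```python
import logging

logger = logging.getLogger("simulation_sketch")


class ExpensiveSummary:
    """Defers building a log message until it is actually formatted."""

    def __init__(self, gates):
        self._gates = gates
        self.render_count = 0  # how many times the message was built

    def __str__(self):
        self.render_count += 1
        return ", ".join(self._gates)


def run_step(gates, summary):
    """Pretend to run one simulation step and log it at DEBUG level."""
    # Lazy %-style arguments: str(summary) is evaluated only if a DEBUG
    # record is created and formatted by a handler.
    logger.debug("applying gates: %s", summary)
    return len(gates)
```

With the logger set to INFO, `run_step` never calls `__str__`, so the cost of building the message is avoided entirely. Formatting the message eagerly, for example with an f-string, would pay that cost on every call regardless of level.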
Future Enhancements and Getting Started with CUDA-Q
Future releases of CUDA-Q will include further enhancements to state preparation, handling of Pauli operators, and unitary synthesis. The current and anticipated improvements give developers a more performant platform for building quantum-accelerated supercomputing applications. Development is faster today, and applications built on CUDA-Q are positioned to deploy in the hybrid CPU, GPU, and QPU environments necessary for practical quantum computing.
The CUDA-Q Quick Start guide will help you to quickly set up your environment, while the Basics section will guide you through writing your first CUDA-Q application. Explore the code examples and applications to get inspiration for your own quantum application development. To provide feedback and suggestions, visit the NVIDIA/cuda-quantum GitHub repo.
