Accurate and efficient modelling of coastal ocean circulation, particularly the prediction of storm surge, demands increasingly powerful computational resources. Chayanon Wichitrnithed, Eirik Valseth, and Clint Dawson, from institutions including The University of Texas at Austin and The Norwegian University of Life Sciences, address this challenge by presenting a significantly accelerated version of the Discontinuous Galerkin Shallow Water Equations solver, known as DG-SWEM. The team successfully ports this solver to modern Graphics Processing Units (GPUs) using both CUDA Fortran and OpenACC, unlocking substantial data parallelism inherent in the method. This work demonstrates a considerable performance boost when running realistic simulations, and importantly, explores strategies for maintaining code clarity and ease of future development within a single codebase, paving the way for more detailed and timely predictions of coastal flooding events.
GPU Acceleration of Compound Flood Simulations
Researchers evaluated and optimized a Discontinuous Galerkin (DG) finite element model on NVIDIA GPUs to simulate compound flooding, the interaction between storm surge and riverine flow. The DG formulation is well suited to parallelization and to complex geometries, which makes it a natural fit for this problem. The team tuned the code for the NVIDIA Grace Hopper (GH) architecture and presented performance results for scenarios including the Neches River and Hurricane Harvey. Hierarchical roofline analysis identified the key performance bottlenecks and pointed to substantial speedups over traditional CPU-based simulations.
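To make the roofline idea concrete, here is a minimal sketch of how a hierarchical roofline bound can be computed. Every number in it (the kernel's FLOP count, the bytes moved at each memory level, the peak rate and the bandwidths) is a hypothetical placeholder, not a measurement from the paper or a specification of the Grace Hopper chip.

```python
def roofline_bounds(flops, bytes_per_level, peak_gflops, bandwidth_gbs):
    """Return {level: (arithmetic intensity, attainable GFLOP/s)}.

    For each memory level the attainable rate is
    min(compute roof, arithmetic intensity * bandwidth of that level).
    """
    bounds = {}
    for level, moved in bytes_per_level.items():
        intensity = flops / moved  # FLOP per byte at this level
        bounds[level] = (intensity, min(peak_gflops, intensity * bandwidth_gbs[level]))
    return bounds


# Hypothetical hardware roofs (GFLOP/s and GB/s), chosen only for illustration.
PEAK_GFLOPS = 30000.0
BANDWIDTH_GBS = {"L1": 20000.0, "L2": 8000.0, "HBM": 3000.0}

# Hypothetical kernel profile: total FLOPs and bytes moved at each level.
KERNEL_FLOPS = 2.0e9
KERNEL_BYTES = {"L1": 32.0e9, "L2": 16.0e9, "HBM": 8.0e9}

bounds = roofline_bounds(KERNEL_FLOPS, KERNEL_BYTES, PEAK_GFLOPS, BANDWIDTH_GBS)

# The level with the lowest attainable rate is the one that limits the kernel.
limiting_level = min(bounds, key=lambda level: bounds[level][1])
attainable = bounds[limiting_level][1]
memory_bound = attainable < PEAK_GFLOPS

for level, (intensity, rate) in bounds.items():
    print(f"{level}: intensity={intensity:.4f} FLOP/B, bound={rate:.1f} GFLOP/s")
print(f"limited by {limiting_level}, memory bound: {memory_bound}")
```

With these placeholder values the bound from main memory is far below the compute roof, which is the pattern a memory-bandwidth-limited kernel shows in such an analysis.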
The core of the simulation consists of volume integration, interior edge integration, and flux gathering. Detailed performance data shows the time contribution of each subroutine, and profiling with hierarchical roofline analysis revealed that memory bandwidth often limits performance. The code also demonstrates good scalability on multi-GPU systems. Tests on the Grace Hopper chip showed substantial speedups for large-scale simulations when GPU execution was compared with a single CPU node with 144 cores. The study aimed to achieve these gains while keeping the code maintainable, particularly in the OpenACC implementation, which leverages Unified Memory to streamline data transfer.
The method itself is what makes this mapping to GPUs natural. DG-SWEM approximates solutions using polynomial basis functions defined locally on each element, so calculations on each element can proceed independently and the work is inherently data parallel. The team exploited this property to address limitations of existing models such as ADCIRC in complex scenarios like compound flooding. The discontinuous Galerkin formulation guarantees local conservation of mass and momentum, a critical factor for accurate flood prediction, while also overcoming the numerical instabilities that plague continuous Galerkin formulations.
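The element-local structure and the conservation property can be illustrated with a deliberately small sketch. It is not DG-SWEM code: it assumes a 1D periodic channel with a flat bed, the lowest-order (piecewise-constant) DG basis, for which the volume integral vanishes and only the edge term remains, and a local Lax-Friedrichs edge flux chosen here for simplicity. All names and parameter values are hypothetical.

```python
G = 9.81  # gravitational acceleration (m/s^2)


def physical_flux(h, hu):
    """Shallow water flux for the state (depth h, discharge hu)."""
    u = hu / h
    return hu, hu * u + 0.5 * G * h * h


def edge_flux(left, right):
    """Local Lax-Friedrichs numerical flux across one edge (assumed choice)."""
    f_left = physical_flux(*left)
    f_right = physical_flux(*right)
    speed = max(abs(left[1] / left[0]) + (G * left[0]) ** 0.5,
                abs(right[1] / right[0]) + (G * right[0]) ** 0.5)
    return tuple(0.5 * (fl + fr) - 0.5 * speed * (qr - ql)
                 for fl, fr, ql, qr in zip(f_left, f_right, left, right))


def step(cells, dx, dt):
    """One forward Euler step; both loops have independent iterations."""
    n = len(cells)
    # Edge loop: edge i lies between cell i-1 and cell i (periodic wrap at i=0).
    fluxes = [edge_flux(cells[i - 1], cells[i]) for i in range(n)]
    # Element loop: each cell gathers only the fluxes on its own two edges.
    return [tuple(q - dt / dx * (f_out - f_in)
                  for q, f_in, f_out in zip(cells[i], fluxes[i], fluxes[(i + 1) % n]))
            for i in range(n)]


# Hypothetical test case: still water with a raised block in the middle.
N, DX, DT = 50, 1.0, 0.1
cells = [(1.1 if 20 <= i < 30 else 1.0, 0.0) for i in range(N)]

mass_before = sum(h for h, _ in cells) * DX
for _ in range(100):
    cells = step(cells, DX, DT)
mass_after = sum(h for h, _ in cells) * DX

print(f"mass before: {mass_before:.12f}, after: {mass_after:.12f}")
```

Because each edge flux is computed once and then subtracted from one neighbor and added to the other, total mass changes only by floating-point roundoff. The iterations of the edge loop and of the element loop do not depend on one another, which is the kind of independence a GPU port can assign to separate threads.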
The team implemented the GPU port through two distinct approaches, CUDA Fortran and OpenACC, both of which exploit the parallelism in the solver's structure and significantly accelerate computation relative to a single CPU node. The study validates the GPU acceleration using benchmark tests and large-scale hurricane simulations, including scenarios with compound flooding effects. The researchers acknowledge that the performance gains depend on the specific hardware configuration and problem size, and that further optimisation may be possible. This work provides a foundation for developing more efficient and accurate coastal flood forecasting systems, which are crucial for mitigating the impacts of increasingly severe weather events.
👉 More information
🗞 GPU-acceleration of the Discontinuous Galerkin Shallow Water Equations Solver (DG-SWEM) using CUDA and OpenACC
🧠 ArXiv: https://arxiv.org/abs/2508.21208
