The increasing demand for computational power drives innovation in High-Performance Computing, with researchers now routinely employing accelerators like Graphics Processing Units alongside traditional processors. Giulio Malenza from the University of Torino, Giovanni Stabile from the Sant’Anna School of Advanced Studies, Filippo Spiga from NVIDIA Corporation, and colleagues demonstrate a significant step forward in harnessing this power within the widely used OpenFOAM software. They present a working application built using modern ISO C++, enabling both multi-core processing and the offloading of calculations to NVIDIA GPUs from a single codebase. This approach, combined with optimised compiler tools, successfully accelerates a key OpenFOAM application, offering a pathway to substantially improve the performance of complex simulations and broaden the accessibility of advanced computational fluid dynamics.
CFD Performance, Scalability and Turbo-machinery Applications
This collection of research focuses on improving the performance and scalability of Computational Fluid Dynamics (CFD) simulations, with applications spanning turbo-machinery and general fluid dynamics. Researchers are exploring how to use multi-core CPUs, GPUs, and distributed memory systems to accelerate simulations and sustain performance as problem sizes grow. A key area of investigation is memory management, with growing interest in unified memory architectures such as the Grace Hopper Superchip, which simplify programming and can improve performance by reducing data transfer overhead. The research covers a range of programming models, including OpenMP, CUDA, OpenACC, and standard C++ parallel algorithms, together with libraries and related projects such as Thrust, OCCA, Alpaka, AmgX, and RapidCFD, which help developers parallelize code and take advantage of available hardware resources.
Studies frequently target Intel and AMD processors alongside NVIDIA GPUs, reflecting a clear trend towards heterogeneous computing and an emphasis on portability and scalability. Researchers are investigating how to make CFD codes run efficiently on different architectures and scale to large numbers of processors, while validation and verification remain crucial to ensure accuracy and reliability. Taken together, this body of research aims not only at higher performance on modern HPC architectures and programming models but also at portability, scalability, and accuracy in complex simulations.
C++ PSTL Accelerates OpenFOAM Simulations
Researchers have developed a novel approach to accelerate complex simulations within the OpenFOAM software framework, prioritizing portability and ease of implementation alongside performance gains. Rather than creating platform-specific solutions, they used modern ISO C++ parallel constructs, resulting in a single codebase capable of running on both multi-core CPUs and NVIDIA GPUs. This strategy contrasts with many previous efforts that relied on specialized APIs or intrusive modifications to the OpenFOAM core, which often limited their applicability and maintainability. The core of their methodology is the set of parallel algorithms in the C++ standard library, commonly called the Parallel Standard Template Library (PSTL) and part of the standard since C++17. These algorithms abstract away the underlying parallel runtime, so the compiler and runtime can map the same source code to different hardware without platform-specific code.
By adhering strictly to the C++ standard, the team aimed to create a solution that is highly portable and avoids vendor lock-in, a common challenge in high-performance computing. This approach differs from previous OpenFOAM acceleration attempts that focused heavily on optimizing linear solvers in isolation, neglecting other computationally intensive phases of the simulation. Integrating parallel processing directly into the core simulation logic with standard C++ features reduces the fragmentation and maintenance difficulties of earlier solutions. The result is a more streamlined and maintainable codebase that is easier to develop and adapt in the future. The unified codebase, combined with the use of PSTL, represents a significant departure from earlier approaches and offers a promising path towards more sustainable and portable high-performance simulations.
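To make the idea concrete, here is a minimal sketch of a standard parallel algorithm applied to a field update. It is not taken from the paper or from OpenFOAM: the helper name `explicitUpdate`, the use of plain `std::vector<double>` instead of OpenFOAM's field classes, and the update formula are all assumptions chosen for illustration. The point is only that the execution policy is the sole parallelism-related detail in the source.

```cpp
#include <algorithm>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenFOAM API): one explicit update step,
// Tnew[i] = T[i] + dt * rhs[i], applied independently to every cell.
// Precondition: rhs holds at least as many elements as T.
template <class ExecutionPolicy>
std::vector<double> explicitUpdate(ExecutionPolicy&& policy,
                                   const std::vector<double>& T,
                                   const std::vector<double>& rhs,
                                   double dt)
{
    std::vector<double> Tnew(T.size());

    // The policy argument is the only place where parallelism is requested.
    // The loop body is an ordinary lambda with no vendor-specific code.
    std::transform(std::forward<ExecutionPolicy>(policy),
                   T.begin(), T.end(), rhs.begin(), Tnew.begin(),
                   [dt](double t, double r) { return t + dt * r; });
    return Tnew;
}
```

A caller would pass `std::execution::par_unseq` to request parallel execution, or `std::execution::seq` to run the identical code serially. Where the parallel version actually runs depends on the toolchain: for example, the NVIDIA HPC SDK compiler `nvc++` accepts `-stdpar=gpu` or `-stdpar=multicore` to target a GPU or CPU cores from the same source.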
OpenFOAM Accelerated with Standard C++ and GPUs
Modern computational simulations increasingly rely on hardware accelerators such as GPUs to tackle complex problems in engineering and scientific research. The researchers implemented their approach with standard ISO C++ parallel programming techniques, exploiting the inherent parallelism in OpenFOAM’s calculations so that the same code can distribute work across multiple processor cores or offload it to a GPU. Adhering to the ISO C++ standard keeps the code portable and maintainable, addressing a key limitation of previous acceleration efforts, which often relied on proprietary libraries or intrusive modifications to the core OpenFOAM code. Unlike earlier attempts that focused solely on speeding up the linear solvers, this work aims for broader acceleration across multiple computational phases, which promises larger improvements in overall simulation speed. Built on the ISO C++ standard and the Parallel Standard Template Library (PSTL), it provides a foundation for a sustainable and adaptable acceleration strategy for OpenFOAM, paving the way for more efficient and powerful fluid dynamics simulations.
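Element-wise loops are not the only pattern that standard C++ can express in parallel. Reductions, such as the norm of the difference between two fields, are another. The sketch below is an illustration under stated assumptions rather than code from the paper: the helper `l2Difference` is hypothetical, and plain `std::vector<double>` stands in for OpenFOAM's field types.

```cpp
#include <cmath>
#include <execution>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenFOAM API): Euclidean norm of (a - b).
// Precondition: b holds at least as many elements as a.
template <class ExecutionPolicy>
double l2Difference(ExecutionPolicy&& policy,
                    const std::vector<double>& a,
                    const std::vector<double>& b)
{
    // transform_reduce squares each pairwise difference and sums the
    // results. Under a parallel policy the summation order is unspecified,
    // so floating-point results can differ slightly between runs.
    const double sumSq = std::transform_reduce(
        std::forward<ExecutionPolicy>(policy),
        a.begin(), a.end(), b.begin(), 0.0,
        std::plus<>{},
        [](double x, double y) {
            const double d = x - y;
            return d * d;
        });
    return std::sqrt(sumSq);
}
```

As with any standard parallel algorithm, passing `std::execution::par_unseq` requests parallel execution and `std::execution::seq` keeps the computation serial, with no other change to the function.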
OpenFOAM Performance Gains with GPU Acceleration
Researchers have demonstrated that complex C++ code within the OpenFOAM framework can be modernized to use accelerator architectures such as GPUs, validating their approach on the laplacianFoam application. The best observed speed-up reached 11.14x on a Grace Hopper system, although performance depends on the size of the memory pool allocator and on the mesh used. The study acknowledges limitations, including a narrow focus on NVIDIA GPUs due to hardware availability and compiler maturity; the authors plan to investigate compiler frameworks such as AdaptiveCpp and roc-stdpar to assess how well the approach ports to GPUs from other vendors. They also note that wider adoption within OpenFOAM requires further software engineering work, including regression testing and validation, which is best undertaken by core developers, and that libraries such as AmgX could further enhance performance. The findings suggest that carefully targeted modernization of complex codes can deliver substantial performance gains, even within defined constraints.
👉 More information
🗞 Building an Accelerated OpenFOAM Proof-of-Concept Application using Modern C++
🧠 DOI: https://doi.org/10.48550/arXiv.2507.18268
