Scientists at the National Center for Computational Sciences are working to make scientific applications run more efficiently on supercomputers, with a focus on both speed and energy efficiency. Tom Beck, head of the Science Engagement section, and Trey White, a distinguished research scientist in the Algorithms and Performance Analysis group, are leading the effort. While making codes run faster has long been a goal, the team is now exploring trade-offs between speed and energy efficiency.
One area of investigation is how the operating frequency of graphics processing units (GPUs) affects energy consumption: reducing the maximum frequency by 5-10% can save up to 25% in energy. The team is also exploring mixed-precision arithmetic, which could offer substantial speedups and energy savings, and is developing ways to reduce data movement, which lowers the electricity required. The overall goal is to accumulate many small improvements into significant energy efficiency gains.
Al Geist’s quote at the beginning of the article highlights the fundamental issue with supercomputers: they’re essentially giant heaters that consume massive amounts of electricity. This is a major concern, as data centers and supercomputing facilities are significant contributors to global energy consumption.
Tom Beck and Trey White are working on optimizing science applications to run more efficiently on the Oak Ridge Leadership Computing Facility’s (OLCF) supercomputers. The goal has shifted from focusing solely on speed to considering both time and energy efficiency. This shift is crucial because hardware improvements are slowing down, making software-level energy savings increasingly important.
One area of investigation is the operating frequency of graphics processing units (GPUs). Reducing the maximum frequency by 5-10% can cut energy consumption by 20-25%, in exchange for some loss of performance. This trade-off between performance and energy efficiency is a key aspect of their research.
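As a rough illustration of why a small frequency cut can produce an outsized energy saving, consider a back-of-envelope model in which dynamic power scales roughly with the cube of frequency while only part of the runtime stretches as the clock slows. The model below, including its cube-law assumption and 50% runtime-sensitivity default, is our own sketch and not from the article:

```python
# Back-of-envelope model of the frequency/energy trade-off (assumptions ours):
# dynamic power ~ f^3 (P ~ C * V^2 * f, with voltage tracking frequency), and
# only a fraction of runtime stretches with 1/f for a partly memory-bound code.

def energy_ratio(freq_scale: float, runtime_sensitivity: float = 0.5) -> float:
    """Relative energy at a reduced clock vs. the full clock.

    freq_scale: new frequency / old frequency (e.g., 0.9 for a 10% cut).
    runtime_sensitivity: fraction of runtime that scales with 1/f
        (0 = fully memory-bound, 1 = fully compute-bound).
    """
    power = freq_scale ** 3                                  # dynamic power model
    runtime = (1 - runtime_sensitivity) + runtime_sensitivity / freq_scale
    return power * runtime                                   # energy = power x time

for cut in (0.05, 0.10):
    e = energy_ratio(1 - cut)
    print(f"{cut:.0%} frequency cut -> ~{1 - e:.0%} energy savings")
```

Under these assumptions, a 10% frequency cut yields roughly a 23% energy saving, in the same ballpark as the 20-25% figure cited above.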
The team is also exploring mixed-precision arithmetic, which uses lower-precision calculations (16 bits or fewer) for parts of an application that can tolerate them. This approach has shown promise in AI and data-science applications, offering substantial speedups and energy savings. However, the impact of deviating from full precision (64 bits) on code accuracy must be studied carefully.
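To see why that study matters, the sketch below computes the same dot product at three precisions; the arrays and the test itself are illustrative, not drawn from the team’s codes:

```python
import numpy as np

# Illustrative only: compare one dot product at three precisions to see
# how the result drifts as bits are dropped.
rng = np.random.default_rng(0)
a = rng.random(100_000)
b = rng.random(100_000)

exact = np.dot(a, b)  # 64-bit reference
for dtype in (np.float32, np.float16):
    # Cast the inputs and accumulate at the reduced precision as well.
    approx = (a.astype(dtype) * b.astype(dtype)).sum(dtype=dtype)
    rel_err = abs(float(approx) - exact) / abs(exact)
    print(f"{np.dtype(dtype).name}: relative error ~ {rel_err:.1e}")
```

The 32-bit result typically stays close to the reference, while the 16-bit accumulation drifts noticeably; whether that drift is acceptable depends entirely on the application.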
Reducing data movement is another route to greater energy efficiency: software algorithms that minimize data transfer also cut electricity consumption. Tom Beck suggests giving users pie charts that visualize power usage by operation, enabling them to target potential reductions.
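A minimal sketch of that kind of chart, using invented operation categories and percentages (a real tool would derive the numbers from hardware counters or vendor power-measurement APIs):

```python
import matplotlib.pyplot as plt

# Hypothetical breakdown for illustration; these categories and shares are
# invented, not measured on any OLCF system.
operations = ["FP64 math", "Memory traffic", "Interconnect", "I/O", "Idle/other"]
shares = [35, 30, 15, 10, 10]  # percent

plt.pie(shares, labels=operations, autopct="%1.0f%%")
plt.title("Power usage by operation (illustrative data)")
plt.savefig("power_breakdown.png")
```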
Beck notes that achieving significant energy efficiency gains will likely involve incremental improvements (3-5% here and there) rather than a single revolutionary breakthrough. Accumulating these small gains can lead to substantial overall improvements.
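The arithmetic behind that accumulation is simple compounding, as the hypothetical sketch below shows: several independent 3-5% reductions multiply together rather than add.

```python
# Hypothetical: five separate optimizations, each saving 3-5% of energy.
improvements = [0.05, 0.04, 0.03, 0.05, 0.03]

remaining = 1.0
for gain in improvements:
    remaining *= 1 - gain  # each saving applies to the energy that is left

print(f"Combined energy savings: {1 - remaining:.1%}")  # about 18.5%
```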
The upcoming workshop, sponsored by the Department of Energy, aims to identify priority research directions in energy-efficient computing for science. This event will inform the community about grand challenges and outline key research areas to maximize capability and energy efficiency in the next decade.
In summary, this article highlights the critical need to address energy consumption in supercomputing facilities. By optimizing science applications and by exploring GPU frequency tuning, mixed-precision arithmetic, and reduced data movement, researchers can make significant strides toward more energy-efficient computing. The 2024 workshop will play a crucial role in shaping the research agenda for this important area.
