Munich, 26 March 2024 – This March marks the end of REGALE, a European project funded by the EuroHPC Joint Undertaking that researched and developed new software for high performance computing (HPC) centres, with a focus on energy efficiency.
After three years of research, the project now provides a toolchain that HPC centres can use to monitor and control their energy consumption, and to use the monitored data to improve the energy efficiency of their applications. REGALE was coordinated by the Greek Institute of Communication and Computer Systems (ICCS) of the National Technical University of Athens (NTUA), and its consortium comprises 16 partner institutions and stakeholders from six European countries.
High-performance supercomputing systems play a crucial role in many scientific research and industrial applications. They are used for computational models in medical, climate and earthquake research, for materials science, and to build digital twins of modern systems such as cars or windmills. Supercomputers, however, also require large amounts of power and energy to perform their computations. Improving the energy efficiency of HPC systems to reduce power consumption and energy costs has therefore become increasingly important, especially as supercomputers enter the Exascale era (i.e., systems capable of at least a billion billion, or 10^18, floating point operations per second). To address this pressing issue, the HPC industry has invested significantly in more energy-efficient hardware. However, hardware advances must be complemented by software efforts, which were – until recently – still lagging behind. This is where REGALE came in. The project delivered a complete software toolchain that complements and leverages existing hardware efforts and provides the software needed for coordination across threads, processes, nodes and even entire systems. The result is a highly capable and energy-efficient solution for operating large-scale HPC systems.
The project defined an open, modular and extensible architecture to support the energy-efficient operation of supercomputing facilities, and instantiated this architecture with state-of-the-art components contributed by consortium partners. It also implemented a framework that supports modularity and interoperability for each tool by defining an open API. Finally, REGALE demonstrated its work on five pilot applications from various fields and communities, underlining its broad impact. The pilot applications came from sectors such as renewable energy (Industrial Scale Unsteady Adjoint-based Shape Optimization of Hydraulic Turbines), enterprise risk assessment (High-Performance Data Analytics for Enterprise Risk Assessment) and the automotive industry (Design of a car-bumper made of carbon reinforced polymers).
The open architecture of REGALE is based on established, proven software, such as open-source implementations of MPI (Message Passing Interface), SLURM (Simple Linux Utility for Resource Management) and DCDB (The DataCenter DataBase), which are needed for effective resource utilisation and the execution of complex applications. The architecture builds on a set of “integration scenarios” that glue together different tools, which work in concert to support energy-efficient operation at different levels of the architecture. It also implements a core infrastructure that supports modularity and interoperability and is designed to integrate any component with minimal modification.