Over its seven-year span, the Department of Energy’s (DOE’s) Exascale Computing Project (ECP) has developed a capable, high-performance computing (HPC) ecosystem, bringing together mission-critical applications, an integrated software stack, and hardware technology advances to realize and fully exploit the latest, most powerful supercomputers on Earth. Frontier, DOE’s first exascale system to come online, has already delivered unprecedented results.
Using ECP-enabled software, researchers can run jobs on Frontier that simulate physical processes at extreme scales and complexity, thus allowing observations of phenomena that were inaccessible with less-powerful machines.
For industry partners, and ultimately for the broader HPC community that underpins U.S. technology competitiveness, exascale brings the opportunity to investigate scientific and technical challenges that were previously too computationally intensive and costly to pursue, providing an avenue to advance engineering well beyond the state of the art.
At SC23, held in Denver, Colorado, last November, members of ECP’s Industry and Agency Council, composed of U.S. business executives, government agencies, and independent software vendors, reflected on how ECP and the move to exascale are shaping current and planned use of HPC to accelerate component design and manufacturing, boost competitiveness, and build global technology leadership.
Digital Twins and Multidisciplinary Simulation
Pete Bradley, principal fellow, Digital Tools and Data Science at Pratt & Whitney, highlighted the utility of exascale for simulating intricate combustion physics related to current and next-generation gas turbine engines. “We are undergoing a digital transformation,” said Bradley. “Pratt & Whitney is building a model-based enterprise that connects every aspect of our product lifecycle from customer requirements to preliminary design to detailed designs, manufacturing, and sustainment. Advanced modeling helps us to build digital twins that allow us to optimize products virtually before we bend a single piece of metal. This will deliver new advances for our customers in performance, fuel burn, and emissions, and shorten the time it takes to go from concept to production. Doing this work requires authoritative models at multiple levels of fidelity, from global to nanoscale. Exascale computing can allow us to understand these phenomena that were previously out of reach, and we can then integrate those models to deliver capabilities beyond anything the world has seen.”
Bradley pointed out that multidisciplinary simulations offer a way to capture various physical processes simultaneously, such as the fluid–structure interactions that play a critical role in a component’s performance and service life. The complexity of these design spaces requires using the tools of today and inventing those of tomorrow. Bradley discussed how machine learning combined with artificial intelligence could provide a mechanism for “physics-informed” models, rather than purely experience-based ones, to accelerate future component design and product commercialization. “There’s a lot of runway ahead,” he said of supercomputing capability, which has exciting implications for industry.
GPUs and Software Validation
At oil and gas company ExxonMobil, subsurface imaging drives a large part of its upstream business, enabling scientists to visually separate different properties of rock within the Earth. With more computing capability, the detail of those images has increased dramatically over the last 30 years.
“Compute really drives two big pieces,” said Mike Townsley, senior principal for HPC at ExxonMobil. “It’s better images and faster time to solution, and we kind of trade those off depending on how important a project is or how vital a very crisp image in an area actually is.” In 2021–2022, the company conducted a multidisciplinary, in-depth study to determine what HPC capability would be needed to double image throughput while navigating power and space constraints. Townsley said, “We figured out […] that there was no way we were going to be able to do this without GPUs.” The decision to move to GPUs, however, was tempered by the dollars-per-throughput cost of a system and the time and expense of moving code that had been used for decades to GPU architectures.
Townsley highlighted the leadership role ECP played in ExxonMobil’s adoption of GPUs: “[The] ECP effort proved GPU accelerators (and their software ecosystems) were productive across a wide variety of science, de-risking our decision.” According to Townsley, ECP led the way in demonstrating the feasibility of GPU-based architectures across a wide variety of physics problems, and this work gave the company confidence that the tools and capabilities would be available when it started porting its code. He attributed much of this uplift to ECP’s support and investment in key tools and libraries such as Spack, Open MPI, HDF5, and zfp, and he also credited ECP’s portability software suites, Kokkos and RAJA, with enabling platform-agnostic tools for the future. Ultimately, “better images lead to better outcomes,” noted Townsley, and ECP has provided a path forward for making more detailed simulations of the subsurface a reality at much higher throughput.
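To give a sense of what such a portability layer provides, the minimal Kokkos sketch below writes one kernel that runs on CPU threads or on an NVIDIA, AMD, or Intel GPU depending only on how Kokkos was built. It is a generic illustration, not ExxonMobil code, and the array and kernel names are invented for the example:

    #include <Kokkos_Core.hpp>

    int main(int argc, char* argv[]) {
      Kokkos::initialize(argc, argv);
      {
        const int n = 1000000;

        // Views allocate in the default memory space: host memory for a CPU
        // build, device memory for a CUDA/HIP/SYCL build. Names are illustrative.
        Kokkos::View<double*> pressure("pressure", n);
        Kokkos::View<double*> update("update", n);

        // The same parallel_for runs on OpenMP threads or on a GPU,
        // depending only on the backend Kokkos was configured with.
        Kokkos::parallel_for("apply_update", n, KOKKOS_LAMBDA(const int i) {
          pressure(i) += 0.5 * update(i);
        });

        // A portable reduction, e.g., for a convergence check.
        double norm = 0.0;
        Kokkos::parallel_reduce("norm", n, KOKKOS_LAMBDA(const int i, double& sum) {
          sum += pressure(i) * pressure(i);
        }, norm);
      }
      Kokkos::finalize();
      return 0;
    }

Because the same source compiles for a conventional CPU cluster or for Frontier-class GPU nodes, long-lived simulation codes written this way are not tied to any single architecture, which is the property that de-risks a GPU migration.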
Key Benefits Beyond Exascale
For GE Aerospace Research, DOE’s exascale computers are becoming a useful tool for virtually testing new engine designs to achieve greater fuel efficiency and reduce CO2 emissions. “We have a history of working with the [national] labs in propulsive turbomachinery,” said Rick Arthur, the company’s senior principal engineer for Computational Methods. Historically, jet engine simulations were limited to what was pragmatically computable. Arthur noted that as the HPC community has advanced toward exascale and beyond, these simulations have extended past single blades to entire rows, stages, and even toward entire systems. Arthur explained that Frontier has provided a platform to evaluate aerodynamic and acoustic aspects of a novel, open-fan engine design at full scale and under realistic flight conditions, a feat that was impossible without the computational power of exascale.
However, Arthur pointed out that ECP offers a key benefit aside from access to exascale systems. “The pursuit of exascale has been as important as exascale itself, in that it provides an abundance of petascale,” he said. “Problems requiring the full machine size are rare (and currently prohibitively costly to create, verify, and interpret) but that abundance affords feasibility to run exploratory ensembles more freely at petascale, which is much more approachable for many in industry.” In addition, Arthur noted that the relationships, technology advances, and lessons learned from involvement with ECP have helped inform the company’s internal HPC investments and readiness. When it comes to expanding what can be done computationally, Arthur said, “DOE has been the leader, and we are grateful to follow.”
Download, Modify, Enable
TAE Technologies, Inc., is a private fusion company working toward a viable, commercial fusion power plant as a source of abundant clean energy. Over its 25-year history, the company has built progressively larger, more advanced machines, and the science involved has become ever more complicated. “As the machines get bigger and the physics issues that we face get more complex, so does the modeling software that we need and the HPC capabilities that we need to address these problems,” said TAE computational scientist Roelof Groenewald. As the machines grow, more computational power is needed to model the full reactor. During the panel discussion, Groenewald highlighted the benefits of modular software design as exemplified in ECP’s open-source WarpX code and its use in fusion reactor design. “The care that the WarpX developers took to modularize the code allows us to build a very different model to simulate next-generation fusion experiments. Because WarpX scales well, our simulations built on WarpX also scale well.”
WarpX was originally designed to simulate next-generation particle accelerators based on laser-plasma wakefield acceleration, but the time and length scales of interest differ from those the fusion community needs. “One of the benefits to us, which is true of a lot of ECP products, is that the code is completely open source. [The] code is easy to get started with, free to download, and it’s free to modify, which is a critical thing for our use case,” said Groenewald. “Developers took the time to develop WarpX in an extremely modular way, where the different physics kernels are essentially independent from each other and can be moved around and arranged so we could very quickly build a different model out of the pieces to simulate the specific length and timescales we are interested in.” Groenewald echoed the sentiment that the abundance of petascale capability that comes with exascale makes it possible to run ensembles of jobs exploring many aspects of a design. Using the Perlmutter supercomputer at Lawrence Berkeley National Laboratory, the company ran its full code in significantly less time, 1 day instead of 10, illustrating how ECP products greatly improve design-iteration speed by accelerating time to solution.
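The kind of modularity Groenewald describes can be pictured as independent physics kernels assembled behind a common interface. The sketch below is a generic illustration of that design pattern, not WarpX’s actual class hierarchy; every name in it is hypothetical:

    #include <memory>
    #include <vector>

    // Hypothetical shared state for a particle-in-cell style simulation.
    struct SimulationState {
      double time = 0.0;
      // ... fields, particles, mesh, etc.
    };

    // Each physics kernel implements the same narrow interface...
    struct PhysicsModule {
      virtual ~PhysicsModule() = default;
      virtual void advance(SimulationState& state, double dt) = 0;
    };

    struct FieldSolver    : PhysicsModule { void advance(SimulationState&, double) override { /* ... */ } };
    struct ParticlePusher : PhysicsModule { void advance(SimulationState&, double) override { /* ... */ } };
    struct Collisions     : PhysicsModule { void advance(SimulationState&, double) override { /* ... */ } };

    // ...so a model is just an ordered collection of modules. Swapping or
    // reordering modules changes the physics without touching the driver loop.
    class Model {
     public:
      void add(std::unique_ptr<PhysicsModule> m) { modules_.push_back(std::move(m)); }
      void step(SimulationState& state, double dt) {
        for (auto& m : modules_) m->advance(state, dt);
        state.time += dt;
      }
     private:
      std::vector<std::unique_ptr<PhysicsModule>> modules_;
    };

    int main() {
      SimulationState state;
      Model model;
      model.add(std::make_unique<FieldSolver>());
      model.add(std::make_unique<ParticlePusher>());
      model.add(std::make_unique<Collisions>());   // drop or reorder as needed
      for (int i = 0; i < 100; ++i) model.step(state, 1.0e-9);
      return 0;
    }

Because each module touches the shared state only through the same narrow interface, kernels can be swapped or rearranged without rewriting the driver, which is what allows a code built for accelerator physics to be repurposed for the length and timescales relevant to fusion.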
Summary
Industry leaders highlighted the value of ECP software and GPU acceleration in stretching the limits of what is possible computationally and enabling their teams to do more. GPUs and future accelerators are key to performance and energy efficiency at every level of computing, from deskside workstations to rack systems, local HPC centers, and exascale machines. ECP brought together scientists, computational experts, vendors, and industry in the pursuit of making exascale a reality, and in doing so demonstrated that progress is best made together. Looking back on what has been done, what has been created, and what exascale will do, the future is promising. Fran Hill, Chief Scientist for the Department of Defense HPC Modernization Program, acknowledged that the work done through ECP is just the beginning of great things to come. “We cannot look at this as the end of ECP, but rather, the beginning of the exascale era.”
The progress discussed in this article was supported by the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy’s Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation’s exascale computing imperative.