Sept. 14, 2021

DOE-funded project to advance portability of heterogeneous HPC applications

Jeremy Thomas, thomas244 [at] llnl.gov, 925-422-5539

A project involving researchers at Lawrence Livermore National Laboratory (LLNL), along with national lab and academic collaborators, has received U.S. Department of Energy (DOE) funding as part of an effort to adapt scientific software for next-generation high-performance computing (HPC) systems.

The project, “ComPort: Rigorous Testing Methods to Safeguard Software Porting,” will address one of the major challenges for scientific computing — the numerical aspects of porting scientific applications to different HPC platforms. The need for solutions is becoming more urgent as supercomputers increasingly integrate various combinations of central and graphics processing units (CPUs and GPUs), accelerators and software, according to researchers.

The rising heterogeneity of HPC hardware can affect the numerical integrity of codes, meaning that software that runs on one system may not produce the same results on a different one, explained ComPort project co-principal investigator and LLNL computer scientist Ignacio Laguna. The challenge will become even more complex as DOE enters the exascale era, where computers will exceed one quintillion calculations per second, Laguna added.
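
One common source of such discrepancies is that floating-point addition is not associative, so the order in which a machine combines values can change the answer. The short Python sketch below is an illustration only and is not part of ComPort, FLiT or FPChecker; the function names and input values are hypothetical. It sums the same four numbers left to right, as a serial CPU loop might, and pairwise, as a parallel reduction might.

```python
import math


def sum_left_to_right(values):
    """Accumulate in input order, as a simple serial loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_pairwise(values):
    """Combine halves recursively, mimicking a tree-shaped parallel reduction."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


# Hypothetical inputs chosen so that rounding depends on evaluation order.
values = [1e16, 1.0, -1e16, 1.0]

serial = sum_left_to_right(values)   # one evaluation order
tree = sum_pairwise(values)          # a different evaluation order
exact = math.fsum(values)            # correctly rounded reference sum

print("left-to-right:", serial)
print("pairwise:     ", tree)
print("reference:    ", exact)
```

Both orderings are legitimate ways to add the same numbers, yet they disagree with each other and with the correctly rounded reference. This is the kind of difference that can surface when a code moves between compilers or processors that schedule arithmetic differently.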

“The ComPort software tools will allow LLNL’s large scientific simulations to produce numerical computations that are more robust and reproducible in next-generation supercomputers,” Laguna said.

The project will compare validated results from previous versions and rigorously test the numerical behavior of hardware and software to verify whether computational results agree with expected answers, according to Laguna. Additionally, automated ComPort software tools will provide high-level user feedback to diagnose and repair software applications to maintain correctness despite changing hardware and compilers. 

Several tools developed as part of the Lab’s Advanced Technology Development and Mitigation Next Generation Computing Enablement project, such as FLiT and FPChecker, were essential to the project’s selection for the award, Laguna said.

The project is a collaboration among LLNL, the University of Utah (lead investigator Ganesh Gopalakrishnan and Pavel Panchekha), Pacific Northwest National Laboratory (Ang Li), the University of California, Davis (Cindy Rubio-Gonzalez) and the University of Washington (Zachary Tatlock). 

The recent DOE awards, totaling more than $13 million, are managed through DOE’s Office of Science, Advanced Scientific Computing Research (ASCR) program.
