
Upgrades coming to LLNL’s Corona computing cluster


Penguin Computing announced that Corona, a high performance computing cluster delivered to LLNL in 2018, has been upgraded with the newest AMD Radeon Instinct MI60 accelerators, based on “Vega,” the world’s first 7-nanometer GPU architecture. The upgrade will provide significantly greater performance and bring additional capabilities in artificial intelligence and machine learning to the LLNL user community, according to Lab researchers. Photos by Garry McLeod/LLNL.

Lawrence Livermore National Laboratory (LLNL) is collaborating with Penguin Computing Inc. and graphics card manufacturer AMD to upgrade its unclassified computing cluster Corona, roughly doubling the number of graphics processing units (GPUs) in the system. The upgrade will provide significantly greater performance and bring additional capabilities in artificial intelligence and machine learning to the LLNL user community, according to Lab researchers.

Penguin Computing recently announced that Corona, a high performance computing cluster delivered to LLNL in 2018, has been upgraded with the newest AMD Radeon Instinct MI60 accelerators, based on “Vega,” the world’s first 7-nanometer GPU architecture. The upgrade is funded through the Commodity Technology Systems (CTS-1) contract with the National Nuclear Security Administration (NNSA).

Corona is being made available to industry through LLNL’s High Performance Computing Innovation Center (HPCIC). The upgrade will help LLNL researchers and their industry partners improve capabilities in scalable deep learning, big data analytics and data science, while enhancing NNSA’s ability to assess future architectures and meet the needs of NNSA’s Advanced Simulation & Computing program. It also will provide a higher level of performance for researching cognitive computing and developing predictive simulations for applications such as inertial confinement fusion and molecular dynamics simulations for precision medicine.

Lawrence Livermore National Laboratory Operations team member Raj Bagri works on Corona, a high performance computing cluster built by Penguin Computing and made available to industry through LLNL’s High Performance Computing Innovation Center.

“This upgrade significantly increases the capability available on Corona,” said Bronis R. de Supinski, chief technical officer for Livermore Computing. “The new Vega GPUs offer substantial double-precision performance, in addition to much more single-precision performance. LLNL scientists will use the combination to understand the potential of mixed-precision algorithms for a variety of domains.”

The Corona cluster consists of 170 two-socket nodes with 24-core AMD EPYC 7401 processors and a PCIe 1.6 terabyte solid-state memory device. Each Corona compute node is GPU-ready, and half of the nodes carry four AMD Radeon Instinct MI25 accelerators each, delivering 4.2 petaflops of peak FP32 performance. The MI60 upgrade raises the cluster’s potential peak FP32 performance to 9.45 petaflops. The accelerators are connected via a Mellanox HDR 200 Gigabit InfiniBand network.
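The pre-upgrade figure can be roughly checked from the node counts given above. The sketch below is a back-of-the-envelope calculation, not an official breakdown: the per-GPU peak of 12.3 teraflops FP32 for the MI25 is an assumption taken from AMD's published specification and does not appear in this article, and the variable names are illustrative only.

```python
# Rough check of the quoted pre-upgrade FP32 peak for Corona.
TOTAL_NODES = 170             # from the article
GPUS_PER_NODE = 4             # from the article (MI25 accelerators per GPU node)
MI25_PEAK_FP32_TFLOPS = 12.3  # ASSUMPTION: vendor spec value, not stated in the article

# Half of the nodes carry MI25 accelerators.
gpu_nodes = TOTAL_NODES // 2
mi25_count = gpu_nodes * GPUS_PER_NODE

# Convert teraflops to petaflops (1 petaflop = 1000 teraflops).
mi25_peak_pflops = mi25_count * MI25_PEAK_FP32_TFLOPS / 1000.0

# Peak added by the upgrade, implied by the two totals quoted in the article.
QUOTED_BEFORE_PFLOPS = 4.2
QUOTED_AFTER_PFLOPS = 9.45
added_pflops = QUOTED_AFTER_PFLOPS - QUOTED_BEFORE_PFLOPS

print(f"MI25 accelerators: {mi25_count}")
print(f"Estimated MI25 peak: {mi25_peak_pflops:.2f} petaflops FP32")
print(f"Implied peak added by the upgrade: {added_pflops:.2f} petaflops FP32")
```

Under that assumption, 340 MI25 accelerators give about 4.18 petaflops, which rounds to the quoted 4.2, and the two quoted totals imply that the upgrade adds about 5.25 petaflops of peak FP32 performance.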

“The Penguin Computing Department of Energy team continues our collaborative venture with our vendor partners AMD and Mellanox to ensure the Livermore Corona GPU enhancements expand the capabilities to continue their mission outreach within various machine learning communities,” said Ken Gudenrath, director of Federal Systems at Penguin Computing.

AMD’s Radeon Instinct MI60 accelerators use the company’s Infinity Fabric Link technology, a peer-to-peer GPU communications technology that delivers transfer speeds of up to 184 gigabytes per second between GPUs. The new accelerators also use the latest ROCm open-source software stack, which is integrated into frameworks like TensorFlow and PyTorch and maps workloads to the heterogeneous compute resources of the underlying hardware.
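As an illustration of what that framework integration looks like from a user's side, the sketch below reports whether a local PyTorch installation was built against ROCm. It is a minimal example, not a description of how Corona is configured: the helper name `describe_backend` is hypothetical, and the code assumes PyTorch may or may not be installed. On ROCm builds, PyTorch exposes AMD GPUs through its existing `torch.cuda` interface and sets `torch.version.hip`.

```python
# Report whether the local PyTorch build targets ROCm (AMD GPUs).
try:
    import torch
except ImportError:
    torch = None  # PyTorch is optional for this sketch


def describe_backend() -> str:
    """Return a one-line summary of the PyTorch GPU backend (hypothetical helper)."""
    if torch is None:
        return "PyTorch is not installed"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version is None:
        return "PyTorch build without ROCm/HIP support"

    # ROCm builds reuse the torch.cuda namespace for AMD devices.
    if not torch.cuda.is_available():
        return f"ROCm build (HIP {hip_version}), no GPU visible"
    return f"ROCm build (HIP {hip_version}), {torch.cuda.device_count()} GPU(s) visible"


print(describe_backend())
```

Because ROCm builds keep the `torch.cuda` namespace, code written for other GPU back ends generally needs little or no change to target AMD accelerators.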

“AMD is pleased to continue collaboration with LLNL and the NNSA in advancing open accelerator solutions. Access to systems like Corona enable next-generation scientific discovery as we move to the exascale era,” said Ogi Brkic, corporate vice president and general manager of the Data Center GPU Business Unit at AMD.

Nov. 19, 2019

Contact

Jeremy Thomas
[email protected]
(925) 422-5539


Lawrence Livermore National Laboratory

| 7000 East Avenue • Livermore, CA 94550 | LLNL-WEB-458451

Operated by the Lawrence Livermore National Security, LLC for the Department of Energy's National Nuclear Security Administration.
