Penguin/Intel 'Magma' computing cluster coming to LLNL

Penguin Computing, along with its partners Intel and CoolIT, has shipped LLNL's Linux-based cluster, known as “Magma,” to the Laboratory.

Lawrence Livermore National Laboratory (LLNL) is welcoming the newest addition to its already powerful supercomputing lineup, a commodity cluster system built by Penguin Computing Inc. that will perform vital calculations for the National Nuclear Security Administration (NNSA).

Penguin Computing, along with its partners Intel and CoolIT, announced it has shipped LLNL’s latest Linux-based cluster, known as “Magma,” to the Laboratory. Procured through the Commodity Technology Systems (CTS-1) contract with the NNSA, Magma is one of the first deployments of Intel’s Xeon “Cascade Lake” Platinum 9200 series processors, which are specifically designed for high-performance computing machines. The system is supported by CoolIT Systems’ complete direct liquid cooling solution and Omni-Path interconnect. On the latest TOP500 List of the world’s most powerful supercomputers released on Nov. 18, Magma ranked at No. 69, with 3.24 petaflops of maximum sustained performance. The cluster’s theoretical peak is 5.313 petaflops.
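To put the two performance figures in proportion, the short Python sketch below divides the sustained number by the theoretical peak. It uses only the values quoted above; the function and variable names are illustrative and do not come from Penguin, Intel or LLNL.

```python
# Performance figures quoted in the article, in petaflops.
SUSTAINED_PFLOPS = 3.24   # maximum sustained performance on the TOP500 list
PEAK_PFLOPS = 5.313       # theoretical peak


def sustained_fraction(sustained: float, peak: float) -> float:
    """Return the share of theoretical peak reached in the sustained run."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained / peak


fraction = sustained_fraction(SUSTAINED_PFLOPS, PEAK_PFLOPS)
print(f"Sustained / peak: {fraction:.1%}")
```

By these numbers, Magma sustains roughly 61 percent of its theoretical peak.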

Funded through NNSA’s Advanced Simulation & Computing (ASC) program, Magma will support mission simulations critical to ensuring the safety, security and reliability of the nation’s nuclear weapons in the absence of underground testing.

“The Commodity Technology System efforts at NNSA represent a very cost-effective way to manage our workload at each of our three laboratories,” said Mark Anderson, director for NNSA’s ASC Office and Institutional Research and Development Programs. “In this model, commodity-based systems take on the bulk of day-to-day computing, leaving the larger advanced technology capability systems available for only the most demanding problems across the Tri-Lab community. This is just an example of the sophisticated approach NNSA is taking to manage demanding workloads in the most efficient manner for the country.”

“Magma represents a timely addition to our CTS machines in order to address the significant surge in demand coming from NNSA’s major Life Extension Program,” said Michel McCoy, LLNL’s Advanced Simulation & Computing program director. “It is essential to have available a supply chain that can respond instantly, delivering state-of-the-art technology in just a few months to meet pressing national security needs. We look forward to moving this system into production as fast as possible.”

Magma consists of 752 compute nodes, each configured with dual Xeon Platinum 9242 processors, for a total of more than 73,000 cores. Its total memory capacity is 293 terabytes, with a total memory bandwidth of 430 terabytes per second. The cluster utilizes Penguin’s Relion XE2142eAP compute servers. CoolIT Systems is providing liquid cooling for Magma through a blind-mate, coldplate loop design, allowing the servers to operate at maximum efficiency.
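For a rough sense of scale per node, the sketch below spreads the quoted system totals evenly across the 752 compute nodes. Two assumptions are made here that the article does not state: decimal units (1 terabyte = 1,000 gigabytes), and 96 cores per node, taken from Intel's "up to 96 cores per node" figure. The compute-node core count it produces comes out slightly below the quoted system total, which may count nodes beyond the 752 compute nodes; that reading is also an assumption.

```python
# System totals quoted in the article.
COMPUTE_NODES = 752
TOTAL_MEMORY_TB = 293          # terabytes
TOTAL_BANDWIDTH_TB_S = 430     # terabytes per second

# Assumption: every compute node has the full 96 cores (two 48-core CPUs).
CORES_PER_NODE = 96
# Assumption: decimal units, 1 TB = 1,000 GB.
GB_PER_TB = 1000

memory_per_node_gb = TOTAL_MEMORY_TB * GB_PER_TB / COMPUTE_NODES
bandwidth_per_node_gb_s = TOTAL_BANDWIDTH_TB_S * GB_PER_TB / COMPUTE_NODES
compute_node_cores = COMPUTE_NODES * CORES_PER_NODE

print(f"Memory per node:     {memory_per_node_gb:.0f} GB")
print(f"Bandwidth per node:  {bandwidth_per_node_gb_s:.0f} GB/s")
print(f"Compute-node cores:  {compute_node_cores:,}")
```

Under these assumptions each node has about 390 gigabytes of memory and about 572 gigabytes per second of memory bandwidth.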

“The convergence of HPC and AI is here today. At Penguin Computing, we are excited to deliver Magma, an HPC system that is enhanced by artificial intelligence technology,” said William Wu, vice president of Hardware Products at Penguin Computing. “We are seeing artificial intelligence permeate every industry and, specifically in HPC, today we can deliver a converged platform that allows AI to accelerate HPC modeling for our data scientist customers.”

“We continue designing new, leading edge solutions with our partners for the DOE NNSA’s CTS-1 contract. Magma is another example of a great shared effort resulting in an HPC cluster designed and built to meet new demanding workloads,” said Ken Gudenrath, DOE director at Penguin Computing.

“Magma is a major leap forward in HPC and AI convergence that could only be achieved with trusted engineering collaboration among Lawrence Livermore National Lab, Penguin Computing and Intel,” said Phil Harris, vice president and general manager of Intel’s Datacenter Solutions Group. “With up to 96 cores per node, massive memory bandwidth and integrated AI acceleration with Intel DL Boost technology, the Intel Xeon Platinum 9200 processor will provide a powerful foundation for LLNL to enhance its ability to meet its mission goals.”

Under the CTS-1 contract, Penguin has delivered more than 22 petaflops of computing capability to support the ASC program at the NNSA Tri-Labs — the Lawrence Livermore, Los Alamos and Sandia national laboratories.