Multi-lab High Performance Storage System collaboration marks 30 years of data storage
After 30 years, the High Performance Storage System (HPSS) collaboration continues to lead and adapt to the needs of the time while honoring its primary mission of long-term data stewardship of the crown jewels of data for government, academic and commercial organizations around the world. Pictured are Lawrence Livermore National Laboratory HPSS team members (from left) Herb Wartens, Debbie Morford and Todd Heer in front of a Lab tape storage system.
Lawrence Livermore National Laboratory (LLNL) and the rest of the Department of Energy (DOE) national laboratories produce an astronomical amount of data every year. As the volume of data generated by DOE high performance computing (HPC) grows by orders of magnitude and becomes ever more important for decision-making, where does all this data go and how is it managed?
This year marks the 30th anniversary of the High Performance Storage System (HPSS) collaboration, comprising five DOE HPC national laboratories: LLNL, Lawrence Berkeley, Los Alamos, Oak Ridge and Sandia, along with industry partner IBM.
In 1992, the six parties created HPSS, a software-defined, scalable long-term datastore. Today, HPSS is still being developed and supported by these founding partners and is used by sites around the world, serving a total of more than 4.5 exabytes (4.5 quintillion bytes) of production data. That massive trove of data, a number some speculate could equal all the words spoken in human history, continues to grow at a rate of more than 500 petabytes (500 quadrillion bytes) per year.
In the late 1980s, HPC leaders recognized the need for long-term archiving that was high-speed, massively scalable and would leverage distributed hierarchical storage capacity management to meet performance requirements of their supercomputers. They also were keenly aware that applying best practices in software development, management and quality from industry would be key to success.
In response, the National Storage Laboratory (NSL) was organized to investigate, demonstrate and commercialize high-performance hardware and software storage technologies focused on removing network computing bottlenecks. The HPSS collaboration grew out of the NSL’s results and experience.
A software product that thrives for three decades is exceedingly rare, particularly one that is actively developed in a collaborative manner by multiple geographically distributed organizations. When the HPSS collaboration began, the terascale era, with computers capable of 10^12 CPU floating point operations per second (FLOPs), was still five years away. The HPSS architecture, implementation and focus on collaboration have allowed it to evolve to meet the demands of that era as well as the subsequent petascale era (10^15 FLOPs).
DOE and industry investments in HPSS have resulted in a team spanning generations, with skills and experience in software development as well as in storage systems engineering, complex systems integration and remote deployment and support. The HPSS team has provided more than 10 major releases of HPSS — yielding remarkable operational efficiencies, performance and storage capabilities — all while achieving the greater speeds demanded by scientific, research and commercial endeavors. HPSS is poised to continue its evolution throughout the emerging exascale era (HPC machines capable of more than one quintillion FLOPs).
The HPSS collaboration’s founding members realized early on that no single organization has the necessary experience and resources to meet all the challenges represented by the growing imbalance between computing power and data storage capabilities. Designed to provide not only scalable high performance and capacity but also a balanced total cost of ownership for archival storage, HPSS takes advantage of hierarchies of storage technologies, including solid-state disk, magnetic disk, tape and cloud.
Additionally, needs for data accessibility and related protocols change over time; as a result, many industry products have come and gone in three decades. Meanwhile, by taking a hardware-agnostic approach from the beginning, HPSS has allowed production instances to smoothly transition across the constant network, server and storage media changes.
HPSS continues to lead and adapt to the needs of the times while honoring its primary mission: long-term data stewardship of the crown jewels of data for government, academic and commercial organizations around the world.
For more information on HPSS and the collaboration’s 30 years of history, visit the web.