Associate Director for Computation
Livermore's Three-Pronged Strategy for High-Performance Computing
IN 1995, the Advanced Simulation and Computing (ASC) Program (originally
the Accelerated Strategic Computing Initiative, or ASCI) was formed
as a critical element of the Stockpile Stewardship Program. ASC's
purpose is to accelerate the development of the simulation capabilities
needed to analyze the performance, safety, and reliability of nuclear
weapons.
At the beginning of the ASC
Program, we looked at the kinds of problems we would need to solve,
when we needed to be able to solve them, and how quickly we would
need to get calculation results back. This analysis determined the
size of the computers we set out to acquire through partnerships
with computer industry leaders. Our goal was to obtain a computer
system by 2004 that could process 100 trillion floating-point operations
per second (100 teraflops). In the past 8 years, Livermore, Los Alamos,
and Sandia, the three national laboratories involved in ASC, have
fielded a number of increasingly powerful massively parallel scalable
supercomputers, in which large numbers of processors work together
on complex calculations. ASCI Purple, arriving at Livermore next
year, will be the fulfillment of the original ASC 100-teraflops
goal. But the story does not end there.
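
To make the parallel model concrete, here is a minimal sketch, not an ASC code, written in C with the standard MPI message-passing library used on machines of this class. Every processor computes its own slice of a numerical integration, and the partial results are combined into one answer:

```c
/* Illustrative only: many processors cooperating on one calculation.
   Each rank integrates a slice of f(x) = 4/(1+x^2) over [0,1] by the
   midpoint rule; the slices sum to an approximation of pi.
   Compile with an MPI wrapper, e.g., mpicc. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this processor's index  */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total processor count   */

    const long steps = 1000000;
    double h = 1.0 / steps, local = 0.0, total = 0.0;

    /* Interleave the work: rank r handles steps r, r+size, r+2*size, ... */
    for (long i = rank; i < steps; i += size) {
        double x = (i + 0.5) * h;
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    /* Combine all partial sums onto processor 0. */
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi ~= %.12f using %d processors\n", total, size);

    MPI_Finalize();
    return 0;
}
```

The same pattern, decompose the problem, compute locally, communicate to combine, underlies far larger scientific calculations on these systems.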
As the supercomputers came
on board, researchers in stockpile stewardship and other programs
developed increasingly complex codes to take advantage of them.
One-dimensional codes gave way to two- and three-dimensional codes,
and some science simulations were developed based on first-principles
physics. Users needed more from the supercomputers: more capability
to run scientific calculations at large scale and more capacity
to handle multiple calculations and diverse workloads simultaneously.
As our users' needs have evolved, so has our strategy, as described
in the article entitled "Riding the Waves of Supercomputing Technology."
In brief, our strategy is
to work with the U.S. supercomputer industry to pursue three technology
curves, separately and at times in tandem. The three curves are
current scalable multiprocessor technology, open-source (nonproprietary)
cluster technology, and cell-based (computer-on-a-chip) technology.
The goal is to deliver platforms best suited to the work at hand.
The supercomputing industry has chosen to pursue massively parallel
scalable architectures, and it is spending billions of dollars exploring
technologies in this arena. We work with these companies to leverage
their advances and investments, bringing more capability and capacity
cost-effectively to our users. We are constantly on the lookout
for what's next and what's after next
to meet future workloads. Our goal is to find an affordable path
to the petaflops (1,000 teraflops) level for the ASC Program by
2010, beating the pace of Moore's Law (capacity doubling
every 18 months) not just by a little but by a lot.
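
A rough worked illustration of what that pace means follows; the 10-teraflops baseline is an assumption for arithmetic's sake, roughly the scale of early-2000s systems, not a program figure:

```latex
% Capability growing at Moore's-Law pace doubles every 1.5 years:
%   C(t) = C_0 * 2^(t/1.5).
% Over the roughly 7 years from 2003 to 2010, that pace alone yields
% a factor of about 25, whereas climbing from an assumed 10-teraflops
% baseline to 1,000 teraflops requires a factor of 100, about four
% times the Moore's-Law rate.
\[
  C(t) = C_0 \cdot 2^{t/1.5}, \qquad
  2^{7/1.5} \approx 25
  \quad \text{vs.} \quad
  \frac{1000\ \text{teraflops}}{10\ \text{teraflops}} = 100 .
\]
```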
Today's ASC Program
uses proven and mature technologies, because there's no room
for risk in system functionality and reliability when simulating
the behavior of nuclear weapons on tight programmatic schedules.
To determine our next-generation supercomputer, we have been investigating
machines featuring open-source cluster technology. This technology
is the basis of our recently deployed Multiprogrammatic Capability
Resource (MCR) machine, which is being used by Laboratory researchers
who can tolerate some problems with functionality as we work out
the bugs and add features for a production environment. In preparation
for what's after next, we are researching technology that's
farther out on the horizon: cell-based supercomputers. Assuming
that budgets and the technology hold, we will acquire such a cell-based
machine, BlueGene/L, in late 2004 or early 2005. BlueGene/L will
be used to improve physics models in ASC codes and to evaluate the
technology for suitability to a broader workload.
This is our strategy for
staying ahead of Moore's Law, a strategy that is absolutely
crucial to the Laboratory. Here at Livermore, we have a culture
that supports and embraces simulation, which, with experimentation
and theory, forms the scientific discovery process. A successful
simulation environment requires more than a huge computer with maximum
peak speeds. It requires code development, physics models, code
validation, and an infrastructure of storage systems, visualization
capabilities, networks, compilers, and debuggers, all working together.
We must balance all these factors to maintain a computing environment
that has the power, capability, and capacity to serve the Laboratory's
needs in stockpile stewardship, energy and environment, biotechnology
and bioresearch, chemistry and materials science, and more.
Pursuing three technology
curves (what's current, what's next, and what's
after next) and working with U.S. industry to bring the technologies
to the nation's critical missions is Livermore's computing
strategy. This three-pronged strategy allows us to deliver robust
production-level computing to meet today's programmatic needs
while looking ahead to provide for the demands of tomorrow.