Until the mid-1990s,
supercomputing, high performance computing, or high end computing
was reserved for an elite few who had access to systems costing millions
(even tens of millions) of dollars. Large vector supercomputers
and massively parallel processing (MPP) systems could deliver in
the range of 1 to 50 Gflops.
But because these systems
were usually shared among a community of users, the actual capability
provided was often a small part of the peak capability of the system.
For example, the majority of users of a C-90 capable of 16 Gflops
peak performance would run on a single "head" of only 1 Gflops peak
performance, while other users space-shared the remaining processors
of the system at the same time.
Many large science and
engineering applications were greatly simplified
in fidelity and size to run for a shorter time on smaller parts of
such big machines or, worse yet, on scientific workstations.
Many other potential applications were never executed at all for
want of adequate computing resources, because the existing shared
supercomputing centers were oversubscribed.
With the advent of local
area networks of desktop scientific workstations, some environments
tentatively explored and employed "cycle harvesting": applying unused
workstations during off-hours to embarrassingly parallel tasks, usually
by running the same program on many different machines at the same
time with different input data sets. Workstation clusters were explored
by the University of Wisconsin, UC Berkeley, and other sites, demonstrating
the possibilities of clustered computing. Research was also conducted
into new network technology that might be used
to integrate workstations.
But problems of cost,
customized hardware, and proprietary software limited their installed
base and use. At this time, a new class of clustered computing system
was devised by a small group at the NASA Goddard Space Flight Center
that overcame these difficulties and explored what has become the
most rapidly growing type of parallel processing.
At the beginning of
1994, the Beowulf project undertook to assemble a cluster of PCs
and to evaluate its utility as a scalable system for scientific
computation using only mass-market commodity off-the-shelf hardware
and widely available open source software. It was only at that time
that the capability of such hardware was good enough and the cost
low enough to potentially enable this new class of computing: PC
clusters.
The Beowulf project adopted
the inchoate Linux operating system because it was free and open
software, avoiding the legal complications of using Unix and providing
the software for making necessary changes. The Beowulf project developed
the majority of the Ethernet drivers included in Linux today, as well as channel
bonding to support multiple simultaneous network connections and many other low-level
tools for managing clusters of PCs.
Equally important was
the pathfinding work in applying these systems to real-world scientific
applications. It was found that in some cases Beowulf systems could equal
the performance of much more costly machines, and in many cases
they provided a price-performance advantage of an order of magnitude
or more.
Through a series of successive
generations of systems, these capabilities grew. Sustained operation of
more than 1 Gflops was achieved on a system costing less
than $50K in 1996, and 10 Gflops was achieved on 120 processors in 1997.
Beowulf systems also won the Gordon Bell prize for price-performance two years in
a row. Today, a number of commercial concerns provide low-cost Beowulf-class
systems and other PC clusters, with an installed base in industry,
government labs, and academia. A number of these systems are now
included on the Top 500 list of the world's most powerful computer
systems.
This presentation will
discuss the motivation and importance of Beowulf-class computing,
its hardware and software elements, and its history from the inception
of 16-processor systems to present-day systems of up to a thousand
processors.
Biographical Sketch:
Dr. Thomas Sterling
received his Ph.D. as a Hertz Fellow from MIT in 1984 and has held
research scientist positions with the Harris Corporation's Advanced
Technology Department, the IDA Supercomputing Research Center, and
the USRA Center of Excellence in Space Data and Information Sciences.
In 1996, Dr. Sterling began a joint appointment with the NASA Jet
Propulsion Laboratory and the California Institute of Technology.
He is a Principal Scientist in the Jet Propulsion Laboratory's High
Performance Computing group, and he is a Faculty Associate at the
California Institute of Technology's Center for Advanced Computing
Research.
For the last 20 years,
Dr. Sterling has engaged in applied research in parallel processing
hardware and software systems for high performance computing. He
was a developer of the Concert shared memory multiprocessor, the
YARC static dataflow computer, and the Associative Template Dataflow
computer concept, and he has conducted extensive studies of distributed
shared memory cache coherence systems.
In 1994, Dr. Sterling
led the team at the NASA Goddard Space Flight Center that developed
the first Beowulf-class PC clusters, including the Ethernet networking
software for the Linux operating system. In 1999, he co-authored
the MIT Press book "How to Build a Beowulf".
Since 1994, Dr. Sterling
has been a leader in the national Petaflops initiative, chairing
two workshops on Petaflops systems development and chairing the
subgroup on the Petaflops computing implementation plan for the
President's Information Technology Advisory Committee. He chaired
both the first and second Conferences on Enabling Petaflops Computing
in 1994 and 1999. He is also an author of the book "Enabling Technologies
for Petaflops Computing", published by MIT Press in 1995.
Dr. Sterling is the
Principal Investigator for the interdisciplinary Hybrid Technology
Multithreaded (HTMT) architecture research project sponsored by
NASA, NSA, NSF, and DARPA involving a collaboration of more than
a dozen cooperating research institutions. The HTMT project is developing
an adaptive, latency tolerant, Petaflops-scale computer employing
superconductor, optical, and processor-in-memory technologies.
Dr. Sterling holds six
patents and was the winner of the 1997 Gordon Bell Prize for Price
Performance.
For further information:
How To Build A Beowulf, Sterling, Salmon, et al.
Beolinks (Caltech)
The Beowulf Project (CACR-Caltech)
Beowulf.org
The Beowulf Underground
The Legend of Beowulf (Anglo-Saxon epic poem)
A streaming
video recording of this presentation will be available after the
lecture through Dr. Dobb's TechNetCast (http://www.technetcast.com).
Check site for schedule and details.
