What is a Leadership Computer and How Do We Use Them?
Abstract
The first part of the talk will briefly summarise the Blue Waters Leadership Computer, the world’s first sustained-petaflop computing system, and some of the frontier science projects that use it.
The next part will discuss how the Blue Waters workload evolved over the first four years of operation to be one of the primary examples of the convergence of Extreme Computing and Big Data.
The next segment will discuss some of the key characteristics and options that make a system an effective and productive supercomputer. These include not just hardware characteristics but also the methods of resource management, resiliency, and balance.
The talk will conclude with some observations on the trends in technology and price performance, and on some of the challenges those trends present for research teams using large-scale systems.
Note: No registration or RSVP required, just turn up! RCC seminars are free. All welcome.
Bio
Dr William T.C. Kramer is the Principal Investigator and Project Director for the Blue Waters Leadership Computing Project at the National Center for Supercomputing Applications.
Blue Waters is a National Science Foundation-funded project to deploy the first general-purpose, open-science, sustained-petaflops supercomputer as a powerful resource for the nation’s researchers.
Blue Waters is the largest system Cray has ever built, almost 50% larger than the next largest Cray system. Blue Waters is one of the world’s most balanced leadership systems with more than 28,000 nodes with x86 and GPU processors, 1.7 PB of main memory and the fastest I/O subsystem in the open research community.
Kramer’s accomplishments include deploying and operating extreme-scale computational systems, data systems and best-of-class facilities, and leading intense, high-visibility projects. He combines broad and significant technical contributions with leadership and management experience in high-performance, interactive and real-time computing, data-focused analysis, cyberinfrastructure, applications and software development. He has substantial and sustained expertise in managing world-class, trend-setting organisations, a commitment to excellence, a record of fostering the education and development of the next generation of researchers and leaders, and a track record of building sustained collaborations and relationships.
Kramer is a Research Professor in the Computer Science Department pursuing research in system performance evaluation, large-scale resiliency and reliability and system resource management.
Previously, Kramer was the general manager of the National Energy Research Scientific Computing Center (NERSC), the flagship computing facility of the Department of Energy’s Office of Science at Lawrence Berkeley National Laboratory (LBNL), which Dr Ray Orbach, DOE’s Undersecretary for Science, called “the best run large capacity computing facility in the world”.
Prior to Berkeley Lab, Kramer worked at the NASA Ames Research Center, where he was responsible for all aspects of operations and customer support for NASA's Numerical Aerodynamic Simulator (NAS) supercomputer centre and other large computational projects, as well as starting a major Air Traffic Control Program. He has also worked at the University of Delaware and Inland Steel Corporation.
Blue Waters is the 20th supercomputer Kramer has deployed and/or managed. Several were first of their kind, including the world’s first production UNIX supercomputer and the first production-quality massively parallel system. In addition, he has deployed and managed large clusters of workstations, five extremely large data repositories, some of the world’s most intense networks, and other extreme-scale systems. He has also been involved with the design, creation and commissioning of six “best of class” HPC facilities.
He holds a BS and MS in computer science from Purdue University, an ME in electrical engineering from the University of Delaware, a PhD in computer science from UC Berkeley and a number of professional certifications, including Level II IT Project Manager.
His work on system evaluation resulted in the PERCU method. He has authored more than 75 peer-reviewed papers, articles and reports and taught university courses at Purdue, Delaware, San Jose State and Berkeley. He has been invited to keynote at dozens of conferences and events.
Kramer has received multiple awards from NASA, Berkeley Laboratory, and the Association for Computing Machinery (ACM). He was named one of HPCWire’s “People to Watch in 2005” and Inside HPC’s first “Rockstar of HPC”.