
Race Toward Exascale Supercomputers is Starting, U.S. Scrambles to Compete

Staff Reporter
First Posted: Dec 11, 2013 05:47 PM EST

Significant changes in high-performance computing software and hardware have been on the horizon for nearly a decade. What's worrisome, says Pete Beckman, is that in that time the HPC community in the US has not managed to mobilize a collective movement toward exascale, the next frontier of supercomputer capability.

Beckman is director of the Exascale Technology and Computing Institute at Argonne National Laboratory, near Chicago, Illinois, US. "We've known there is huge change coming," he says. "But as a community looking for support from the US Department of Energy, the National Science Foundation, and others, we haven't been able to come together and formalize a plan."

Beckman’s early work, including collaboration with several colleagues, led to the International Exascale Software Project (IESP), which created national and international dialogue around solving the exascale problem. “IESP brought us together to talk about exascale and discuss agendas, but given current trajectories we’re well past the point of imagining that all of the technology must come from a single nation. We’ll be much better off solving these problems collaboratively and sharing pieces of the software,” Beckman says.

Application developers have been collaboratively developing parts of software for decades. If you look at cosmology or biology applications, development teams are spread across the globe. Systems software, however, is a different animal. Rarely thought of as a collaborative endeavor, it’s usually developed and honed in one place. Beckman has been working to shift this traditional model to one that is much more collaborative.

Big data in a variety of domains is now perhaps the most important driver of high-end, high-performance computing and simulation. Scientists, engineers, and researchers regularly want to analyze and look for patterns within extreme data sets. Recognizing the need to account for this shift, Beckman and Jack Dongarra have put together a series of collaborative workshops to investigate architectures and software.

Dongarra was awarded the 2013 Ken Kennedy Award at SC13 in November. He is a University Distinguished Professor in the electrical engineering and computer science department at the University of Tennessee, US. Working with partners in France, Japan, and other countries, Beckman and Dongarra aim to determine how current technologies can be extended or adapted to account for big data.

Along with big data initiatives, the US Department of Energy is funding several research projects aimed at developing different parts of the software stack. Two projects focus on the operating system and runtime environment: Hobbes, headed by Ron Brightwell at Sandia National Laboratories and Barney Maccabe at Oak Ridge National Laboratory, and Argo, headed by Beckman and Marc Snir, director of Argonne's Mathematics and Computer Science division.

Work on the Argo project is split amongst four research areas and includes three national labs – Lawrence Livermore, Pacific Northwest, and Argonne – and the US universities of Illinois, Oregon, Tennessee, and Chicago. Beckman recently discussed the Argo project at SC13 in Denver, Colorado, US.

"Most hardware vendors are targeting what they'll need in order to deliver their next chip today, not five years from now. We're working directly with these vendors and including them in the research we're doing now," says Beckman.

This means Beckman and the Argo team are in the unique position of knowing exactly what each vendor’s roadmap entails, enabling them to propose a system architecture that would work well across several different platforms.

"We let the vendors know from the beginning that our plan is to develop an open source system, including the APIs," Beckman says. "Power adjustment on future machines is a key concern, and there is no competitive advantage for each vendor to develop their own API."

In November 2013, the Argo team met with vendors for the first time and presented their initial system design – including machine management, runtime environment, memory hierarchy, and power management. “We've already gotten feedback from the vendors,” says Beckman. “Now that we have our initial design, we will likely have a more formal version ready in the spring.”

Developers in Japan, Europe, and China have started similar projects, resulting in what Beckman and others in the HPC community call 'healthy competition.' These countries may have the competitive advantage, however, because their governments are actively involved in the funding, planning, and execution of the projects. In contrast, two exascale bills are currently stuck in committee in the US Congress.

"We will get to a point in a few years where there will be a lot of energy and buzz happening again in the US," Beckman says. "It is always a cycle. There is an incredible push for leadership in a couple of new areas now and lots of exciting ideas, but we are still pretty far off." -- by Amber Hamon, © iSGTW
