In this Let’s Talk Exascale podcast, Todd Gamblin from LLNL describes how the Spack flexible package manager helps automate the deployment of software on supercomputer systems.
After many hours building software on Lawrence Livermore’s supercomputers, in 2013 Todd Gamblin created the first prototype of a package manager he named Spack (Supercomputer PACKage manager). The tool caught on, and development became a grassroots effort as colleagues began to use it. The Spack team at Livermore now includes computer scientists Gregory Becker, Peter Scheibel, Tamara Dahlgren, Gregory Lee, and Matthew LeGendre. The core development team also includes Adam Stewart from UIUC [the University of Illinois at Urbana-Champaign], Massimiliano Culpo from Sylabs, and Scott Wittenburg, Zack Galbreath, and Omar Padron from Kitware. Since its humble beginnings, the Spack project has grown to include over 4,000 scientific software packages, thanks to the efforts of over 550 contributors around the world.
Spack is in the portfolio of the Software Technology research focus area of the US Department of Energy’s Exascale Computing Project (ECP). To learn more, Scott Gibson from the ECP caught up with Gamblin at SC19.
Scott Gibson: Tell me a little bit about your involvement here at SC19 with Spack. What all have you been doing?
Todd Gamblin: Well, we’ve had three BoFs [Birds of a Feather sessions]. We’ve had a day-long tutorial. We had probably three paper sessions that dealt with Spack. So it’s really been kind of an all-week thing. We even made a special page on Spack.io that shows all the things that Spack is involved with at SC19, so it’s been pretty busy.
Scott Gibson: It’s obviously very popular.
Todd Gamblin: Yeah, we’re trying to get the word out because we want people to contribute to the project and use the tool.
Scott Gibson: This is very exciting. It just won an R&D 100 award, so tell us your feelings about that.
Todd Gamblin: I mean, it’s an honor. It’s great to have an R&D 100 award. We won the regular R&D 100 award, and we also got a special recognition for being a market disruptor, which is pretty cool. I think that sort of speaks to why we won the award, which I think was based mostly on Spack’s impact throughout the HPC community. We have users worldwide. We have a lot of supercomputing centers starting to take up Spack to deploy software, and it’s been influential in ECP. And I think all of those things, as well as collaborations even with foreign computing centers, like CERN outside of the traditional HPC scene and RIKEN with its Fugaku machine due in 2021, were a big part of the award; the breadth of that collaboration really mattered.
Scott Gibson: I guess in the type of work you do, you never arrive. It’s always an ongoing development process. What would you say about that with respect to Spack? What are you doing with it right now?
Todd Gamblin: Well, I think that’s very true. It’s never done, especially because we’re modeling a software ecosystem that is constantly evolving. Spack itself, the core tool, is constantly evolving because we’re trying to build new features for application developers and software teams, and then just maintaining the packages in Spack. There are 3,500 packages. We merge probably 200 or 300 pull requests every month, so it’s just a constant churn of activity on the site. And we could not do that without a community. So Spack is broader than just ECP. We have contributors from all over. It’s like 450 contributors at this point.
Scott Gibson: Yeah, so what are your interactions like with all the people who help you out with Spack? What does that look like?
Todd Gamblin: It can range. For core contributors, like our colleagues at Fermilab and Kitware and folks who want to contribute major features, we get pretty closely involved in the technical details on GitHub, for package bumps and things like that. Or people may just want to submit a new version, which can be really quick. Anyone can submit a pull request, and then we review it, possibly give feedback, and click merge. And then we have this rolling develop branch, and we periodically release vetted versions of Spack.
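For readers who haven’t seen what a “package bump” pull request looks like, here is a minimal sketch of a Spack package recipe (a package.py file) for recent Spack versions. The package name, URL, and checksums are hypothetical placeholders; adding a new release usually amounts to adding a single version() line like the first one shown.

```python
# Hypothetical Spack package recipe (package.py); the name, URL, and
# checksum placeholders are for illustration only.
from spack.package import *


class Examplelib(AutotoolsPackage):
    """Example numerical library packaged for Spack."""

    homepage = "https://example.org/examplelib"
    url = "https://example.org/examplelib/examplelib-1.2.3.tar.gz"

    # A typical version-bump pull request adds one line like this,
    # pointing Spack at the new release tarball and its sha256 checksum.
    version("1.2.4", sha256="<sha256 of the 1.2.4 tarball>")
    version("1.2.3", sha256="<sha256 of the 1.2.3 tarball>")

    depends_on("zlib")
```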
Scott Gibson: What are you into at the moment?
Todd Gamblin: We just rolled out Spack 0.13. It added a whole bunch of features around facility deployment. And one thing that we’re particularly happy about is we have specific microarchitecture support so we can target our binaries directly at the types of machines we’re deploying under ECP. So we don’t just target Intel; we target Skylake. We don’t just target AMD; we target Naples or Rome—the specific generation of chip—and optimize for that. And I think you’d be surprised at how hard it is to figure that out about a chip. Vendors don’t just give you nice processor names. You have to sort of understand what hardware you’re on and how to talk to the compiler and tell it to optimize for it.
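The microarchitecture detection Gamblin describes has since been split out of Spack as a small standalone Python library called archspec; on the Spack command line it surfaces as target=<microarchitecture> constraints such as target=skylake_avx512 or target=zen2. Below is a minimal sketch of the library’s use, assuming its standard API; the detected name and flags depend on the machine it runs on.

```python
# Minimal sketch using archspec (pip install archspec), the standalone
# library that grew out of Spack's microarchitecture-detection work.
import archspec.cpu

host = archspec.cpu.host()        # detect the current CPU, e.g. skylake_avx512 or zen2
print("microarchitecture:", host.name)
print("family:", host.family)     # generic family, e.g. x86_64
print("vendor:", host.vendor)

# Translate the detected target into compiler-specific optimization
# flags, e.g. "-march=skylake-avx512 -mtune=skylake-avx512" for GCC.
print(host.optimization_flags("gcc", "9.2.0"))
```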
Scott Gibson: Sounds complex. What would you say the legacy’s going to be for Spack, or what would you hope it will be?
Todd Gamblin: We’re really trying to make using HPC systems as easy as it is to use your laptop or a regular Linux cluster. We want to make it simple for people to get on a machine, install the software they need, and get going. And so I think the legacy, if we’re successful with this, is that Spack basically sits under all three parts of ECP: Software Technology, Application Development, and HI [Hardware and Integration], where we’re heavily involved in facility deployment. We’re building infrastructure that we will use to have prebuilt packages available for anybody. And I think if we can successfully set that up and keep the maintenance of it going after ECP, then we will have simplified life for a lot of people using HPC machines.
Source: Scott Gibson at the Exascale Computing Project