Evolution of the SLATE linear algebra library
- Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA
- ECRC (Extreme Computing Research Center), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Hewlett Packard Enterprise (HPE), Spring, TX, USA
- Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Ansys, Canonsburg, PA, USA
- NVIDIA, Santa Clara, CA, USA
- Oak Ridge National Laboratory, Oak Ridge, TN, USA
SLATE (Software for Linear Algebra Targeting Exascale) is a distributed, dense linear algebra library targeting both CPU-only and GPU-accelerated systems, developed over the course of the Exascale Computing Project (ECP). Although it began with several documents setting out its initial design, the design changed significantly throughout development. In some cases, these changes were anticipated: an early version used a simple consistency flag that was later replaced with a full-featured consistency protocol. In other cases, performance limitations and changes in software and hardware prompted a redesign. Sequential communication tasks were parallelized; host-to-host MPI calls were replaced with GPU device-to-device MPI calls; and more advanced algorithms, such as Communication Avoiding LU and the Random Butterfly Transform (RBT), were introduced. Early choices that turned out to be cumbersome, error-prone, or inflexible have been replaced with simpler, more intuitive, or more flexible designs. Applications have been a driving force, prompting a lighter-weight queue class, nonuniform tile sizes, and more flexible MPI process grids. Of paramount importance has been building a portable library that works across several different GPU architectures (AMD, Intel, and NVIDIA) while keeping a clean and maintainable codebase. Here we explore the evolving design choices and their effects on both performance and software sustainability.
- Sponsoring Organization:
- USDOE
- OSTI ID:
- 2479010
- Journal Information:
- International Journal of High Performance Computing Applications, Vol. 39, Issue 1; ISSN 1094-3420
- Publisher:
- SAGE Publications
- Country of Publication:
- United States
- Language:
- English