DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Towards reversible basic linear algebra subprograms: A performance study

Abstract

Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications. Making applications reversible by relying on computation rather than on memory is ideal for large scale parallel computing, especially for the next generation of supercomputers in which memory is expensive in terms of latency, energy, and price. In this direction, a case study is presented here in reversing a computational core, namely, Basic Linear Algebra Subprograms, which is widely used in scientific applications. A new Reversible BLAS (RBLAS) library interface has been designed, and a prototype has been implemented with two modes: (1) a memory-mode in which reversibility is obtained by checkpointing to memory in forward and restoring from memory in reverse, and (2) a computational-mode in which nothing is saved in the forward, but restoration is done entirely via inverse computation in reverse. The article is focused on detailed performance benchmarking to evaluate the runtime dynamics and performance effects, comparing reversible computation with checkpointing on both traditional CPU platforms and recent GPU accelerator platforms. For BLAS Level-1 subprograms, data indicates over an order of magnitude better speed of reversible computation compared to checkpointing. For BLAS Level-2 and Level-3, a more complex tradeoff is observed between reversible computation and checkpointing, depending on computational and memory complexities of the subprograms.
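The two modes described in the abstract can be illustrated with a minimal sketch on a Level-1 style operation, y <- a*x + y. This is not the RBLAS interface: the function names below are hypothetical and chosen only for illustration, and plain Python lists stand in for BLAS vectors. The sketch also assumes values for which floating-point addition and subtraction are exact; in general, inverse computation on floating-point data may not restore the original bits.

```python
def axpy_forward_memory(a, x, y):
    """Memory mode (hypothetical helper): checkpoint y, then y <- a*x + y."""
    saved = list(y)  # checkpoint copy, costs O(n) extra memory
    for i in range(len(y)):
        y[i] += a * x[i]
    return saved


def axpy_reverse_memory(y, saved):
    """Memory mode reverse: restore y from the checkpoint."""
    y[:] = saved


def axpy_forward_compute(a, x, y):
    """Computational mode forward: y <- a*x + y, nothing is saved."""
    for i in range(len(y)):
        y[i] += a * x[i]


def axpy_reverse_compute(a, x, y):
    """Computational mode reverse: undo the update by inverse computation."""
    for i in range(len(y)):
        y[i] -= a * x[i]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [10.0, 20.0, 30.0]

    saved = axpy_forward_memory(2.0, x, y)
    axpy_reverse_memory(y, saved)
    print("memory mode restored:", y)

    axpy_forward_compute(2.0, x, y)
    axpy_reverse_compute(2.0, x, y)
    print("computational mode restored:", y)
```

The memory-mode pair trades extra storage and copy traffic for a trivial reverse step, while the computational-mode pair stores nothing and repeats the arithmetic in reverse, which is the tradeoff the abstract's benchmarks compare.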

Authors:
 Perumalla, Kalyan S. [1]; Yoginath, Srikanth B. [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
2014-12-06
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1209196
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
Transactions on Computational Science
Additional Journal Information:
Journal Volume: 8911; Journal ID: ISSN 1866-4733
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; reversible computation; linear algebra; checkpointing; runtime performance; memory effects

Citation Formats

Perumalla, Kalyan S., and Yoginath, Srikanth B. Towards reversible basic linear algebra subprograms: A performance study. United States: N. p., 2014. Web. doi:10.1007/978-3-662-45711-5_4.
Perumalla, Kalyan S., & Yoginath, Srikanth B. (2014). Towards reversible basic linear algebra subprograms: A performance study. United States. https://doi.org/10.1007/978-3-662-45711-5_4
Perumalla, Kalyan S., and Yoginath, Srikanth B. 2014. "Towards reversible basic linear algebra subprograms: A performance study". United States. https://doi.org/10.1007/978-3-662-45711-5_4. https://www.osti.gov/servlets/purl/1209196.
@article{osti_1209196,
title = {Towards reversible basic linear algebra subprograms: A performance study},
author = {Perumalla, Kalyan S. and Yoginath, Srikanth B.},
abstractNote = {Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications. Making applications reversible by relying on computation rather than on memory is ideal for large scale parallel computing, especially for the next generation of supercomputers in which memory is expensive in terms of latency, energy, and price. In this direction, a case study is presented here in reversing a computational core, namely, Basic Linear Algebra Subprograms, which is widely used in scientific applications. A new Reversible BLAS (RBLAS) library interface has been designed, and a prototype has been implemented with two modes: (1) a memory-mode in which reversibility is obtained by checkpointing to memory in forward and restoring from memory in reverse, and (2) a computational-mode in which nothing is saved in the forward, but restoration is done entirely via inverse computation in reverse. The article is focused on detailed performance benchmarking to evaluate the runtime dynamics and performance effects, comparing reversible computation with checkpointing on both traditional CPU platforms and recent GPU accelerator platforms. For BLAS Level-1 subprograms, data indicates over an order of magnitude better speed of reversible computation compared to checkpointing. For BLAS Level-2 and Level-3, a more complex tradeoff is observed between reversible computation and checkpointing, depending on computational and memory complexities of the subprograms.},
doi = {10.1007/978-3-662-45711-5_4},
journal = {Transactions on Computational Science},
volume = 8911,
place = {United States},
year = {2014},
month = {12}
}