Title: Compiler-Enhanced Incremental Checkpointing for OpenMP Applications

Abstract

As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety of causes. Checkpointing is a popular technique for tolerating such failures, enabling applications to periodically save their state and restart computation after a failure. Although a variety of automated system-level checkpointing solutions are currently available to HPC users, manual application-level checkpointing remains more popular due to its superior performance. This paper improves performance of automated checkpointing via a compiler analysis for incremental checkpointing. This analysis, which works with both sequential and OpenMP applications, reduces checkpoint sizes by as much as 80% and enables asynchronous checkpointing.

Authors:
Bronevetsky, G; Marques, D; Pingali, K; Rugina, R; McKee, S A
Publication Date:
2008-01-21
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
944289
Report Number(s):
LLNL-CONF-400662
TRN: US200902%%629
DOE Contract Number:  
W-7405-ENG-48
Resource Type:
Conference
Resource Relation:
Conference: Presented at: International Conference on Supercomputing, Kos, Greece, Jun 07 - Jun 12, 2008
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS; PERFORMANCE; COMPUTER CODES; SUPERCOMPUTERS; FAULT TOLERANT COMPUTERS

Citation Formats

Bronevetsky, G, Marques, D, Pingali, K, Rugina, R, and McKee, S A. Compiler-Enhanced Incremental Checkpointing for OpenMP Applications. United States: N. p., 2008. Web. doi:10.1109/IPDPS.2009.5160999.
Bronevetsky, G, Marques, D, Pingali, K, Rugina, R, & McKee, S A. Compiler-Enhanced Incremental Checkpointing for OpenMP Applications. United States. https://doi.org/10.1109/IPDPS.2009.5160999
Bronevetsky, G, Marques, D, Pingali, K, Rugina, R, and McKee, S A. 2008. "Compiler-Enhanced Incremental Checkpointing for OpenMP Applications". United States. https://doi.org/10.1109/IPDPS.2009.5160999. https://www.osti.gov/servlets/purl/944289.
@article{osti_944289,
title = {Compiler-Enhanced Incremental Checkpointing for OpenMP Applications},
author = {Bronevetsky, G and Marques, D and Pingali, K and Rugina, R and McKee, S A},
abstractNote = {As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety of causes. Checkpointing is a popular technique for tolerating such failures, enabling applications to periodically save their state and restart computation after a failure. Although a variety of automated system-level checkpointing solutions are currently available to HPC users, manual application-level checkpointing remains more popular due to its superior performance. This paper improves performance of automated checkpointing via a compiler analysis for incremental checkpointing. This analysis, which works with both sequential and OpenMP applications, reduces checkpoint sizes by as much as 80% and enables asynchronous checkpointing.},
doi = {10.1109/IPDPS.2009.5160999},
url = {https://www.osti.gov/biblio/944289},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2008},
month = {1}
}
