Achieving algorithmic resilience for temporal integration through spectral deferred corrections
Spectral deferred corrections (SDC) is an iterative approach for constructing higher-order-accurate numerical approximations of ordinary differential equations. SDC starts with an initial approximation of the solution defined at a set of Gaussian or spectral collocation nodes over a time interval and uses an iterative application of lower-order time discretizations applied to a correction equation to improve the solution at these nodes. Each deferred correction sweep increases the formal order of accuracy of the method up to the limit inherent in the accuracy defined by the collocation points. In this paper, we demonstrate that SDC is well suited to recovering from soft (transient) hardware faults in the data. A strategy where extra correction iterations are used to recover from soft errors and provide algorithmic resilience is proposed. Specifically, in this approach the iteration is continued until the residual (a measure of the error in the approximation) is small relative to the residual of the first correction iteration and changes slowly between successive iterations. Here, we demonstrate the effectiveness of this strategy for both canonical test problems and a comprehensive situation involving a mature scientific application code that solves the reacting Navier-Stokes equations for combustion research.
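The stopping rule described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes forward Euler as the low-order method, a scalar ODE, three equispaced nodes that include both endpoints, and a single hypothetical tolerance `rel_tol` used for both the "small relative to the first residual" and the "changes slowly" tests. The `fault` argument is likewise a hypothetical hook that perturbs one node value after a chosen sweep to mimic a soft error.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def integration_matrix(nodes):
    """S[m, j] = integral of the j-th Lagrange basis polynomial over [nodes[m], nodes[m+1]]."""
    S = np.zeros((len(nodes) - 1, len(nodes)))
    for j in range(len(nodes)):
        others = np.delete(nodes, j)
        basis = P.polyfromroots(others) / np.prod(nodes[j] - others)
        anti = P.polyint(basis)
        S[:, j] = P.polyval(nodes[1:], anti) - P.polyval(nodes[:-1], anti)
    return S

def sdc_step(f, y0, t0, dt, nodes, rel_tol=1e-8, max_sweeps=50, fault=None):
    """One SDC time step for scalar y' = f(t, y); f must accept NumPy arrays.

    nodes are scaled to [0, 1] with nodes[0] == 0 and nodes[-1] == 1.
    fault = (sweep, node_index, perturbation) is a hypothetical soft-error hook.
    Returns the end-of-step value and the number of sweeps performed.
    """
    t = t0 + dt * nodes
    h = np.diff(t)
    S = dt * integration_matrix(nodes)

    def residual(y):
        # Max-norm of y0 + integral of the interpolated f from t0 to each node, minus y.
        return np.max(np.abs(y0 + np.cumsum(S @ f(t, y)) - y[1:]))

    # Provisional solution from forward Euler.
    y = np.empty(len(nodes))
    y[0] = y0
    for m in range(len(h)):
        y[m + 1] = y[m] + h[m] * f(t[m], y[m])

    res_first = res_prev = None
    for sweep in range(1, max_sweeps + 1):
        f_old = f(t, y)
        quad = S @ f_old
        y_new = np.empty_like(y)
        y_new[0] = y0
        for m in range(len(h)):
            # Forward-Euler correction sweep.
            y_new[m + 1] = y_new[m] + h[m] * (f(t[m], y_new[m]) - f_old[m]) + quad[m]
        if fault is not None and sweep == fault[0]:
            y_new[fault[1]] += fault[2]  # inject the simulated soft error
        y = y_new
        res = residual(y)
        if res_first is None:
            res_first = res
        else:
            small = res <= rel_tol * res_first
            slow = abs(res - res_prev) <= rel_tol * res_first
            if small and slow:
                break
        res_prev = res
    return y[-1], sweep

nodes = np.array([0.0, 0.5, 1.0])
decay = lambda t, y: -y  # test problem y' = -y, y(0) = 1
y_clean, n_clean = sdc_step(decay, 1.0, 0.0, 0.5, nodes)
y_fault, n_fault = sdc_step(decay, 1.0, 0.0, 0.5, nodes, fault=(3, 2, 0.1))
```

In this toy setting the run with the injected perturbation simply performs extra sweeps until the residual test passes again, and both runs end at the same collocation solution to within the tolerance.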
 Authors:

Grout, Ray^{[1]};
Kolla, Hemanth^{[2]};
Minion, Michael^{[3]};
Bell, John^{[3]}
 [1] National Renewable Energy Lab. (NREL), Golden, CO (United States). Computational Science Center
 [2] Sandia National Lab. (SNL-CA), Livermore, CA (United States). Scalable Modeling and Analysis Dept.
 [3] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
 Publication Date:
 Report Number(s):
 NREL/JA-2C00-62926; NREL/JA-2C00-68888
Journal ID: ISSN 1559-3940; ark:/13030/qt7n03t51k
 Grant/Contract Number:
 AC02-05CH11231; AC36-08GO28308
 Type:
 Accepted Manuscript
 Journal Name:
 Communications in Applied Mathematics and Computational Science
 Additional Journal Information:
 Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 1559-3940
 Publisher:
 Mathematical Sciences Publishers
 Research Org:
 Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); National Renewable Energy Lab. (NREL), Golden, CO (United States); Sandia National Lab. (SNL-CA), Livermore, CA (United States)
 Sponsoring Org:
 USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING; SDC; resilience; time integration; deferred correction; exascale computing; combustion
 OSTI Identifier:
 1436145
Grout, Ray, Kolla, Hemanth, Minion, Michael, and Bell, John. Achieving algorithmic resilience for temporal integration through spectral deferred corrections. United States: N. p., 2017. Web. doi:10.2140/camcos.2017.12.25.
Grout, Ray, Kolla, Hemanth, Minion, Michael, & Bell, John. (2017). Achieving algorithmic resilience for temporal integration through spectral deferred corrections. United States. doi:10.2140/camcos.2017.12.25.
Grout, Ray, Kolla, Hemanth, Minion, Michael, and Bell, John. 2017. "Achieving algorithmic resilience for temporal integration through spectral deferred corrections". United States. doi:10.2140/camcos.2017.12.25. https://www.osti.gov/servlets/purl/1436145.
@article{osti_1436145,
title = {Achieving algorithmic resilience for temporal integration through spectral deferred corrections},
author = {Grout, Ray and Kolla, Hemanth and Minion, Michael and Bell, John},
abstractNote = {Spectral deferred corrections (SDC) is an iterative approach for constructing higher-order-accurate numerical approximations of ordinary differential equations. SDC starts with an initial approximation of the solution defined at a set of Gaussian or spectral collocation nodes over a time interval and uses an iterative application of lower-order time discretizations applied to a correction equation to improve the solution at these nodes. Each deferred correction sweep increases the formal order of accuracy of the method up to the limit inherent in the accuracy defined by the collocation points. In this paper, we demonstrate that SDC is well suited to recovering from soft (transient) hardware faults in the data. A strategy where extra correction iterations are used to recover from soft errors and provide algorithmic resilience is proposed. Specifically, in this approach the iteration is continued until the residual (a measure of the error in the approximation) is small relative to the residual of the first correction iteration and changes slowly between successive iterations. Here, we demonstrate the effectiveness of this strategy for both canonical test problems and a comprehensive situation involving a mature scientific application code that solves the reacting Navier-Stokes equations for combustion research.},
doi = {10.2140/camcos.2017.12.25},
journal = {Communications in Applied Mathematics and Computational Science},
number = 1,
volume = 12,
place = {United States},
year = {2017},
month = {5}
}