OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications

Abstract

Numerical reproducibility and stability of large scale scientific simulations, especially climate modeling, on distributed memory parallel computers are becoming critical issues. In particular, global summation of distributed arrays is most susceptible to rounding errors, and their propagation and accumulation cause uncertainty in final simulation results. We analyzed several accurate summation methods and found that two methods are particularly effective to improve (ensure) reproducibility and stability -- Kahan's self-compensated summation and Bailey's double-double precision summation. We provide an MPI operator MPI_SUMMDD to work with MPI collective operations to ensure a scalable implementation on a large number of processors. The final methods are particularly simple to adopt in practical codes.
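
As an illustration of the first method named in the abstract, the following is a minimal sketch of Kahan's compensated summation. It is written in Python for brevity and is not the authors' implementation, which targets MPI collective operations; the function names `naive_sum` and `kahan_sum` are hypothetical.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point summation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan compensated summation.

    A running compensation term captures the low-order bits that are
    lost each time a small value is added to a large running total.
    """
    total = 0.0
    comp = 0.0  # accumulated rounding error from earlier additions
    for x in values:
        y = x - comp            # apply the correction carried so far
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (negated)
        total = t
    return total


if __name__ == "__main__":
    # One large term followed by many terms too small to register
    # individually in a naive double-precision sum.
    data = [1.0] + [1e-16] * 10000
    print("naive:", naive_sum(data))
    print("kahan:", kahan_sum(data))
    print("fsum :", math.fsum(data))  # correctly rounded reference
```

In this example the naive sum never moves away from 1.0, because each small term is rounded away on its own, while the compensated sum tracks the correctly rounded reference closely. Reduced sensitivity to rounding is what makes a global sum less dependent on how the data are distributed across processors.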

Authors:
He, Yun (Helen); Ding, Chris H.Q.
Publication Date:
2000-05-01
Research Org.:
Lawrence Berkeley National Lab., CA (US)
Sponsoring Org.:
USDOE Director, Office of Science. Office of Biological and Environmental Research (US)
OSTI Identifier:
787080
Report Number(s):
LBNL-45040
Journal ID: ISSN 0920-8542; R&D Project: K11501; TRN: AH200134%%19
DOE Contract Number:  
AC03-76SF00098
Resource Type:
Journal Article
Journal Name:
Journal of Supercomputing
Additional Journal Information:
Journal Volume: 18; Journal Issue: 3; Other Information: Journal Publication Date: March 2001; PBD: 1 May 2000; Journal ID: ISSN 0920-8542
Country of Publication:
United States
Language:
English
Subject:
54 ENVIRONMENTAL SCIENCES; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; CLIMATE MODELS; COMPUTERIZED SIMULATION; ACCURACY; PARALLEL PROCESSING

Citation Formats

He, Yun, and Ding, Chris H.Q. Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications. United States: N. p., 2000. Web. doi:10.1145/335231.335253.
He, Yun, & Ding, Chris H.Q. (2000). Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications. United States. doi:10.1145/335231.335253.
He, Yun, and Ding, Chris H.Q. 2000. "Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications". United States. doi:10.1145/335231.335253.
@article{osti_787080,
title = {Using accurate arithmetics to improve numerical reproducibility and stability in parallel applications},
author = {He, Yun and Ding, Chris H.Q.},
abstractNote = {Numerical reproducibility and stability of large scale scientific simulations, especially climate modeling, on distributed memory parallel computers are becoming critical issues. In particular, global summation of distributed arrays is most susceptible to rounding errors, and their propagation and accumulation cause uncertainty in final simulation results. We analyzed several accurate summation methods and found that two methods are particularly effective to improve (ensure) reproducibility and stability -- Kahan's self-compensated summation and Bailey's double-double precision summation. We provide an MPI operator MPI_SUMMDD to work with MPI collective operations to ensure a scalable implementation on a large number of processors. The final methods are particularly simple to adopt in practical codes.},
doi = {10.1145/335231.335253},
journal = {Journal of Supercomputing},
issn = {0920-8542},
number = 3,
volume = 18,
place = {United States},
year = {2000},
month = {5}
}