Title: Efficient data IO for a Parallel Global Cloud Resolving Model

Journal Article · Environmental Modelling & Software

Execution of a Global Cloud Resolving Model (GCRM) at target resolutions of 2-4 km will generate, at a minimum, tens of gigabytes of data per variable per snapshot. Writing this data to disk without creating a serious bottleneck in the execution of the GCRM code, while also supporting efficient post-execution data analysis, is a significant challenge. This paper discusses an Input/Output (IO) application programmer interface (API) for the GCRM that efficiently moves data from the model to disk while maintaining support for community-standard formats, avoiding the creation of very large numbers of files, and supporting efficient analysis. Several aspects of the API are discussed in detail. First, we discuss the output data layout, which linearizes the data in a consistent way that is independent of the number of processors used to run the simulation and provides a convenient format for subsequent analyses of the data. Second, we discuss the flexible API, which enables modelers to easily add variables to the output stream by specifying where in the GCRM code these variables are located, and to configure the choice of outputs and the distribution of data across files. This flexibility allows model developers to add new data fields to the output as the model develops and new physics is added, and it gives users of the GCRM code a mechanism for adjusting the output frequency and the number of fields written to suit the needs of individual calculations. Third, we describe the mapping to the NetCDF data model, with an emphasis on the grid description. Fourth, we describe the messaging algorithms and IO aggregation strategies used to achieve high bandwidth while writing concurrently from many processors to shared files. We conclude with initial performance results.
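
The idea of a linearized layout that does not depend on the processor count can be illustrated with a small sketch. This is not the GCRM API and does not reproduce the paper's actual layout or messaging algorithms. It assumes a simple contiguous block decomposition of a 1-D global cell index, and the names `block_partition` and `simulate_shared_write` are hypothetical. Each simulated process writes its locally owned cells at offsets determined only by their global index, so the resulting "file" is identical for any number of processes.

```python
def block_partition(n_cells, n_procs):
    """Split global cell indices 0..n_cells-1 into contiguous (start, stop) blocks.

    Assumption: a plain block decomposition, chosen only for illustration.
    """
    base, extra = divmod(n_cells, n_procs)
    blocks, start = [], 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional cell each.
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def simulate_shared_write(cell_value, n_cells, n_procs):
    """Simulate every process writing its cells into one shared, linearized file.

    `cell_value` is a hypothetical function giving the field value at a global
    cell index. The list `shared_file` stands in for a shared file on disk.
    """
    shared_file = [None] * n_cells
    for start, stop in block_partition(n_cells, n_procs):
        # Data held locally by one process.
        local = [cell_value(g) for g in range(start, stop)]
        # The write offset depends only on the global index, not on the rank.
        shared_file[start:stop] = local
    return shared_file


if __name__ == "__main__":
    value = lambda g: g * g
    one_proc = simulate_shared_write(value, 10, 1)
    four_procs = simulate_shared_write(value, 10, 4)
    print(one_proc == four_procs)
```

Because the position of each value in the output is a function of the global cell index alone, analysis tools can read the file without knowing how the simulation was decomposed.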

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Lab. (EMSL)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1033837
Report Number(s):
PNNL-SA-73200; 25621; KJ0403000; TRN: US201203%%116
Journal Information:
Environmental Modelling & Software, Vol. 26, Issue 12
Country of Publication:
United States
Language:
English