Title: Tuning HDF5 subfiling performance on parallel file systems

Abstract

Subfiling is a technique used on parallel file systems to reduce locking and contention issues when multiple compute nodes interact with the same storage target node. Subfiling provides a compromise between the single-shared-file approach, which causes lock contention on parallel file systems, and the one-file-per-process approach, which generates a massive and unmanageable number of files. In this paper, we evaluate and tune the performance of the recently implemented subfiling feature in HDF5. Specifically, we explain the implementation strategy of the subfiling feature in HDF5, provide examples of using the feature, and evaluate and tune the parallel I/O performance of this feature on the parallel file systems of the Cray XC40 system at NERSC (Cori), which include burst buffer storage and Lustre disk-based storage. We also evaluate I/O performance on the Cray XC30 system at NERSC, Edison. Our results show a 1.2X to 6X performance advantage with subfiling compared to writing a single shared HDF5 file. We present our exploration of configurations, such as the number of subfiles and the number of Lustre storage targets used to store files, as optimization parameters for obtaining superior I/O performance. Based on this exploration, we discuss recommendations for achieving good I/O performance as well as limitations of using the subfiling feature.
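To make the trade-off described in the abstract concrete, the following minimal sketch groups process ranks into subfiles. It is an illustration only and not the HDF5 subfiling implementation or API: the function names, the block-wise rank assignment, and the `subfile_N.h5` file names are all assumptions made for this example. With one subfile it degenerates to the single-shared-file case, and with as many subfiles as ranks it degenerates to one file per process.

```python
from collections import defaultdict
from typing import Dict, List


def subfile_for_rank(rank: int, num_ranks: int, num_subfiles: int) -> int:
    """Map a process rank to a subfile index (hypothetical block-wise scheme)."""
    if not 0 <= rank < num_ranks:
        raise ValueError("rank out of range")
    if not 1 <= num_subfiles <= num_ranks:
        raise ValueError("num_subfiles must be between 1 and num_ranks")
    # Integer arithmetic assigns contiguous blocks of ranks to each subfile
    # and keeps the block sizes as even as possible.
    return rank * num_subfiles // num_ranks


def group_ranks(num_ranks: int, num_subfiles: int) -> Dict[int, List[int]]:
    """Return, for each subfile index, the list of ranks that write to it."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for rank in range(num_ranks):
        groups[subfile_for_rank(rank, num_ranks, num_subfiles)].append(rank)
    return dict(groups)


if __name__ == "__main__":
    # Eight ranks sharing two subfiles; the file names are illustrative only.
    for index, ranks in group_ranks(8, 2).items():
        print(f"subfile_{index}.h5 <- ranks {ranks}")
```

In a scheme like this, the number of subfiles is the tuning knob: fewer subfiles means more ranks contend for each file, while more subfiles means more files to manage.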

Authors:
Byna, Suren [1]; Chaarawi, Mohamad [2]; Koziol, Quincey [1]; Mainzer, John [3]; Willmore, Frank [3]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Intel Corp. (United States)
  3. The HDF Group (United States)
Publication Date:
2017-05-12
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1398484
DOE Contract Number:  
AC02-05CH11231
Resource Type:
Conference
Resource Relation:
Conference: Cray User Group Meeting, Redmond, WA (United States), 8-11 May 2017
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Byna, Suren, Chaarawi, Mohamad, Koziol, Quincey, Mainzer, John, and Willmore, Frank. Tuning HDF5 subfiling performance on parallel file systems. United States: N. p., 2017. Web.
Byna, Suren, Chaarawi, Mohamad, Koziol, Quincey, Mainzer, John, & Willmore, Frank. Tuning HDF5 subfiling performance on parallel file systems. United States.
Byna, Suren, Chaarawi, Mohamad, Koziol, Quincey, Mainzer, John, and Willmore, Frank. 2017. "Tuning HDF5 subfiling performance on parallel file systems". United States. https://www.osti.gov/servlets/purl/1398484.
@article{osti_1398484,
title = {Tuning HDF5 subfiling performance on parallel file systems},
author = {Byna, Suren and Chaarawi, Mohamad and Koziol, Quincey and Mainzer, John and Willmore, Frank},
place = {United States},
year = {2017},
month = {May}
}
