OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Analysis and Modeling of the End-to-End I/O Performance on OLCF's Titan Supercomputer

Abstract

As leadership-class scientific computation and simulation applications grow in scale and complexity, understanding their I/O performance characteristics has become increasingly important. User-observed performance depends both on how the application uses the HPC facility and on how other users' activity introduces variability on top of the machine's static capabilities. Our work applies statistical analysis to I/O performance data gathered at fine time resolution over a full week on the Titan supercomputer. Based on the observed properties of the distribution of I/O latencies, we build a three-state hidden Markov model (HMM) to characterize the end-to-end I/O performance on Titan. We parameterize the model using part of the field-gathered I/O performance data and validate it against the rest. The validation results demonstrate that the model accurately captures the dynamics of end-to-end I/O performance on Titan.
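
The train-then-validate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the emission distributions, the parameter values, or the fitting procedure, so the Gaussian emissions, every number in `START`, `TRANS`, `MEANS`, and `STDS`, the synthetic data, and the single-Gaussian baseline are all assumptions made for illustration. The sketch does not estimate HMM parameters. It only scores a held-out sequence under a fixed three-state HMM using the scaled forward algorithm and compares the result with a baseline fitted on the training part.

```python
import math
import random

# Hypothetical three-state HMM over (log-)latencies. All values below are
# illustrative assumptions, not parameters reported in the paper.
STATES = 3
START = [0.8, 0.15, 0.05]        # initial state probabilities
TRANS = [[0.90, 0.08, 0.02],     # row i: P(next state | current state i)
         [0.20, 0.70, 0.10],
         [0.10, 0.30, 0.60]]
MEANS = [0.0, 1.5, 3.0]          # Gaussian emission mean per state
STDS = [0.3, 0.4, 0.5]           # Gaussian emission std dev per state


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sample_sequence(length, rng):
    """Draw a synthetic observation sequence from the model above."""
    state = rng.choices(range(STATES), weights=START)[0]
    obs = []
    for _ in range(length):
        obs.append(rng.gauss(MEANS[state], STDS[state]))
        state = rng.choices(range(STATES), weights=TRANS[state])[0]
    return obs


def hmm_log_likelihood(obs):
    """Scaled forward algorithm: returns log P(obs | model)."""
    alpha = [START[s] * gaussian_pdf(obs[0], MEANS[s], STDS[s])
             for s in range(STATES)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    log_lik = math.log(scale)
    for x in obs[1:]:
        # Propagate one step through TRANS, then weight by the emission density.
        alpha = [gaussian_pdf(x, MEANS[j], STDS[j])
                 * sum(alpha[i] * TRANS[i][j] for i in range(STATES))
                 for j in range(STATES)]
        scale = sum(alpha)               # rescale to avoid numerical underflow
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def single_gaussian_log_likelihood(train, test):
    """Baseline: fit one Gaussian on the training part, score the held-out part."""
    mean = sum(train) / len(train)
    std = math.sqrt(sum((x - mean) ** 2 for x in train) / len(train))
    return sum(math.log(gaussian_pdf(x, mean, std)) for x in test)


rng = random.Random(0)
data = sample_sequence(2000, rng)             # synthetic stand-in for field data
train, held_out = data[:1500], data[1500:]    # part for fitting, rest for validation
hmm_ll = hmm_log_likelihood(held_out)
baseline_ll = single_gaussian_log_likelihood(train, held_out)
print(f"held-out log-likelihood: HMM={hmm_ll:.1f}, single Gaussian={baseline_ll:.1f}")
```

A higher held-out log-likelihood for the HMM than for the baseline indicates that the state structure explains the validation data better than a single stationary distribution. With real measurements, the HMM parameters would first be estimated from the training portion, for example with Baum-Welch, which this sketch omits.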

Authors:
Wan, Lipeng [1]; Wolf, Matthew D. [1]; Wang, Feiyi [1]; Choi, Jong Youl [1]; Ostrouchov, George [1]; Klasky, Scott A. [1]
  1. ORNL
Publication Date:
2017-12-01
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
OSTI Identifier:
1569392
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: IEEE International Conference on High Performance Computing and Communications; IEEE International Conference on Smart City; IEEE International Conference on Data Science and Systems (HPCC/SmartCity/DSS) - Bangkok, Thailand - 12/18/2017 10:00:00 AM-12/20/2017 5:00:00 AM
Country of Publication:
United States
Language:
English

Citation Formats

Wan, Lipeng, Wolf, Matthew D., Wang, Feiyi, Choi, Jong Youl, Ostrouchov, George, and Klasky, Scott A. Analysis and Modeling of the End-to-End I/O Performance on OLCF's Titan Supercomputer. United States: N. p., 2017. Web. doi:10.1109/HPCC-SmartCity-DSS.2017.1.
Wan, Lipeng, Wolf, Matthew D., Wang, Feiyi, Choi, Jong Youl, Ostrouchov, George, & Klasky, Scott A. Analysis and Modeling of the End-to-End I/O Performance on OLCF's Titan Supercomputer. United States. https://doi.org/10.1109/HPCC-SmartCity-DSS.2017.1
Wan, Lipeng, Wolf, Matthew D., Wang, Feiyi, Choi, Jong Youl, Ostrouchov, George, and Klasky, Scott A. 2017. "Analysis and Modeling of the End-to-End I/O Performance on OLCF's Titan Supercomputer". United States. https://doi.org/10.1109/HPCC-SmartCity-DSS.2017.1. https://www.osti.gov/servlets/purl/1569392.
@article{osti_1569392,
title = {Analysis and Modeling of the End-to-End I/O Performance on OLCF's Titan Supercomputer},
author = {Wan, Lipeng and Wolf, Matthew D. and Wang, Feiyi and Choi, Jong Youl and Ostrouchov, George and Klasky, Scott A.},
abstractNote = {As leadership-class scientific computation and simulation applications grow in scale and complexity, understanding their I/O performance characteristics has become increasingly important. User-observed performance depends both on how the application uses the HPC facility and on how other users' activity introduces variability on top of the machine's static capabilities. Our work applies statistical analysis to I/O performance data gathered at fine time resolution over a full week on the Titan supercomputer. Based on the observed properties of the distribution of I/O latencies, we build a three-state hidden Markov model (HMM) to characterize the end-to-end I/O performance on Titan. We parameterize the model using part of the field-gathered I/O performance data and validate it against the rest. The validation results demonstrate that the model accurately captures the dynamics of end-to-end I/O performance on Titan.},
doi = {10.1109/HPCC-SmartCity-DSS.2017.1},
url = {https://www.osti.gov/biblio/1569392},
place = {United States},
year = {2017},
month = {12}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
