DOE PAGES: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Characterizing Output Bottlenecks of a Production Supercomputer: Analysis and Implications

Abstract

This article studies the I/O write behaviors of the Titan supercomputer and its Lustre parallel file stores under production load. The results can inform the design, deployment, and configuration of file systems along with the design of I/O software in the application, operating system, and adaptive I/O libraries.

We propose a statistical benchmarking methodology to measure write performance across I/O configurations, hardware settings, and system conditions. Moreover, we introduce two relative measures to quantify the write-performance behaviors of hardware components under production load. In addition to designing experiments and benchmarking on Titan, we verify the experimental results on one real application and one real application I/O kernel, XGC and HACC IO, respectively. These two are representative and widely used to address the typical I/O behaviors of applications.

In summary, we find that Titan's I/O system is variable across the machine at fine time scales. This variability has two major implications. First, stragglers lessen the benefit of coupled I/O parallelism (striping). Peak median output bandwidths are obtained with parallel writes to many independent files, with no striping or write sharing of files across clients (compute nodes). I/O parallelism is most effective when the application, or its I/O libraries, distributes the I/O load so that each target stores files for multiple clients and each client writes files on multiple targets in a balanced way with minimal contention. Second, our results suggest that the potential benefit of dynamic adaptation is limited. In particular, it is not fruitful to attempt to identify "good locations" in the machine or in the file system: component performance is driven by transient load conditions and past performance is not a useful predictor of future performance. For example, we do not observe diurnal load patterns that are predictable.
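The abstract's two central ideas, writing many independent files (one per client, no sharing or striping) and summarizing repeated trials with a robust statistic such as the median, can be sketched as below. This is an illustrative sketch, not the paper's benchmark: the function names, worker counts, and block sizes are assumptions, and local threads stand in for compute-node clients.

```python
import os
import statistics
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK = b"x" * (1 << 20)  # 1 MiB write block


def write_own_file(outdir, rank, nblocks):
    """One 'client' writes nblocks MiB to a private file; returns bytes/sec."""
    path = os.path.join(outdir, f"out.{rank}")
    t0 = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(nblocks):
            f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())  # force the data to stable storage before timing
    return (nblocks * len(BLOCK)) / (time.perf_counter() - t0)


def median_aggregate_bandwidth(nworkers=4, nblocks=8, trials=5):
    """Repeat the parallel write and report the median across trials: a single
    run is dominated by transient load, so a robust summary is used instead."""
    samples = []
    with tempfile.TemporaryDirectory() as outdir:
        with ThreadPoolExecutor(max_workers=nworkers) as pool:
            for _ in range(trials):
                bws = list(
                    pool.map(
                        lambda r: write_own_file(outdir, r, nblocks),
                        range(nworkers),
                    )
                )
                # The slowest writer (straggler) bounds the aggregate rate.
                samples.append(min(bws) * nworkers)
    return statistics.median(samples)


if __name__ == "__main__":
    bw = median_aggregate_bandwidth()
    print(f"median aggregate write bandwidth: {bw / 1e6:.1f} MB/s")
```

Aggregating by the minimum per-worker bandwidth mirrors the straggler effect the abstract describes: when output is coupled (striped), the slowest component gates the whole write, which is why independent files fare better.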

Authors:
 Xie, Bing [1]; Oral, Sarp [1]; Zimmer, Christopher [1]; Chase, Jeffrey [2]; Choi, Jong Youl [1]; Dillow, David [3]; Klasky, Scott A. [1]; Lofstead, Gerald [4]; Podhorszki, Norbert [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  2. Duke Univ., Durham, NC (United States)
  3. None
  4. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date: February 5, 2020
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE Office of Energy Efficiency and Renewable Energy (EERE), Renewable Power Office. Wind Energy Technologies Office; National Science Foundation (NSF); USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1607202
Alternate Identifier(s):
OSTI ID: 1618106
Report Number(s):
SAND-2019-9925J
Journal ID: ISSN 1553-3077
Grant/Contract Number:  
AC05-00OR22725; AC04-94AL85000; CNS-1245997; NA0003525
Resource Type:
Accepted Manuscript
Journal Name:
ACM Transactions on Storage
Additional Journal Information:
Journal Volume: 15; Journal Issue: 4; Journal ID: ISSN 1553-3077
Publisher:
Association for Computing Machinery (ACM)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Xie, Bing, Oral, Sarp, Zimmer, Christopher, Chase, Jeffrey, Choi, Jong Youl, Dillow, David, Klasky, Scott A., Lofstead, Gerald, and Podhorszki, Norbert. Characterizing Output Bottlenecks of a Production Supercomputer: Analysis and Implications. United States: N. p., 2020. Web. doi:10.1145/3335205.
Xie, Bing, Oral, Sarp, Zimmer, Christopher, Chase, Jeffrey, Choi, Jong Youl, Dillow, David, Klasky, Scott A., Lofstead, Gerald, & Podhorszki, Norbert. Characterizing Output Bottlenecks of a Production Supercomputer: Analysis and Implications. United States. https://doi.org/10.1145/3335205
Xie, Bing, Oral, Sarp, Zimmer, Christopher, Chase, Jeffrey, Choi, Jong Youl, Dillow, David, Klasky, Scott A., Lofstead, Gerald, and Podhorszki, Norbert. 2020. "Characterizing Output Bottlenecks of a Production Supercomputer: Analysis and Implications". United States. https://doi.org/10.1145/3335205. https://www.osti.gov/servlets/purl/1607202.
@article{osti_1607202,
title = {Characterizing Output Bottlenecks of a Production Supercomputer: Analysis and Implications},
author = {Xie, Bing and Oral, Sarp and Zimmer, Christopher and Chase, Jeffrey and Choi, Jong Youl and Dillow, David and Klasky, Scott A. and Lofstead, Gerald and Podhorszki, Norbert},
abstractNote = {This article studies the I/O write behaviors of the Titan supercomputer and its Lustre parallel file stores under production load. The results can inform the design, deployment, and configuration of file systems along with the design of I/O software in the application, operating system, and adaptive I/O libraries. We propose a statistical benchmarking methodology to measure write performance across I/O configurations, hardware settings, and system conditions. Moreover, we introduce two relative measures to quantify the write-performance behaviors of hardware components under production load. In addition to designing experiments and benchmarking on Titan, we verify the experimental results on one real application and one real application I/O kernel, XGC and HACC IO, respectively. These two are representative and widely used to address the typical I/O behaviors of applications. In summary, we find that Titan's I/O system is variable across the machine at fine time scales. This variability has two major implications. First, stragglers lessen the benefit of coupled I/O parallelism (striping). Peak median output bandwidths are obtained with parallel writes to many independent files, with no striping or write sharing of files across clients (compute nodes). I/O parallelism is most effective when the application, or its I/O libraries, distributes the I/O load so that each target stores files for multiple clients and each client writes files on multiple targets in a balanced way with minimal contention. Second, our results suggest that the potential benefit of dynamic adaptation is limited. In particular, it is not fruitful to attempt to identify "good locations" in the machine or in the file system: component performance is driven by transient load conditions and past performance is not a useful predictor of future performance. For example, we do not observe diurnal load patterns that are predictable.},
doi = {10.1145/3335205},
journal = {ACM Transactions on Storage},
number = 4,
volume = 15,
place = {United States},
year = {2020},
month = {feb}
}

Works referenced in this record:

Understanding I/O workload characteristics of a Peta-scale storage system
journal, November 2014


Design implications for enterprise storage systems via multi-dimensional trace analysis
conference, January 2011

  • Chen, Yanpei; Srinivasan, Kiran; Goodson, Garth
  • Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles - SOSP '11
  • DOI: 10.1145/2043556.2043562

Parallel I/O performance: From events to ensembles
conference, April 2010

  • Uselton, Andrew; Howison, Mark; Wright, Nicholas J.
  • 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
  • DOI: 10.1109/IPDPS.2010.5470424

Terascale direct numerical simulations of turbulent combustion using S3D
journal, January 2009


VAXcluster: a closely-coupled distributed system
journal, May 1986

  • Kronenberg, Nancy P.; Levy, Henry M.; Strecker, William D.
  • ACM Transactions on Computer Systems (TOCS), Vol. 4, Issue 2
  • DOI: 10.1145/214419.214421

Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark
conference, November 2008

  • Shan, Hongzhang; Antypas, Katie; Shalf, John
  • 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2008.5222721

EDO: Improving Read Performance for Scientific Applications through Elastic Data Organization
conference, September 2011

  • Tian, Yuan; Klasky, Scott; Abbasi, Hasan
  • 2011 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2011.18

Understanding and Improving Computational Science Storage Access through Continuous Characterization
journal, October 2011

  • Carns, Philip; Harms, Kevin; Allcock, William
  • ACM Transactions on Storage, Vol. 7, Issue 3, p. 1-26
  • DOI: 10.1145/2027066.2027068

Machine Learning Predictions of Runtime and IO Traffic on High-End Clusters
conference, September 2016

  • McKenna, Ryan; Herbein, Stephen; Moody, Adam
  • 2016 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2016.58

Enhancing I/O throughput via efficient routing and placement for large-scale parallel file systems
conference, November 2011

  • Dillow, David A.; Shipman, Galen M.; Oral, Sarp
  • 2011 IEEE 30th International Performance Computing and Communications Conference (IPCCC)
  • DOI: 10.1109/PCCC.2011.6108062

Hello ADIOS: the challenges and lessons of developing leadership class I/O frameworks
journal, August 2013

  • Liu, Qing; Logan, Jeremy; Tian, Yuan
  • Concurrency and Computation: Practice and Experience, Vol. 26, Issue 7
  • DOI: 10.1002/cpe.3125

Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction
conference, November 2014

  • Dorier, Matthieu; Ibrahim, Shadi; Antoniu, Gabriel
  • SC14: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2014.56

Spontaneous rotation sources in a quiescent tokamak edge plasma
journal, June 2008


24/7 Characterization of petascale I/O workloads
conference, August 2009

  • Carns, Philip; Latham, Robert; Ross, Robert
  • 2009 IEEE International Conference on Cluster Computing and Workshops
  • DOI: 10.1109/CLUSTR.2009.5289150

Managing Variability in the IO Performance of Petascale Storage Systems
conference, November 2010

  • Lofstead, Jay; Zheng, Fang; Liu, Qing
  • 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC '10)
  • DOI: 10.1109/SC.2010.32

A Multiplatform Study of I/O Behavior on Petascale Supercomputers
conference, January 2015

  • Luu, Huong; Winslett, Marianne; Gropp, William
  • Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '15
  • DOI: 10.1145/2749246.2749269

Adaptable, metadata rich IO methods for portable high performance IO
conference, May 2009

  • Lofstead, Jay; Zheng, Fang; Klasky, Scott
  • 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
  • DOI: 10.1109/IPDPS.2009.5161052

Predicting Output Performance of a Petascale Supercomputer
conference, January 2017

  • Xie, Bing; Huang, Yezhou; Chase, Jeffrey S.
  • Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '17
  • DOI: 10.1145/3078597.3078614