
Access Patterns and Performance Behaviors of Multi-layer Supercomputer I/O Subsystems under Production Load

Conference

Scientific computing workloads at HPC facilities have been shifting from traditional numerical simulations to AI/ML applications for training and inference, while processing and producing ever-increasing amounts of scientific data. To address the growing need for greater storage capacity, lower access latency, and higher bandwidth, emerging technologies such as non-volatile memory are being integrated into supercomputer I/O subsystems. Given these trends, we need a better understanding of multi-layer supercomputer I/O subsystems and of ways to use them efficiently. In this work, we study the I/O access patterns and performance characteristics of two representative supercomputer I/O subsystems. Through an extensive analysis of year-long I/O logs from each system, we report new observations on I/O reads and writes, unbalanced use of storage system layers, and new trends in user behavior at the HPC I/O middleware stack.

Research Organization:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1959026
Resource Relation:
Conference: HPDC 2022 - Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, Minneapolis, MN (United States), 27 Jun - 1 Jul 2022
Country of Publication:
United States
Language:
English
