Access Patterns and Performance Behaviors of Multi-layer Supercomputer I/O Subsystems under Production Load
Scientific computing workloads at HPC facilities have been shifting from traditional numerical simulations to AI/ML applications for training and inference, while processing and producing ever-increasing amounts of scientific data. To address the growing need for increased storage capacity, lower access latency, and higher bandwidth, emerging technologies such as non-volatile memory are being integrated into supercomputer I/O subsystems. Given these trends, we need a better understanding of multi-layer supercomputer I/O subsystems and of ways to use them efficiently. In this work, we study the I/O access patterns and performance characteristics of two representative supercomputer I/O subsystems. Through an extensive analysis of year-long I/O logs on each system, we report new observations on I/O reads and writes, unbalanced use of storage system layers, and new trends in user behaviors at the HPC I/O middleware stack.
- Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- DOE Contract Number: AC02-05CH11231
- OSTI ID: 1959026
- Resource Relation: Conference: HPDC 2022 - Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, Minneapolis, MN (United States), 27 Jun - 1 Jul 2022
- Country of Publication: United States
- Language: English