Interpreting Write Performance of Supercomputer I/O Systems with Regression Models
- Oak Ridge National Laboratory (ORNL)
- Carnegie Mellon University (CMU)
- Argonne National Laboratory (ANL)
- Duke University
- Sandia National Laboratories (SNL)
- Micron Technology Inc
This work seeks to advance the state of the art in HPC I/O performance analysis and interpretation. In particular, we demonstrate effective techniques to: (1) model output performance in the presence of I/O interference from production loads; (2) build features from write patterns and from key parameters of the system architecture and configuration; and (3) employ suitable machine learning algorithms to improve model accuracy. We train models with five popular regression algorithms and conduct experiments on two distinct production HPC platforms. We find that the lasso and random forest models predict output performance with high accuracy on both target systems. We also explore using the models to guide adaptation in I/O middleware systems. On both target systems, model-guided adaptation shows potential improvements of at least 15% on 70% of samples, and of up to 10× on some samples.
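As a rough illustration of the modeling idea in the abstract, the sketch below fits a lasso regression by coordinate descent and then uses the fitted model to choose between two candidate write configurations. It is not the authors' implementation. The feature names (`writers`, `burst_size`, `unrelated`), the synthetic data, the `alpha` value, and the helper names `fit_lasso` and `predict` are all assumptions made for this example.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso(rows, targets, alpha, n_iters=200):
    """Minimize (1/2n)*||y - Xw - b||^2 + alpha*||w||_1 by coordinate descent."""
    n, d = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(d)]
    y_mean = sum(targets) / n
    # Center the data so the intercept can be recovered after fitting.
    xc = [[r[j] - col_means[j] for j in range(d)] for r in rows]
    residual = [t - y_mean for t in targets]  # equals yc - xc @ weights
    col_sq = [sum(x[j] ** 2 for x in xc) / n for j in range(d)]
    weights = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(
                xc[i][j] * (residual[i] + weights[j] * xc[i][j]) for i in range(n)
            ) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * xc[i][j]
                weights[j] = new_w
    intercept = y_mean - sum(w * m for w, m in zip(weights, col_means))
    return weights, intercept

def predict(weights, intercept, row):
    return intercept + sum(w * x for w, x in zip(weights, row))

# Synthetic, normalized features (hypothetical): [writers, burst_size, unrelated].
# The made-up write time depends only on the first two features.
random.seed(0)
rows = [[random.random() for _ in range(3)] for _ in range(200)]
targets = [2.0 * r[0] + 0.5 * r[1] + random.gauss(0.0, 0.01) for r in rows]

weights, intercept = fit_lasso(rows, targets, alpha=0.01)

# Model-guided adaptation, reduced to its simplest form: among candidate
# configurations, pick the one with the lowest predicted write time.
candidates = [[0.2, 0.8, 0.5], [0.9, 0.1, 0.5]]
best = min(candidates, key=lambda c: predict(weights, intercept, c))
print("weights:", weights)
print("chosen configuration:", best)
```

The L1 penalty drives the weight of the unrelated feature to zero, so the fitted model also indicates which features matter. A random forest, the other model the abstract highlights, would need a different fitting routine and is not sketched here.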
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-00OR22725
- OSTI ID: 1809965
- Country of Publication: United States
- Language: English
Similar Records
- Interpreting Write Performance of Supercomputer I/O Systems with Machine Intelligence · Conference · Thu Dec 31 23:00:00 EST 2020 · OSTI ID: 1863761
- Applying Machine Learning to Understand Write Performance of Large-scale Parallel Filesystems · Conference · Fri Nov 01 00:00:00 EDT 2019 · OSTI ID: 1606822
- Characterizing output bottlenecks in a supercomputer · Conference · Sat Dec 31 23:00:00 EST 2011 · OSTI ID: 1063838