Comprehensive Measurement and Analysis of the User-Perceived I/O Performance in a Production Leadership-Class Storage System
- ORNL
As the scale and intensity of the parallel I/O workloads generated by scientific applications running on high performance computing (HPC) facilities increase, understanding I/O dynamics, especially the root cause of I/O performance variability and degradation in HPC environments, has become critical to the HPC community. In this paper, we run extensive I/O measurement tests on a production leadership-class storage system to capture the performance variability of large-scale parallel I/O. Analyzing these results and their statistical correlations reveals valuable insights into the characteristics of the storage system and the root cause of I/O performance variability. Further, we leverage these findings to propose a refactored I/O middleware design that improves parallel I/O performance by optimizing data striping and placement. Our preliminary evaluation results demonstrate that the proposed approach can reduce the average per-process write latency by at least 80% and the maximum per-process write latency by at least 20%.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1474694
- Country of Publication:
- United States
- Language:
- English