OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Server-side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems

Abstract

Inter-application I/O contention and performance interference have been recognized as severe problems. In this work, we demonstrate, through measurements from Titan (the world's No. 3 supercomputer), that high I/O variance co-exists with the fact that individual storage units remain under-utilized for the majority of the time. This motivates us to propose AID, a system that performs automatic application I/O characterization and I/O-aware job scheduling. AID analyzes existing I/O traffic and batch job history logs, without any prior knowledge of applications or user/developer involvement. It identifies the small set of I/O-intensive candidates among all applications running on a supercomputer and subsequently mines their I/O patterns, using more detailed per-I/O-node traffic logs. Based on such auto-extracted information, AID provides online I/O-aware scheduling recommendations to steer I/O-intensive applications away from heavy ongoing I/O activities. We evaluate AID on Titan, using both real applications (with extracted I/O patterns validated by contacting users) and our own pseudo-applications. Our results confirm that AID is able to (1) identify I/O-intensive applications and their detailed I/O characteristics, and (2) significantly reduce these applications' I/O performance degradation/variance by jointly evaluating outstanding applications' I/O patterns and real-time system I/O load.

Authors:
 Liu, Y. [1]; Gunasekaran, Raghul [2]; Ma, Xiaosong [3]; Vazhkudai, Sudharshan S. [2]
  1. North Carolina State University (NCSU), Raleigh
  2. ORNL
  3. Qatar Computing Research Institute, Hamad Bin Khalifa University
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE
OSTI Identifier:
1338541
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, Salt Lake City, UT, USA, November 13, 2016
Country of Publication:
United States
Language:
English

Citation Formats

Liu, Y., Gunasekaran, Raghul, Ma, Xiaosong, and Vazhkudai, Sudharshan S. Server-side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems. United States: N. p., 2016. Web.
Liu, Y., Gunasekaran, Raghul, Ma, Xiaosong, & Vazhkudai, Sudharshan S. Server-side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems. United States.
Liu, Y., Gunasekaran, Raghul, Ma, Xiaosong, and Vazhkudai, Sudharshan S. 2016. "Server-side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems". United States.
@article{osti_1338541,
title = {Server-side Log Data Analytics for I/O Workload Characterization and Coordination on Large Shared Storage Systems},
author = {Liu, Y. and Gunasekaran, Raghul and Ma, Xiaosong and Vazhkudai, Sudharshan S},
abstractNote = {Inter-application I/O contention and performance interference have been recognized as severe problems. In this work, we demonstrate, through measurements from Titan (the world's No. 3 supercomputer), that high I/O variance co-exists with the fact that individual storage units remain under-utilized for the majority of the time. This motivates us to propose AID, a system that performs automatic application I/O characterization and I/O-aware job scheduling. AID analyzes existing I/O traffic and batch job history logs, without any prior knowledge of applications or user/developer involvement. It identifies the small set of I/O-intensive candidates among all applications running on a supercomputer and subsequently mines their I/O patterns, using more detailed per-I/O-node traffic logs. Based on such auto-extracted information, AID provides online I/O-aware scheduling recommendations to steer I/O-intensive applications away from heavy ongoing I/O activities. We evaluate AID on Titan, using both real applications (with extracted I/O patterns validated by contacting users) and our own pseudo-applications. Our results confirm that AID is able to (1) identify I/O-intensive applications and their detailed I/O characteristics, and (2) significantly reduce these applications' I/O performance degradation/variance by jointly evaluating outstanding applications' I/O patterns and real-time system I/O load.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2016},
month = {1}
}
