OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: A Conceptual Framework for HPC Operational Data Analytics

Abstract

This paper provides a broad framework for understanding trends in Operational Data Analytics (ODA) for High-Performance Computing (HPC) facilities. The goal of ODA is to allow for the continuous monitoring, archiving, and analysis of near real-time performance data, providing immediately actionable information for multiple operational uses. In this work, we combine two models to provide a comprehensive HPC ODA framework: one is an evolutionary model of analytics capabilities that consists of four types, which are descriptive, diagnostic, predictive and prescriptive, while the other is a four-pillar model for energy-efficient HPC operations that covers facility, system hardware, system software, and applications. This new framework is then overlaid with a description of current development and production deployments of ODA within leading-edge HPC facilities. Finally, we perform a comprehensive survey of ODA works and classify them according to our framework, in order to demonstrate its effectiveness.

Authors:
 Netti, Alessio [1]; Shin, Woong [2]; Ott, Michael [3]; Wilde, Torsten [4]; Bates, Natalie [5]
  1. LRZ
  2. ORNL
  3. Leibniz Supercomputing Centre
  4. Hewlett Packard Enterprise
  5. Energy Efficient HPC Working Group
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
OSTI Identifier:
1820791
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: Energy Efficient HPC State of the Practice Workshop 2021 (CLUSTER 2021) - Portland, Oregon, United States of America - 9/7/2021
Country of Publication:
United States
Language:
English

Citation Formats

Netti, Alessio, Shin, Woong, Ott, Michael, Wilde, Torsten, and Bates, Natalie. A Conceptual Framework for HPC Operational Data Analytics. United States: N. p., 2021. Web.
Netti, Alessio, Shin, Woong, Ott, Michael, Wilde, Torsten, & Bates, Natalie. A Conceptual Framework for HPC Operational Data Analytics. United States.
Netti, Alessio, Shin, Woong, Ott, Michael, Wilde, Torsten, and Bates, Natalie. 2021. "A Conceptual Framework for HPC Operational Data Analytics". United States. https://www.osti.gov/servlets/purl/1820791.
@article{osti_1820791,
title = {A Conceptual Framework for HPC Operational Data Analytics},
author = {Netti, Alessio and Shin, Woong and Ott, Michael and Wilde, Torsten and Bates, Natalie},
abstractNote = {This paper provides a broad framework for understanding trends in Operational Data Analytics (ODA) for High-Performance Computing (HPC) facilities. The goal of ODA is to allow for the continuous monitoring, archiving, and analysis of near real-time performance data, providing immediately actionable information for multiple operational uses. In this work, we combine two models to provide a comprehensive HPC ODA framework: one is an evolutionary model of analytics capabilities that consists of four types, which are descriptive, diagnostic, predictive and prescriptive, while the other is a four-pillar model for energy-efficient HPC operations that covers facility, system hardware, system software, and applications. This new framework is then overlaid with a description of current development and production deployments of ODA within leading-edge HPC facilities. Finally, we perform a comprehensive survey of ODA works and classify them according to our framework, in order to demonstrate its effectiveness.},
doi = {},
url = {https://www.osti.gov/biblio/1820791},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2021},
month = {9}
}

