skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Conceptual Framework for HPC Operational Data Analytics

Conference ·

This paper provides a broad framework for under- standing trends in Operational Data Analytics (ODA) for High- Performance Computing (HPC) facilities. The goal of ODA is to allow for the continuous monitoring, archiving, and analysis of near real-time performance data, providing immediately actionable information for multiple operational uses. In this work, we combine two models to provide a comprehensive HPC ODA framework: one is an evolutionary model of analytics capabilities that consists of four types, which are descriptive, diagnostic, predictive and prescriptive, while the other is a four- pillar model for energy-efficient HPC operations that covers facility, system hardware, system software, and applications. This new framework is then overlaid with a description of current development and production deployments of ODA within leading- edge HPC facilities. Finally, we perform a comprehensive survey of ODA works and classify them according to our framework, in order to demonstrate its effectiveness.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1820791
Resource Relation:
Conference: Energy Efficient HPC State of the Practice Workshop 2021 (CLUSTER 2021) - Portland, Oregon, United States of America - 9/7/2021 12:00:00 PM-9/7/2021 12:00:00 PM
Country of Publication:
United States
Language:
English

Related Subjects