U.S. Department of Energy
Office of Scientific and Technical Information

Spatio-Temporal Analysis of HPC I/O and Connection Data

Conference

An HPC system consists of layered software and hardware for I/O and networking. System logs are a valuable resource for understanding system behavior, but analyzing logs maintained at different levels of the stack is non-trivial. Analyzing each log independently can lead to incomplete conclusions, because each log covers only part of the system. This work takes a comprehensive approach that combines logs from multiple layers and components to facilitate the detection of anomalous activity. The research aims to identify and predict potential performance bottlenecks in the HPC system by capturing temporal variation patterns in heterogeneous, high-dimensional, and non-linear log data. In this paper, we share our preliminary efforts on spatio-temporal analysis of HPC I/O and connection data, along with initial observations from the analysis of one-week HPC log data sets collected from one of the NERSC systems.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
USDOE Office of Science (SC)
OSTI ID:
1544245
Country of Publication:
United States
Language:
English

Similar Records

A model for optimizing file access patterns using spatio-temporal parallelism
Conference · Mon Dec 31 23:00:00 EST 2012 · OSTI ID:1154873

Design and implementation of I/O performance prediction scheme on HPC systems through large-scale log analysis
Journal Article · Wed May 17 00:00:00 EDT 2023 · Journal of Big Data · OSTI ID:1974087
