OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Visual Analytics Framework for the Detection of Anomalous Call Stack Trees in High Performance Computing Applications

Journal Article · IEEE Transactions on Visualization and Computer Graphics
 [1];  [2];  [1]
  1. Stony Brook Univ., Stony Brook, NY (United States)
  2. Brookhaven National Lab. (BNL), Upton, NY (United States)

Anomalous runtime behavior detection is one of the most important tasks for performance diagnosis in High Performance Computing (HPC). Most existing methods find anomalous executions based on the properties of individual functions, such as execution time. However, these properties alone are insufficient to identify abnormal behavior; the context of the executions, such as the invocations of child functions and the communications with other HPC nodes, must also be taken into account. We improve upon existing anomaly detection approaches by utilizing the call stack structures of the executions, which record rich temporal and contextual information. With our call stack tree (CSTree) representation of the executions, we formulate the anomaly detection problem as finding anomalous tree structures in a call stack forest. The CSTrees are converted to vector representations using our proposed stack2vec embedding. Structural and temporal visualizations of CSTrees are provided to support users in the identification and verification of the anomalies during an active anomaly detection process. Furthermore, three case studies of real-world HPC applications demonstrate the capabilities of our approach.
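The idea of turning call stack trees into vectors and then looking for outliers in a forest can be illustrated with a small sketch. This is not the stack2vec embedding from the article, whose details are not given in this record. It substitutes a simple bag-of-paths count vector and a distance-from-centroid score, and the names `CallNode`, `path_counts`, `embed`, and `anomaly_scores` are hypothetical, chosen only for this illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt
from typing import List


@dataclass
class CallNode:
    """One function invocation in a call stack tree (hypothetical structure)."""
    name: str
    children: List["CallNode"] = field(default_factory=list)


def path_counts(root: CallNode) -> Counter:
    """Count every root-to-node call path in one tree (bag-of-paths, not stack2vec)."""
    counts: Counter = Counter()
    stack = [(root, root.name)]
    while stack:
        node, path = stack.pop()
        counts[path] += 1
        for child in node.children:
            stack.append((child, path + ">" + child.name))
    return counts


def embed(trees: List[CallNode]):
    """Map each tree to a count vector over the shared, sorted path vocabulary."""
    per_tree = [path_counts(t) for t in trees]
    vocab = sorted({p for c in per_tree for p in c})
    vectors = [[c[p] for p in vocab] for c in per_tree]
    return vocab, vectors


def anomaly_scores(vectors: List[List[int]]) -> List[float]:
    """Score each vector by its Euclidean distance from the forest centroid."""
    n, dim = len(vectors), len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [sqrt(sum((v[i] - centroid[i]) ** 2 for i in range(dim))) for v in vectors]


def normal_tree() -> CallNode:
    """A typical execution: main calls compute, then send."""
    return CallNode("main", [CallNode("compute"), CallNode("send")])


# Four typical executions plus one where compute invokes retry twice.
odd_tree = CallNode(
    "main",
    [CallNode("compute", [CallNode("retry"), CallNode("retry")]), CallNode("send")],
)
trees = [normal_tree() for _ in range(4)] + [odd_tree]

vocab, vectors = embed(trees)
scores = anomaly_scores(vectors)
most_anomalous = scores.index(max(scores))
```

In this toy forest the tree with the extra `retry` calls receives the highest score, because its path counts differ from the other four. A real system would also need the temporal and communication context that the abstract describes, which this sketch leaves out.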

Research Organization:
Brookhaven National Lab. (BNL), Upton, NY (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Workforce Development for Teachers and Scientists (WDTS)
Grant/Contract Number:
SC0012704
OSTI ID:
1489354
Report Number(s):
BNL-210830-2018-JAAM
Journal Information:
IEEE Transactions on Visualization and Computer Graphics, Vol. 25, Issue 1; Conference: IEEE TVCG paper, given as an oral presentation at the IEEE VIS Conference; ISSN 1077-2626
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 22 works
Citation information provided by
Web of Science

Cited By (1)

TimeCluster: dimension reduction applied to temporal data for visual analytics
Journal Article · May 2019

Similar Records

Exploratory Visual Analysis of Anomalous Runtime Behavior in Streaming High Performance Computing Applications
Journal Article · Jun 8, 2019 · Lecture Notes in Computer Science · OSTI ID:1489354

Scalable Comparative Visualization of Ensembles of Call Graphs
Journal Article · Nov 19, 2021 · IEEE Transactions on Visualization and Computer Graphics · OSTI ID:1489354

UPC++
Software · May 1, 2014 · OSTI ID:1489354