
Title: Optimizing I/O Forwarding Techniques for Extreme-Scale Event Tracing

Programming development tools are a vital component for understanding the behavior of parallel applications. Event tracing is a principal ingredient to these tools, but new and serious challenges place event tracing at risk on extreme-scale machines. As the quantity of captured events increases with concurrency, the additional data can overload the parallel file system and perturb the application being observed. In this work we present a solution for event tracing on extreme-scale machines. We enhance an I/O forwarding software layer to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system. Furthermore, we introduce a sophisticated write buffering capability to limit the impact. To validate the approach, we employ the Vampir tracing toolset using these new capabilities. Our results demonstrate that the approach increases the maximum traced application size by a factor of 5x to more than 200,000 processes.
 [1] ;  [1] ;  [2] ;  [2] ;  [3] ;  [1] ;  [2] ;  [2] ;  [1] ;  [3]
  1. Technische Universität Dresden
  2. Argonne National Laboratory (ANL)
  3. Oak Ridge National Laboratory (ORNL)
Resource Type:
Journal Article
Resource Relation:
Journal Name: Cluster Computing: The Journal of Networks, Software Tools and Applications; Journal Volume: 17; Journal Issue: 1
Research Org:
Oak Ridge National Laboratory (ORNL); Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org:
SC USDOE - Office of Science (SC)
Country of Publication:
United States
event tracing; I/O forwarding; atomic append