OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Fast Classification of MPI Applications Using Lamport Logical Clocks

Authors:
 Tong, Zhou [1]; Pakin, Scott D. [2]; Lang, Michael Kenneth [2]
  1. Florida State University
  2. Los Alamos National Laboratory
Publication Date:
2014-07-31
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
DOE/LANL
OSTI Identifier:
1148960
Report Number(s):
LA-UR-14-26028
DOE Contract Number:
AC52-06NA25396
Resource Type:
Conference
Resource Relation:
Conference: LANL Mini-showcase; 2014-07-31; Los Alamos, New Mexico, United States
Country of Publication:
United States
Language:
English
Subject:
Mathematics & Computing (97); Computer Science

Citation Formats

Tong, Zhou, Pakin, Scott D., and Lang, Michael Kenneth. Fast Classification of MPI Applications Using Lamport Logical Clocks. United States: N. p., 2014. Web.
Tong, Zhou, Pakin, Scott D., & Lang, Michael Kenneth. Fast Classification of MPI Applications Using Lamport Logical Clocks. United States.
Tong, Zhou, Pakin, Scott D., and Lang, Michael Kenneth. "Fast Classification of MPI Applications Using Lamport Logical Clocks". United States. 2014. https://www.osti.gov/servlets/purl/1148960.
@article{osti_1148960,
title = {Fast Classification of MPI Applications Using Lamport Logical Clocks},
author = {Tong, Zhou and Pakin, Scott D. and Lang, Michael Kenneth},
place = {United States},
year = {2014},
month = {jul},
url = {https://www.osti.gov/servlets/purl/1148960}
}

Other Availability:
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
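The full text is not reproduced on this record page. As background for the technique named in the title, a Lamport logical clock can be sketched in a few lines; this is a generic textbook illustration, not the authors' implementation:

```python
# Minimal sketch of Lamport logical clocks (generic illustration).
# Each process keeps an integer counter: it increments on local
# events and sends, and on receive jumps to max(local, received) + 1,
# so timestamps respect the happened-before ordering of events.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock."""
        self.time += 1
        return self.time

    def send(self):
        """Sending counts as an event; stamp the outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """On receive, jump past the sender's timestamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time

# Two processes exchange one message.
p0, p1 = LamportClock(), LamportClock()
p0.tick()              # p0 local event -> clock 1
stamp = p0.send()      # p0 send        -> clock 2, message stamped 2
t = p1.receive(stamp)  # p1 receive     -> max(0, 2) + 1 = 3
print(t)               # 3
```

Replaying an MPI trace with such clocks orders events without relying on wall-clock time, which is what makes classification fast and machine-independent.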

Similar Records:
  • Typical large-scale scientific applications periodically write checkpoint files to save the computational state throughout execution. Existing parallel file systems improve such write-only I/O patterns through the use of client-side file caching and write-behind strategies. In distributed environments where files are rarely accessed by more than one client concurrently, file caching has achieved significant success; however, in parallel applications where multiple clients manipulate a shared file, cache coherence control can serialize I/O. We have designed a thread-based caching layer for the MPI I/O library, which adds a portable caching system closer to user applications so more information about the application's I/O patterns is available for better coherence control. We demonstrate the impact of our caching solution on parallel write performance with a comprehensive evaluation that includes a set of widely used I/O benchmarks and production application I/O kernels.
  • No abstract prepared.
  • We present a technique for performance analysis that helps users understand the communication behavior of their message passing applications. Our method automatically classifies individual communication operations and reveals the cause of communication inefficiencies in the application. This classification allows the developer to focus quickly on the culprits of truly inefficient behavior, rather than manually foraging through massive amounts of performance data. Specifically, we trace the message operations of MPI applications and then classify each individual communication event using decision tree classification, a supervised learning technique. We train our decision tree using microbenchmarks that demonstrate both efficient and inefficient communication. Since our technique adapts to the target system's configuration through these microbenchmarks, we can simultaneously automate the performance analysis process and improve classification accuracy. Our experiments on four applications demonstrate that our technique can improve the accuracy of performance analysis and dramatically reduce the amount of data that users must encounter.
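The write-behind idea in the first similar record above can be sketched minimally: buffer dirty writes on the client and push them to the backing store in batches, turning many small writes into few large ones. This is a generic illustration under assumed names (`WriteBehindCache`, `flush_threshold`), not the MPI-IO caching layer that abstract describes:

```python
# Minimal write-behind cache sketch for write-only checkpoint traffic.
# Writes accumulate client-side; once flush_threshold entries are
# dirty, they are written back to the backing store in one batch.

class WriteBehindCache:
    def __init__(self, backing, flush_threshold):
        self.backing = backing            # dict: offset -> bytes
        self.flush_threshold = flush_threshold
        self.pending = {}                 # dirty data not yet written back
        self.flushes = 0                  # count of batched write-backs

    def write(self, offset, data):
        self.pending[offset] = data
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.backing.update(self.pending)  # one batched write-back
            self.pending.clear()
            self.flushes += 1

store = {}
cache = WriteBehindCache(store, flush_threshold=4)
for i in range(8):
    cache.write(i * 4096, b"checkpoint block %d" % i)
cache.flush()  # drain any remainder at checkpoint end
print(cache.flushes, len(store))  # 8 small writes became 2 write-backs
```

A real MPI-IO layer must additionally keep caches coherent across clients sharing a file, which is exactly the serialization problem that abstract targets.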
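The decision-tree approach in the last similar record can be illustrated with a toy one-level tree (a decision stump) trained on hypothetical microbenchmark data. The feature ("fraction of call time spent blocked") and the labels here are invented for illustration; the real tool learns full decision trees from system-specific microbenchmarks:

```python
# Toy decision stump: learn a threshold on one feature that best
# separates efficient (0) from inefficient (1) communication events.

def learn_stump(samples):
    """samples: list of (feature, label), label in {0, 1}.
    Return the threshold minimizing errors when predicting
    'inefficient' for feature >= threshold."""
    best_thr, best_err = None, float("inf")
    for thr in sorted({f for f, _ in samples}):
        err = sum((f >= thr) != bool(lbl) for f, lbl in samples)
        if err < best_err:
            best_thr, best_err = thr, err
    return best_thr

# Hypothetical microbenchmark training data:
# (fraction of call time spent blocked, 1 = inefficient).
train = [(0.05, 0), (0.10, 0), (0.15, 0), (0.60, 1), (0.75, 1), (0.90, 1)]
thr = learn_stump(train)

def classify(wait_fraction):
    return "inefficient" if wait_fraction >= thr else "efficient"

print(classify(0.08))  # efficient
print(classify(0.80))  # inefficient
```

Because the threshold is learned from microbenchmarks run on the target system, the same classifier adapts automatically to different machines, which is the adaptivity claim in the abstract.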