Performance analysis of distributed applications using automatic classification of communication inefficiencies
The method and system described herein present a technique for performance analysis that helps users understand the communication behavior of their message passing applications. The method and system may automatically classify individual communication operations and reveal the cause of communication inefficiencies in the application. This classification allows the developer to focus quickly on the culprits of truly inefficient behavior, rather than manually foraging through massive amounts of performance data. Specifically, the method and system trace the message operations of Message Passing Interface (MPI) applications and then classify each individual communication event using a supervised learning technique: decision tree classification. The decision tree may be trained using microbenchmarks that demonstrate both efficient and inefficient communication. Because the method and system adapt to the target system's configuration through these microbenchmarks, they simultaneously automate the performance analysis process and improve classification accuracy. The method and system may improve the accuracy of performance analysis and dramatically reduce the amount of data that users must examine.
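The classification step described above can be illustrated with a minimal sketch. It is not the patented implementation: the two features (how long the receiver and the sender were blocked, in microseconds), the class labels `normal`, `late_send`, and `late_recv`, and all timing values are hypothetical stand-ins for what microbenchmarks and an MPI trace might yield. The tree is a small Gini-based decision tree written in plain Python.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def train(rows, labels, depth=0, max_depth=3):
    """Build a tree: leaves are label strings, inner nodes are tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    split = best_split(rows, labels)
    if split is None:  # every row has identical features
        return majority
    _, f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            train([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
            train([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))

def classify(tree, row):
    """Walk the tree until a leaf (a label string) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical microbenchmark results: (recv_blocked_us, send_blocked_us) -> label.
training_rows = [(2, 3), (4, 1), (3, 2),
                 (120, 2), (200, 4), (150, 1),
                 (3, 180), (1, 140), (4, 220)]
training_labels = ["normal"] * 3 + ["late_send"] * 3 + ["late_recv"] * 3
tree = train(training_rows, training_labels)

# Hypothetical traced events from an application run, reduced to a per-class summary.
trace = [(3, 2), (175, 3), (2, 160), (1, 4), (2, 2)]
summary = Counter(classify(tree, event) for event in trace)
print(summary)
```

Because the training rows come from microbenchmarks run on the target system, the learned thresholds reflect that system's own timings rather than fixed constants. The per-class summary at the end shows how a long trace could be reduced to a few counts for the developer to inspect.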
- Research Organization: The Regents of the Univ. of California, Oakland, CA (United States); Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: W-7405-ENG-48
- Assignee: The Regents of the University of California (Oakland, CA)
- Patent Number(s): 6,850,920
- Application Number: 09/922,355
- OSTI ID: 1175231
- Country of Publication: United States
- Language: English