An Integrated Performance Visualizer for MPI/OpenMP Programs
Cluster computing has emerged as a de facto standard in parallel computing over the last decade. Researchers now use clusters of shared-memory multiprocessors (SMPs) to attack some of the largest and most complex scientific calculations in the world today [2, 1], running them on the world's largest machines, including the US DOE ASCI platforms: Red, Blue Mountain, Blue Pacific, and White. MPI has been the predominant programming model for clusters [3]; however, as users move to "wider" SMPs, the combination of MPI and threads is a natural fit for the underlying system design: MPI manages parallelism between SMPs, while threads manage parallelism within one SMP. OpenMP is emerging as a leading contender for managing this intra-SMP parallelism.

OpenMP and MPI offer their users very different characteristics. Developed for different memory models, they fill complementary needs in parallel programming: OpenMP was made for shared-memory systems, while MPI was made for distributed-memory systems; OpenMP was designed for explicit parallelism and implicit data movement, while MPI was designed for explicit data movement and implicit parallelism. These complementary characteristics make the two frameworks well suited to the two parallel environments a cluster presents: shared memory within a box and distributed memory between boxes. Unfortunately, simply writing OpenMP and MPI code does not guarantee efficient use of the underlying cluster hardware. What is more, existing tools provide performance information about either MPI or OpenMP, but not both. This lack of integration prevents users from identifying the critical path for performance in their applications; an integrated view would also help users adjust their performance expectations to their application's software design.
Once users decide to investigate their application's performance, they need detailed information about the cost of operations in the application. In MPI/OpenMP codes, the expensive operations are most likely related to message-passing activity and OpenMP regions. Viewed in this light, users need a performance analyzer that can expose the complex interactions of MPI and OpenMP. For message-passing codes, several performance analysis tools exist, including Vampir, TimeScan, and ParaGraph [make citations]. For OpenMP codes there is GuideView, along with a few proprietary tools from other vendors [make citations]. In practice, however, there is little production-quality support for the combination of MPI and OpenMP.
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: US Department of Energy (US)
- DOE Contract Number: W-7405-ENG-48
- OSTI ID: 15005336
- Report Number(s): UCRL-JC-142829; TRN: US200322%%378
- Resource Relation: Journal Volume: 2104; Conference: Workshop on OpenMP Applications and Tools, West Lafayette, IN (US), 30-31 Jul 2001; Other Information: PBD: 25 Feb 2001
- Country of Publication: United States
- Language: English
References
- Waiting time analysis and performance visualization in Carnival. Conference, January 1996.
- Very high resolution simulation of compressible turbulence on the IBM-SP system. Conference, January 1999.
- Visualizing the performance of parallel programs. Journal, September 1991.
- High-Performance Reactive Fluid Flow Simulations Using Adaptive Mesh Refinement on Thousands of Processors. Conference, January 2000.