An Integrated Performance Visualizer for MPI/OpenMP Programs
Cluster computing has emerged as a de facto standard in parallel computing over the last decade. Researchers now use clusters of shared-memory multiprocessors (SMPs) to attack some of the largest and most complex scientific calculations in the world today [2, 1], running them on the world's largest machines, including the US DOE ASCI platforms: Red, Blue Mountain, Blue Pacific, and White. MPI has been the predominant programming model for clusters [3]; however, as users move to "wider" SMPs, the combination of MPI and threads is a natural fit for the underlying system design: MPI manages parallelism between SMPs, while threads manage parallelism within one SMP. OpenMP is emerging as a leading contender for managing this intra-SMP parallelism.

OpenMP and MPI offer their users very different characteristics. Developed for different memory models, they fill complementary needs in parallel programming: OpenMP was made for shared-memory systems, while MPI was made for distributed-memory systems; OpenMP was designed for explicit parallelism and implicit data movement, while MPI was designed for explicit data movement and implicit parallelism. These complementary characteristics make the two frameworks well suited to the two parallel environments a cluster presents: shared memory within a box and distributed memory between boxes. Unfortunately, simply writing OpenMP and MPI code does not guarantee efficient use of the underlying cluster hardware. What is more, existing tools provide performance information about either MPI or OpenMP, but not both. This lack of integration prevents users from identifying the critical path for performance in their applications; an integrated view would also help users adjust their performance expectations to their application's software design.
Once users decide to investigate their application's performance, they need detailed information about the cost of operations in the application. In MPI/OpenMP codes, the expensive operations are most likely related to message-passing activity and OpenMP regions. Viewed in this light, users need a performance analyzer that can expose the complex interactions of MPI and OpenMP. For message-passing codes, several performance analysis tools exist, including Vampir, TimeScan, and ParaGraph [make citations]. For OpenMP codes there is GuideView, along with a few proprietary tools from other vendors [make citations]. In practice, however, there is little production-quality support for the combination of MPI and OpenMP.
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: US Department of Energy (US)
- DOE Contract Number: W-7405-ENG-48
- OSTI ID: 15005336
- Report Number(s): UCRL-JC-142829; TRN: US200322%%378
- Resource Relation: Journal Volume: 2104; Conference: Workshop on OpenMP Applications and Tools, West Lafayette, IN (US), 30-31 Jul 2001; Other Information: PBD: 25 Feb 2001
- Country of Publication: United States
- Language: English
References
- Waiting time analysis and performance visualization in Carnival. Conference, January 1996.
- Very high resolution simulation of compressible turbulence on the IBM-SP system. Conference, January 1999.
- Visualizing the performance of parallel programs. Journal, September 1991.
- High-Performance Reactive Fluid Flow Simulations Using Adaptive Mesh Refinement on Thousands of Processors. Conference, January 2000.