OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: MPISH: a parallel shell for MPI programs

Abstract

While previous work has shown MPI to provide capabilities for system software, actual adoption has not widely occurred. We discuss process management shortcomings in MPI implementations and their impact on MPI usability for system software and management tasks. We introduce MPISH, a parallel shell designed to address these issues.
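To make the idea concrete, here is a minimal sketch, in plain C and MPI, of what a parallel shell can look like: rank 0 assembles a command line, broadcasts it, every rank runs it through its local shell, and exit statuses are gathered back for reporting. This is an illustration of the concept only, not MPISH's actual design; all names below are ours.

/* parsh.c - toy "parallel shell": every MPI rank runs the same command.
   Build: mpicc parsh.c -o parsh
   Run:   mpiexec -n 4 ./parsh hostname                               */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, size;
    char cmd[4096] = "";

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0 && argc < 2) {
        fprintf(stderr, "usage: parsh command [args...]\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Rank 0 assembles the command from its arguments... */
    if (rank == 0)
        for (int i = 1; i < argc; i++) {
            strncat(cmd, argv[i], sizeof(cmd) - strlen(cmd) - 2);
            strncat(cmd, " ", sizeof(cmd) - strlen(cmd) - 1);
        }

    /* ...and broadcasts it so every rank executes the same line. */
    MPI_Bcast(cmd, sizeof(cmd), MPI_CHAR, 0, MPI_COMM_WORLD);

    int status = system(cmd);   /* run through the local shell */

    /* Gather exit statuses so rank 0 can report per-rank failures. */
    int *codes = (rank == 0) ? malloc(size * sizeof(int)) : NULL;
    MPI_Gather(&status, 1, MPI_INT, codes, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < size; i++)
            if (codes[i] != 0)
                fprintf(stderr, "rank %d: exit status %d\n", i, codes[i]);
        free(codes);
    }
    MPI_Finalize();
    return 0;
}

A real parallel shell must also deal with the process management concerns the abstract raises, which this toy ignores.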

Authors:
Desai, M.; Lusk, A.; Bradshaw, R.; Lusk, E.
Publication Date:
2006-01-01
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
935908
Report Number(s):
ANL/MCS/CP-115256
TRN: US200816%%1044
DOE Contract Number:
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: International Parallel and Distributed Processing Symposium (IPDPS 2005); Apr. 3-8, 2005; Denver, CO
Country of Publication:
United States
Language:
English
Subject:
97; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; M CODES; DATA TRANSMISSION; MEMORY MANAGEMENT; IMPLEMENTATION

Citation Formats

Desai, M., Lusk, A., Bradshaw, R., and Lusk, E. MPISH: a parallel shell for MPI programs. United States: N. p., 2006. Web. doi:10.1109/IPDPS.2005.206.
Desai, M., Lusk, A., Bradshaw, R., & Lusk, E. MPISH: a parallel shell for MPI programs. United States. doi:10.1109/IPDPS.2005.206.
Desai, M., Lusk, A., Bradshaw, R., and Lusk, E. 2006. "MPISH: a parallel shell for MPI programs". United States. doi:10.1109/IPDPS.2005.206.
@article{osti_935908,
title = {MPISH: a parallel shell for MPI programs},
author = {Desai, M. and Lusk, A. and Bradshaw, R. and Lusk, E.},
abstractNote = {While previous work has shown MPI to provide capabilities for system software, actual adoption has not widely occurred. We discuss process management shortcomings in MPI implementations and their impact on MPI usability for system software and management tasks. We introduce MPISH, a parallel shell designed to address these issues.},
doi = {10.1109/IPDPS.2005.206},
place = {United States},
year = {2006},
month = {1}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

Similar Records:
  • The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) are growing in popularity because of their ability to provide a shared global address space that spans the memories of multiple compute nodes. However, taking advantage of UPC can require a large recoding effort for existing parallel applications. In this paper, we explore a new hybrid parallel programming model that combines MPI and UPC. This model allows MPI programmers incremental access to a greater amount of memory, enabling memory-constrained MPI codes to process larger data sets. In addition, the hybrid model offers UPC programmers an opportunity to create static UPC groups that are connected over MPI (the group-formation idea is sketched after this list). As we demonstrate, the use of such groups can significantly improve the scalability of locality-constrained UPC codes. This paper presents a detailed description of the hybrid model and demonstrates its effectiveness in two applications: a random access (RA) benchmark and the Barnes-Hut cosmological simulation. Experimental results indicate that the hybrid model can greatly enhance performance; using hybrid UPC groups that span two cluster nodes, RA performance increases by a factor of 1.33, and using groups that span four cluster nodes, Barnes-Hut experiences a twofold speedup at the expense of a 2% increase in code size.
  • Formal program verification often requires creating a model of the program and running it through a model-checking tool. However, this model-creation step is itself error prone, tedious, and difficult for someone not familiar with formal verification. In this paper, we describe a tool for verifying the correctness of MPI programs that does not require the creation of a model and instead works directly on the MPI program. Our tool uses the MPI profiling interface, PMPI, to trap MPI calls and hand over control of the MPI function execution to a scheduler (a minimal PMPI wrapper is sketched after this list). The scheduler verifies the correctness of the program by executing all 'relevant' interleavings of the program. The scheduler records an initial trace and replays its interleaving variants by using dynamic partial-order reduction. We describe the design and implementation of the tool and compare it with our previous work based on model checking.
  • µπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of µπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source code is available. The set of source-code interfaces supported by µπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, µπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source-code form. Low slowdowns are observed, owing to its purely discrete event style of execution and to the scalability and efficiency of the underlying parallel discrete event simulation engine, µsik. In the largest runs, µπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.
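For the first record above: reproducing the UPC half of the hybrid model requires a UPC compiler, but the MPI-level formation of static groups that it describes can be sketched with MPI_Comm_split alone. In this sketch each block of GROUP_SIZE consecutive ranks stands in for one UPC group with a shared address space, while the parent communicator remains available for traffic between groups. GROUP_SIZE and all identifiers are illustrative choices, not taken from the paper.

/* groups.c - sketch of static subgroup formation over MPI, standing in
   for the MPI side of the MPI+UPC hybrid model (UPC half omitted).   */
#include <mpi.h>
#include <stdio.h>

#define GROUP_SIZE 4   /* illustrative: ranks per "UPC group" */

int main(int argc, char **argv)
{
    int world_rank, group_rank;
    MPI_Comm group_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Consecutive ranks share a color, so groups are static and
       contiguous, like UPC groups spanning a few cluster nodes. */
    int color = world_rank / GROUP_SIZE;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group_comm);
    MPI_Comm_rank(group_comm, &group_rank);

    /* Work inside a group uses group_comm; group leaders
       (group_rank == 0) can still talk over MPI_COMM_WORLD,
       which is how the hybrid model connects groups over MPI. */
    int local = world_rank, sum = 0;
    MPI_Allreduce(&local, &sum, 1, MPI_INT, MPI_SUM, group_comm);
    printf("world %d -> group %d (local rank %d), group sum %d\n",
           world_rank, color, group_rank, sum);

    MPI_Comm_free(&group_comm);
    MPI_Finalize();
    return 0;
}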
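For the second record above: the PMPI profiling interface it builds on is part of the MPI standard. Every MPI function MPI_X has an equivalent entry point PMPI_X, so a tool can intercept calls by providing its own definition of MPI_X and forwarding to PMPI_X. The sketch below only logs each MPI_Send before forwarding; the paper's tool instead hands control to its verification scheduler at this interception point. The file name and message wording are ours.

/* trace_send.c - minimal PMPI interposition: log each MPI_Send, then
   forward to the real implementation via PMPI_Send. Link this into
   (or LD_PRELOAD ahead of) an unmodified MPI program.               */
#include <mpi.h>
#include <stdio.h>

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    int rank;
    PMPI_Comm_rank(comm, &rank);
    fprintf(stderr, "[rank %d] MPI_Send: %d element(s) to rank %d, tag %d\n",
            rank, count, dest, tag);
    /* A verification tool would pass control to its scheduler here
       before (or instead of) issuing the real send. */
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}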