OSTI.GOV U.S. Department of Energy
Office of Scientific and Technical Information

Title: Self-consistent MPI performance requirements.

Authors:
Traff, J. L.; Gropp, W.; Thakur, R.
Publication Date:
2007
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
973470
Report Number(s):
ANL/MCS/CP-59733
DOE Contract Number:
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Journal Name: Lecture Notes in Computer Science; Journal Volume: 4757; Journal Issue: 2007; Conference: EuroPVM/MPI 2007; Sep. 30, 2007 - Oct. 3, 2007; Paris, France
Country of Publication:
United States
Language:
English

Citation Formats

Traff, J. L., Gropp, W., and Thakur, R. Self-consistent MPI performance requirements. United States: N. p., 2007. Web. doi:10.1007/978-3-540-75416-9_12.
Traff, J. L., Gropp, W., & Thakur, R. Self-consistent MPI performance requirements. United States. doi:10.1007/978-3-540-75416-9_12.
Traff, J. L., Gropp, W., and Thakur, R. 2007. "Self-consistent MPI performance requirements". United States. doi:10.1007/978-3-540-75416-9_12.
@article{osti_973470,
title = {Self-consistent MPI performance requirements},
author = {Traff, J. L. and Gropp, W. and Thakur, R.},
abstractNote = {},
doi = {10.1007/978-3-540-75416-9_12},
journal = {Lecture Notes in Computer Science},
number = 2007,
volume = 4757,
place = {United States},
year = {2007},
month = {jan}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

Similar Records:
  • We recently introduced the idea of self-consistent performance requirements for MPI communication. Such requirements provide a means to ensure consistent behavior of an MPI library, thereby ensuring a degree of performance portability by making it unnecessary for a user to perform implementation-dependent optimizations by hand. For the collective operations in particular, a large number of such rules can sensibly be formulated without making hidden assumptions about the underlying communication system or otherwise constraining the MPI implementation. In this paper, we extend this idea to the realm of parallel I/O (MPI-IO), where the issues are far more subtle. In particular, it is not always possible to specify performance requirements without making assumptions about the implementation or without a priori knowledge of the I/O access pattern. For such cases, we introduce the notion of performance expectations, which specify the desired behavior for good implementations of MPI-IO. I/O performance requirements as well as expectations could be automatically checked by an appropriate benchmarking tool. (A minimal sketch of checking one such collective rule appears after this list.)
  • Previous studies demonstrated that Ethernet local area network traffic is statistically self-similar and that the commonly used Poisson models are not able to capture the fractal characteristics of Ethernet traffic. This contribution uses simulated self-similar traffic traces from the MITRE Corporation and Sandia's simulation software to evaluate the ABR performance of an ATM backbone. The ATM backbone interconnects Ethernet LANs via edge devices such as routers and bridges. We evaluate the overall network performance in terms of throughput, response time, fairness, and buffer requirements. Because typical edge devices perform simple forwarding functions, their usual mechanism for signaling network congestion is packet dropping. Therefore, we believe that the proper provisioning of buffer resources in ATM edge devices is crucial to the overall network performance.
  • In this paper we describe the difficulties inherent in making accurate, reproducible measurements of message-passing performance. We describe some of the mistakes often made in attempting such measurements and the consequences of such mistakes. We describe mpptest, a suite of performance measurement programs developed at Argonne National Laboratory, that attempts to avoid such mistakes and obtain reproducible measures of MPI performance that can be useful to both MPI implementers and MPI application writers. We include a number of illustrative examples of its use. (A ping-pong timing loop in the same careful spirit is sketched after this list.)
  • The relatively low performance of a petroleum reservoir model when it executes on a single workstation is one of the key motivating factors for exploiting high-performance computing on workstation clusters. Workstation clusters, connected through a Local Area Network, are at a stage where their effectiveness as a suitable configuration for high-performance parallel processing has already been established. This paper discusses the improvement in performance of an engineering application on a workstation cluster using the MPI (Message Passing Interface) software environment. The importance of this approach for many engineering and scientific applications is illustrated by the case study, which also provides a recommended porting methodology for similar applications.
  • A data parallel version of the 3-D transport solver in DANTSYS has been in use on the SIMD CM-200s at LANL since 1994. This version typically obtains grind times of 150-200 nanoseconds on a 2,048 PE CM-200. The authors have now implemented a new message-passing parallel version of DANTSYS, referred to as DANTSYS/MPI, on the 512 PE Cray T3D at Los Alamos. By taking advantage of the SPMD architecture of the Cray T3D, as well as its low-latency communications network, they have managed to achieve grind times of less than 10 nanoseconds on real problems. DANTSYS/MPI is fully accelerated using DSA on both the inner and outer iterations. This paper describes the implementation of DANTSYS/MPI on the Cray T3D and presents two simple performance models for the transport sweep which accurately predict the grind time as a function of the number of PEs and problem size, or scalability. (A worked grind-time arithmetic sketch appears below.)
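Following up the first abstract above: a typical self-consistent performance requirement is that a specialized collective such as MPI_Allreduce should never be slower than its obvious emulation by MPI_Reduce followed by MPI_Bcast. The sketch below checks that single rule in C/MPI; the buffer size, repetition count, and reporting policy are illustrative assumptions, not taken from the paper or its benchmarking tool.

/* Illustrative sketch only: check one assumed self-consistency rule,
 * MPI_Allreduce(n) <= MPI_Reduce(n) + MPI_Bcast(n).
 * Not the benchmarking tool described in the paper. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N    100000   /* elements per buffer (assumed) */
#define REPS 50       /* repetitions per measurement (assumed) */

static double time_allreduce(double *in, double *out, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    for (int r = 0; r < REPS; r++)
        MPI_Allreduce(in, out, N, MPI_DOUBLE, MPI_SUM, comm);
    return (MPI_Wtime() - t0) / REPS;
}

static double time_reduce_bcast(double *in, double *out, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    for (int r = 0; r < REPS; r++) {
        MPI_Reduce(in, out, N, MPI_DOUBLE, MPI_SUM, 0, comm);
        MPI_Bcast(out, N, MPI_DOUBLE, 0, comm);
    }
    return (MPI_Wtime() - t0) / REPS;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *in  = malloc(N * sizeof(double));
    double *out = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) in[i] = 1.0;

    MPI_Barrier(MPI_COMM_WORLD);
    double t_all  = time_allreduce(in, out, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    double t_emul = time_reduce_bcast(in, out, MPI_COMM_WORLD);

    /* A collective is only as fast as its slowest participant. */
    double all_max, emul_max;
    MPI_Reduce(&t_all,  &all_max,  1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&t_emul, &emul_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("MPI_Allreduce:        %g s\n", all_max);
        printf("MPI_Reduce+MPI_Bcast: %g s\n", emul_max);
        if (all_max > emul_max)
            printf("rule violated: MPI_Allreduce slower than its emulation\n");
    }

    free(in);
    free(out);
    MPI_Finalize();
    return 0;
}

The same pattern generalizes to other pairs of a specialized collective and its emulation, with one timing function per side of the rule.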
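In the same vein, the mpptest abstract cautions against naive measurement. Here is a hypothetical ping-pong latency sketch in that careful spirit (it is not mpptest itself; the message size, batch length, and min-of-trials policy are assumptions): timing whole batches of round trips and keeping the minimum over several trials avoids single-message clock-granularity errors and filters transient interference. It expects at least two MPI processes.

/* Hypothetical ping-pong latency sketch in the spirit of mpptest
 * (not mpptest itself). Times batches of round trips and keeps the
 * minimum over several trials to filter out transient interference. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define MSG_SIZE    1024   /* bytes per message (assumed) */
#define ROUND_TRIPS 1000   /* round trips per trial (assumed) */
#define TRIALS      10     /* trials; the minimum is reported (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char buf[MSG_SIZE];
    memset(buf, 0, MSG_SIZE);
    double best = 1e30;

    for (int t = 0; t < TRIALS; t++) {
        MPI_Barrier(MPI_COMM_WORLD);
        double start = MPI_Wtime();
        for (int i = 0; i < ROUND_TRIPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        /* one-way time = elapsed / (2 * round trips) */
        double one_way = (MPI_Wtime() - start) / (2.0 * ROUND_TRIPS);
        if (one_way < best)
            best = one_way;
    }

    if (rank == 0)
        printf("one-way latency for %d bytes: %g us\n", MSG_SIZE, best * 1e6);

    MPI_Finalize();
    return 0;
}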
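Finally, tying a number to the DANTSYS abstract: "grind time" in discrete-ordinates transport codes is commonly defined as wall-clock time per cell-angle-group-iteration unit of work. Under that assumed definition (the paper's two actual performance models are not reproduced here, and every input below is invented for illustration), the arithmetic is:

/* Assumed definition: grind time = wall time / (cells * angles * groups
 * * iterations). All inputs are invented for illustration; this is not
 * one of the paper's two performance models. */
#include <stdio.h>

int main(void)
{
    double wall_time_s = 12.0;              /* measured sweep time (assumed) */
    long   cells       = 100L * 100 * 100;  /* 3-D mesh (assumed)            */
    long   angles      = 48;                /* discrete ordinates (assumed)  */
    long   groups      = 30;                /* energy groups (assumed)       */
    long   iterations  = 10;                /* source iterations (assumed)   */

    double work = (double)cells * angles * groups * iterations;
    double grind_ns = wall_time_s / work * 1e9;

    printf("grind time: %.2f ns per cell-angle-group-iteration\n", grind_ns);
    return 0;
}

With these invented inputs the sketch prints roughly 0.83 ns, consistent in order of magnitude with the sub-10-nanosecond grind times the abstract reports for the Cray T3D.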