Nonuniformly communicating noncontiguous data: a case study with PETSc and MPI.
Abstract
Due to the complexity associated with developing parallel applications, scientists and engineers rely on high-level software libraries such as PETSc, ScaLAPACK and PESSL to ease this task. Such libraries assist developers by providing abstractions for mathematical operations, data representation and management of parallel layouts of the data, while internally using communication libraries such as MPI and PVM. With high-level libraries managing data layout and communication internally, it can be expected that they organize application data suitably for performing the library operations optimally. However, this places additional overhead on the underlying communication library by making the data layout noncontiguous in memory and communication volumes (data transferred by a process to each of the other processes) nonuniform. In this paper, we analyze the overheads associated with these two aspects (noncontiguous data layouts and nonuniform communication volumes) in the context of the PETSc software toolkit over the MPI communication library. We describe the issues with the current approaches used by MPICH2 (an implementation of MPI), propose different approaches to handle these issues and evaluate these approaches with micro-benchmarks as well as an application over the PETSc software library. Our experimental results demonstrate close to an order of magnitude improvement in the performance of a 3-D Laplacian multi-grid solver application when evaluated on a 128 processor cluster.
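The noncontiguous layouts discussed in the abstract arise, for example, when a process must send a column of a row-major matrix: the elements are separated by a fixed stride in memory. A minimal sketch (not taken from the paper; the function name and parameters are illustrative) of gathering such a strided layout into a contiguous buffer, which is the manual equivalent of what an MPI derived datatype such as `MPI_Type_vector(count, blocklen, stride, ...)` describes to the library:

```python
# Illustrative sketch only: manual packing of a noncontiguous, strided
# layout into a contiguous buffer, mimicking what an MPI derived
# datatype (e.g. MPI_Type_vector) lets the MPI library do internally.

def pack_strided(data, offset, count, stride):
    """Gather `count` elements from `data`, starting at `offset`,
    spaced `stride` elements apart (a noncontiguous access pattern)."""
    return [data[offset + i * stride] for i in range(count)]

# A 4x5 row-major matrix flattened into one buffer; column 2 is
# noncontiguous in memory (stride of 5 elements between rows).
matrix = list(range(20))
column = pack_strided(matrix, offset=2, count=4, stride=5)
print(column)  # [2, 7, 12, 17]
```

The cost of this pack/unpack step (or of the equivalent datatype processing inside the MPI implementation) is precisely the kind of overhead the paper analyzes.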
- Authors:
- Balaji, P.; Buntinas, D.; Balay, S.; Smith, B.; Thakur, R.; Gropp, W.
- Publication Date:
- 2007
- Research Org.:
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 971458
- Report Number(s):
- ANL/MCS/CP-58614
TRN: US201004%%18
- DOE Contract Number:
- DE-AC02-06CH11357
- Resource Type:
- Conference
- Resource Relation:
- Conference: 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007).; Mar. 26, 2007 - Mar. 30, 2007; Long Beach, CA
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICAL METHODS AND COMPUTING; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; DATA TRANSMISSION; PARALLEL PROCESSING; MEMORY MANAGEMENT; EVALUATION; PERFORMANCE
Citation Formats
Balaji, P., Buntinas, D., Balay, S., Smith, B., Thakur, R., Gropp, W., and Mathematics and Computer Science. Nonuniformly communicating noncontiguous data: a case study with PETSc and MPI. United States: N. p., 2007.
Web. doi:10.1109/IPDPS.2007.370223.
Balaji, P., Buntinas, D., Balay, S., Smith, B., Thakur, R., Gropp, W., & Mathematics and Computer Science. Nonuniformly communicating noncontiguous data: a case study with PETSc and MPI. United States. doi:10.1109/IPDPS.2007.370223.
Balaji, P., Buntinas, D., Balay, S., Smith, B., Thakur, R., Gropp, W., and Mathematics and Computer Science. 2007.
"Nonuniformly communicating noncontiguous data: a case study with PETSc and MPI." United States.
doi:10.1109/IPDPS.2007.370223.
@article{osti_971458,
title = {Nonuniformly communicating noncontiguous data: a case study with PETSc and MPI.},
author = {Balaji, P. and Buntinas, D. and Balay, S. and Smith, B. and Thakur, R. and Gropp, W. and Mathematics and Computer Science},
abstractNote = {Due to the complexity associated with developing parallel applications, scientists and engineers rely on high-level software libraries such as PETSc, ScaLAPACK and PESSL to ease this task. Such libraries assist developers by providing abstractions for mathematical operations, data representation and management of parallel layouts of the data, while internally using communication libraries such as MPI and PVM. With high-level libraries managing data layout and communication internally, it can be expected that they organize application data suitably for performing the library operations optimally. However, this places additional overhead on the underlying communication library by making the data layout noncontiguous in memory and communication volumes (data transferred by a process to each of the other processes) nonuniform. In this paper, we analyze the overheads associated with these two aspects (noncontiguous data layouts and nonuniform communication volumes) in the context of the PETSc software toolkit over the MPI communication library. We describe the issues with the current approaches used by MPICH2 (an implementation of MPI), propose different approaches to handle these issues and evaluate these approaches with micro-benchmarks as well as an application over the PETSc software library. Our experimental results demonstrate close to an order of magnitude improvement in the performance of a 3-D Laplacian multi-grid solver application when evaluated on a 128 processor cluster.},
doi = {10.1109/IPDPS.2007.370223},
place = {United States},
year = {2007},
month = {jan}
}
-
This report documents a field study of 78 small commercial customers in the Sacramento Municipal Utility District service territory who volunteered for an integrated energy-efficiency/demand-response (EE-DR) program in the summer of 2008. The original objective for the pilot was to provide a better understanding of demand response issues in the small commercial sector. Early findings justified a focus on offering small businesses (1) help with the energy efficiency of their buildings in exchange for occasional load shed, and (2) a portfolio of options to meet the needs of a diverse customer sector. To meet these expressed needs, the research pilot […]
-
A case study of MPI: Portable and efficient libraries
In this paper, we discuss the performance achieved by several implementations of the recently defined Message Passing Interface (MPI) standard. In particular, performance results for different implementations of the broadcast operation are analyzed and compared on the Delta, Paragon, SP1 and CM5.
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies
We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to 1.6TB particle physics, 2.2TB and 16TB climate modeling, and 1.1TB bioimaging data. The data matrices are tall-and-skinny, which enables the algorithms to map conveniently onto Spark's data-parallel model. We perform scaling […]
MPI as a coordination layer for communicating HPF tasks
Data-parallel languages such as High Performance Fortran (HPF) present a simple execution model in which a single thread of control performs high-level operations on distributed arrays. These languages can greatly ease the development of parallel programs. Yet there are large classes of applications for which a mixture of task and data parallelism is most appropriate. Such applications can be structured as collections of data-parallel tasks that communicate by using explicit message passing. Because the Message Passing Interface (MPI) defines standardized, familiar mechanisms for this communication model, the authors propose that HPF tasks communicate by making calls to a coordination library […]
The European experience on developing and communicating worst-case scenarios for accidental releases of hazardous materials
Europe has a large chemical industry and a high population density. The differing legislative contexts of risk analysis in the countries of Europe provide a rich source for comparing different approaches. In this paper, the authors draw on the European experience to consider the advantages and disadvantages of a worst-case scenario approach. The escape of the total contents of a large storage vessel of hazardous materials can cause effects over a large distance. Typically, a large flammable release could cause deaths over a distance of hundreds of meters and damage over a number of kilometers. For toxic releases, the […]