skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Partitioning and Dynamic Load Balancing for Petascale Applications.


Abstract not provided.

; ; ;
Publication Date:
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
Report Number(s):
DOE Contract Number:
Resource Type:
Resource Relation:
Conference: Proposed for presentation at the SciDAC07 Conference held June 25-29, 2007 in Boston, MA.
Country of Publication:
United States

Citation Formats

Devine, Karen Dragon, Boman, Erik G., Riesen, Lee Ann, and Catalyurek, Umit. Partitioning and Dynamic Load Balancing for Petascale Applications.. United States: N. p., 2007. Web.
Devine, Karen Dragon, Boman, Erik G., Riesen, Lee Ann, & Catalyurek, Umit. Partitioning and Dynamic Load Balancing for Petascale Applications.. United States.
Devine, Karen Dragon, Boman, Erik G., Riesen, Lee Ann, and Catalyurek, Umit. Fri . "Partitioning and Dynamic Load Balancing for Petascale Applications.". United States. doi:.
title = {Partitioning and Dynamic Load Balancing for Petascale Applications.},
author = {Devine, Karen Dragon and Boman, Erik G. and Riesen, Lee Ann and Catalyurek, Umit},
abstractNote = {Abstract not provided.},
doi = {},
journal = {},
number = ,
volume = ,
place = {United States},
year = {Fri Jun 01 00:00:00 EDT 2007},
month = {Fri Jun 01 00:00:00 EDT 2007}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

Save / Share:
  • Diffusion Monte Carlo is the most accurate widely used Quantum Monte Carlo method for the electronic structure of materials, but it requires frequent load balancing or population redistribution steps to maintain efficiency and avoid accumulation of systematic errors on parallel machines. The load balancing step can be a significant factor affecting performance, and will become more important as the number of processing elements increases. We propose a new dynamic load balancing algorithm, the Alias Method, and evaluate it theoretically and empirically. An important feature of the new algorithm is that the load can be perfectly balanced with each process receivingmore » at most one message. It is also optimal in the maximum size of messages received by any process. We also optimize its implementation to reduce network contention, a process facilitated by the low messaging requirement of the algorithm. Empirical results on the petaflop Cray XT Jaguar supercomputer at ORNL showing up to 30% improvement in performance on 120,000 cores. The load balancing algorithm may be straightforwardly implemented in existing codes. The algorithm may also be employed by any method with many near identical computational tasks that requires load balancing.« less
  • Abstract not provided.
  • In this paper, we introduce the Dynamic Load-balanced Tensor Contractions (DLTC), a domain-specific library for efficient task parallel execution of tensor contraction expressions, a class of computation encountered in quantum chemistry and physics. Our framework decomposes each contraction into smaller unit of tasks, represented by an abstraction referred to as iterators. We exploit an extra level of parallelism by having tasks across independent contractions executed concurrently through a dynamic load balancing run- time. We demonstrate the improved performance, scalability, and flexibility for the computation of tensor contraction expressions on parallel computers using examples from coupled cluster methods.
  • Abstract not provided.
  • One class of scientific and engineering applications involves structured meshes. One example of a code in this class is a flame modelling code developed at the Naval Research Laboratory (NRL). The numerical model used in the NRL flame code is predominantly based on structured finite volume methods. The chemistry process of the reactive flow is modeled by a system of ordinary differential equations which is solved independently at each grid point. Thus, though the model uses a mesh structure, the workload at each grid point can vary considerably. It is this feature that requires the use of both structured andmore » unstructured methods in the same code. We have applied the Multiblock PARTI and CHAOS runtime support libraries to parallelize the NRL flame code with minimal changes to the sequential code. We have also developed parallel algorithms to carry out dynamic load balancing. It has been observed that the overall performance scales reasonably up to 256 Paragon processors and that the total runtime on a 256-node Paragon is about half that of a single processor Cray C90.« less