
Title: A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations

Journal Article · International Journal of Parallel Programming
 [1];  [2];  [3];  [2];  [2];  [4]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
  2. Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States). Dept. of Computer Science
  3. Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States). Dept. of Computer Science and Mathematics, Aerospace and Ocean Engineering
  4. Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States). Dept. of Electrical and Computer Engineering

Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model are consistent with the theoretical analysis. It is especially significant that the local rebalancing algorithms provably decrease the global load imbalance, given the scalability concerns with global rebalancing algorithms. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%.
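
The imbalance described in the abstract can be illustrated with a small sketch. It is not an implementation of any of the four strategies named above, whose details are not given in this record. It only compares an equal-count static assignment against a simple central work queue, which is swapped in here as a stand-in for dynamic balancing. The function names, the exponential distribution of task times, and the task and processor counts are all assumptions made for illustration.

```python
import heapq
import random


def static_schedule(task_times, num_procs):
    """Equal-count assignment: task i goes to processor i mod num_procs."""
    loads = [0.0] * num_procs
    for i, t in enumerate(task_times):
        loads[i % num_procs] += t
    return loads


def dynamic_schedule(task_times, num_procs):
    """Central work queue (illustrative stand-in, not one of the paper's
    strategies): the next task goes to the processor that frees up first."""
    heap = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    for t in task_times:
        finish, p = heapq.heappop(heap)
        heapq.heappush(heap, (finish + t, p))
    loads = [0.0] * num_procs
    for finish, p in heap:
        loads[p] = finish
    return loads


def makespan_and_idle(loads):
    """Return (overall run time, total idle time summed over processors)."""
    makespan = max(loads)
    idle = sum(makespan - load for load in loads)
    return makespan, idle


def demo(seed=0, num_tasks=400, num_procs=8):
    """Compare both schedules on assumed exponentially distributed task times."""
    rng = random.Random(seed)
    tasks = [rng.expovariate(1.0) for _ in range(num_tasks)]
    return {
        "static": makespan_and_idle(static_schedule(tasks, num_procs)),
        "dynamic": makespan_and_idle(dynamic_schedule(tasks, num_procs)),
    }


if __name__ == "__main__":
    for name, (makespan, idle) in demo().items():
        print(f"{name}: makespan={makespan:.2f}, idle={idle:.2f}")
```

Both schedules execute the same total work, so any difference in makespan comes entirely from idle time, which is the quantity the abstract reports as reduced. A central queue is used only because it is short to write. The abstract's point about scalability is the reason a real system might prefer local schemes over a global one like this.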

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization: USDOE
DOE Contract Number: AC05-00OR22725
OSTI ID: 1336578
Journal Information: International Journal of Parallel Programming, Vol. 43, Issue 4; ISSN 0885-7458
Publisher: Springer
Country of Publication: United States
Language: English