Fast Optimal Load Balancing Algorithms for 1D Partitioning
One-dimensional decomposition of nonuniform workload arrays for optimal load balancing is investigated. The problem has been studied in the literature as the "chains-on-chains partitioning" problem. Despite extensive research efforts, heuristics are still used in the parallel computing community with the "hope" of good decompositions and the "myth" that exact algorithms are hard to implement and not runtime efficient. The main objective of this paper is to show that using exact algorithms instead of heuristics yields significant load balance improvements with a negligible increase in preprocessing time. We provide detailed pseudocode for our algorithms so that our results can be easily reproduced. We start with a review of the literature on the chains-on-chains partitioning problem. We propose improvements to these algorithms as well as efficient implementation tips. We also introduce novel algorithms that are efficient both asymptotically and in practice. We experimented with data sets from two different applications: sparse matrix computations and direct volume rendering. Experiments showed that, for 64-way decompositions, the proposed algorithms are on average 100 times faster than a single sparse matrix-vector multiplication. Experiments also verify that load balance can be significantly improved by using exact algorithms instead of heuristics. These two findings show that exact algorithms, with the efficient implementations discussed in this paper, can effectively replace heuristics.
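To make the problem statement concrete, the following is a minimal sketch of one textbook way to solve chains-on-chains partitioning exactly: a greedy feasibility probe over prefix sums, wrapped in a bisection on the bottleneck value. It is not taken from the paper and should not be read as the authors' algorithms. The function names `probe` and `optimal_bottleneck` are hypothetical, and the sketch assumes non-negative integer task weights.

```python
from bisect import bisect_right
from itertools import accumulate


def probe(prefix, parts, bottleneck):
    """Check whether the chain fits into at most `parts` consecutive
    pieces, each with total load <= bottleneck (greedy probe)."""
    n = len(prefix) - 1
    start = 0
    for _ in range(parts):
        # Furthest cut point whose piece load stays within the bottleneck.
        end = bisect_right(prefix, prefix[start] + bottleneck) - 1
        if end >= n:
            return True
        start = end
    return False


def optimal_bottleneck(weights, parts):
    """Smallest achievable maximum piece load (assumes non-negative
    integer weights and parts >= 1)."""
    prefix = [0, *accumulate(weights)]
    total = prefix[-1]
    # Lower bound: the heaviest task and the ideal average load (rounded up).
    lo = max(max(weights, default=0), -(-total // parts))
    hi = total  # one piece holding everything is always feasible
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(prefix, parts, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


print(optimal_bottleneck([2, 3, 7, 1, 4, 4], 3))  # prints 8
```

In this example the ideal average load is 7, but no 3-way split into consecutive pieces achieves it; the best split is [2, 3], [7, 1], [4, 4] with a bottleneck of 8. Each probe costs one binary search per piece, so the whole search is cheap relative to the computation being balanced.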
- Research Organization:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE; Scientific and Technical Research Council of Turkey Grant EEEAG-199E001 (US)
- DOE Contract Number:
- AC03-76SF00098
- OSTI ID:
- 835143
- Report Number(s):
- LBNL-51862; R&D Project: 365968; TRN: US200435%%85
- Journal Information:
- Journal of Parallel and Distributed Computing, Vol. 64, Issue 8; Other Information: Submitted to Journal of Parallel and Distributed Computing, Volume 64, No.8; Journal Publication Date: 08/2004; PBD: 9 Dec 2002
- Country of Publication:
- United States
- Language:
- English
Similar Records
A Novel Coarsening Method for Scalable and Efficient Mesh Generation
Tensor Contraction and Operation Minimization for Extreme Scale Computational Chemistry
Related Subjects
ALGORITHMS
IMPLEMENTATION
VECTORS
LOAD BALANCING; ONE-DIMENSIONAL PARTITIONING; CHAINS-ON-CHAINS PARTITIONING; DYNAMIC PROGRAMMING; ITERATIVE REFINEMENT; PARAMETRIC SEARCH; PARALLEL SPARSE MATRIX VECTOR MULTIPLICATION; IMAGE-SPACE PARALLEL VOLUME RENDERING