Dynamic Load Balancing on Single- and Multi-GPU Systems
The computational power provided by many-core graphics processing units (GPUs) has been exploited in many applications. However, the programming techniques supported and employed on these GPUs are not sufficient to address problems that exhibit irregular and unbalanced workloads. The problem is exacerbated when trying to exploit multiple GPUs effectively, which are commonly available in many modern systems. In this paper, we propose a task-based dynamic load-balancing solution for single- and multi-GPU systems. The solution allows load balancing at a finer granularity than what is supported in existing APIs such as NVIDIA’s CUDA. We evaluate our approach using both micro-benchmarks and a molecular dynamics application that exhibits significant load imbalance. Experimental results with a single-GPU configuration show that our fine-grained task solution can utilize the hardware more efficiently than the CUDA scheduler for unbalanced workloads. On multi-GPU systems, our solution achieves near-linear speedup, load balance, and significant performance improvement over techniques based on standard CUDA APIs.
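The difference between a fixed up-front work split and task-based dynamic load balancing can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the authors' GPU implementation: the function names `static_makespan` and `dynamic_makespan`, the task costs, and the worker count are all hypothetical, and each "worker" stands in for a processing unit that executes one task at a time.

```python
import heapq


def static_makespan(task_costs, num_workers):
    """Finish time when tasks are split into contiguous, equal-count
    blocks that are fixed before execution starts (hypothetical model)."""
    if not task_costs:
        return 0
    chunk = -(-len(task_costs) // num_workers)  # ceiling division
    loads = [sum(task_costs[i:i + chunk])
             for i in range(0, len(task_costs), chunk)]
    return max(loads)


def dynamic_makespan(task_costs, num_workers):
    """Finish time when every idle worker pulls the next task from a
    shared queue (hypothetical model of task-based load balancing)."""
    # Min-heap of (time at which the worker becomes free, worker id).
    free_at = [(0, w) for w in range(num_workers)]
    heapq.heapify(free_at)
    finish = 0
    for cost in task_costs:
        t, w = heapq.heappop(free_at)   # earliest idle worker
        t += cost                       # it runs the next queued task
        finish = max(finish, t)
        heapq.heappush(free_at, (t, w))
    return finish


# Illustrative, made-up workload: four expensive tasks, then eight cheap ones.
costs = [8, 8, 8, 8, 1, 1, 1, 1, 1, 1, 1, 1]
print(static_makespan(costs, 4))   # 24
print(dynamic_makespan(costs, 4))  # 10
```

In this made-up workload the static split places three of the expensive tasks on one worker, so that worker determines the finish time, while the shared queue spreads the expensive tasks across all four workers. The numbers are illustrative only and say nothing about the measured results reported in the paper.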
- Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 986258
- Report Number(s): PNNL-SA-70333; TRN: US201017%%36
- Resource Relation: Conference: Proceedings of the 24th IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), 1-12
- Country of Publication: United States
- Language: English