U.S. Department of Energy
Office of Scientific and Technical Information

Addressing Load Imbalance in Bioinformatics and Biomedical Applications: Efficient Scheduling across Multiple GPUs

Conference

Computational bioinformatics and biomedical applications frequently contain heterogeneously sized units of work, or tasks, for instance because biological sequences and molecules vary in size. Variable-sized workloads lead to load imbalances in parallel implementations, which reduce efficiency and performance. Many modern computing resources now have multiple graphics processing units (GPUs) per computer for acceleration, and using them efficiently requires balancing workloads across the GPUs. OpenMP is a portable, directive-based parallel programming API used ubiquitously in bioscience applications to program CPUs; recently, the use of OpenMP directives for GPU acceleration has become possible. Here, motivated by experiences with imbalanced loads in GPU-accelerated bioinformatics applications, we address the load balancing problem using OpenMP task-to-GPU scheduling combined with OpenMP GPU offloading for multiply heterogeneous workloads scheduled across multiple GPUs, that is, loads with both variable input sizes and, simultaneously, variable convergence rates for algorithms with a stochastic component. We aim to develop strategies that are easy to use, have low overheads, and can be incorporated incrementally into existing programs that already use OpenMP for CPU-based threading, so that those programs can make use of multi-GPU computers. We test different combinations of input size variability and convergence rate variability, and characterize the effects of these scenarios on the performance of scheduling strategies across multiple GPUs with OpenMP. We present several dynamic scheduling solutions for different parallel patterns, explore optimizations, and provide publicly available example computational kernels to make these strategies easy to use in programs.
This work will enable application developers to efficiently and easily use multiple GPUs for imbalanced workloads found in bioinformatics and biomedical applications.
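
To illustrate why dynamic scheduling helps with imbalanced tasks, the following is a minimal toy simulation in Python. It is not OpenMP code and not the authors' implementation: the function names, the two-device setup, and the task costs are all hypothetical, chosen only to contrast a fixed round-robin assignment with a "next free device takes the next task" policy.

```python
import heapq


def static_makespan(costs, n_devices):
    """Round-robin: task i is bound to device i % n_devices in advance."""
    busy = [0.0] * n_devices
    for i, cost in enumerate(costs):
        busy[i % n_devices] += cost
    # The run finishes when the most heavily loaded device finishes.
    return max(busy)


def dynamic_makespan(costs, n_devices):
    """Each task, taken in order, goes to whichever device frees up first."""
    # All entries are equal, so this list is already a valid min-heap.
    free_at = [0.0] * n_devices
    for cost in costs:
        earliest = heapq.heappop(free_at)      # device that is idle soonest
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical task costs in arbitrary time units: large and small tasks
# alternate, which is a worst case for round-robin on two devices.
costs = [9, 1, 8, 1, 7, 1, 6, 1]

if __name__ == "__main__":
    print("static :", static_makespan(costs, 2))
    print("dynamic:", dynamic_makespan(costs, 2))
```

With these assumed costs, the round-robin assignment places every large task on the same device and finishes at time 30, while the dynamic policy finishes at time 17, which equals the ideal split of the total work (34) over two devices. The dynamic policy never needs the costs up front, which matters when run time depends on a stochastic convergence rate that is unknown before the task starts.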

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1842598
Country of Publication:
United States
Language:
English
