Title: Methods for multitasking among real-time embedded compute tasks running on the GPU: Methods for Multitasking Real-time Embedded GPU Computing Tasks

Abstract

Here, we provide an extensive survey of a wide spectrum of scheduling methods for multitasking among graphics processing unit (GPU) computing tasks. We then design several schedulers and explain in detail the selected methods we have developed to implement our scheduling strategies. Next, we compare the performance of these schedulers on various workloads running on the Fermi and Kepler architectures and arrive at the following major conclusions: (1) Small kernels benefit from running kernels concurrently. (2) The combination of small kernels, high-priority kernels with longer runtimes, and lower-priority kernels with shorter runtimes benefits from a CPU scheduler that dynamically changes kernel order on the Fermi architecture. (3) Because of limitations of existing GPU architectures, CPU schedulers currently outperform their GPU counterparts. We also provide results and observations obtained from implementing and evaluating our schedulers on the NVIDIA Jetson TX1 system-on-chip architecture. We observe that although TX1 has the newer Maxwell architecture, the mechanism used for scheduler timings behaves differently on TX1 than on Kepler, leading to incorrect timings. In this paper, we describe the methods that allow us to report correct timings for CPU schedulers running on TX1. Lastly, we propose new research directions involving the investigation of additional scheduling strategies.

Authors:
Muyan-Özçelik, Pınar [1]; Owens, John D. [2]
  1. California State Univ., Sacramento, CA (United States)
  2. Univ. of California, Davis, CA (United States)
Publication Date:
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1528898
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Concurrency and Computation. Practice and Experience
Additional Journal Information:
Journal Volume: 29; Journal Issue: 15; Journal ID: ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; GPU computing; multitasking; real‐time embedded tasks

Citation Formats

Muyan-Özçelik, Pınar, and Owens, John D. Methods for multitasking among real-time embedded compute tasks running on the GPU: Methods for Multitasking Real-time Embedded GPU Computing Tasks. United States: N. p., 2017. Web. doi:10.1002/cpe.4118.
Muyan-Özçelik, Pınar, & Owens, John D. Methods for multitasking among real-time embedded compute tasks running on the GPU: Methods for Multitasking Real-time Embedded GPU Computing Tasks. United States. doi:10.1002/cpe.4118.
Muyan-Özçelik, Pınar, and Owens, John D. 2017. "Methods for multitasking among real-time embedded compute tasks running on the GPU: Methods for Multitasking Real-time Embedded GPU Computing Tasks". United States. doi:10.1002/cpe.4118. https://www.osti.gov/servlets/purl/1528898.
@article{osti_1528898,
title = {Methods for multitasking among real-time embedded compute tasks running on the GPU: Methods for Multitasking Real-time Embedded GPU Computing Tasks},
author = {Muyan-Özçelik, Pınar and Owens, John D.},
abstractNote = {Here, we provide an extensive survey of a wide spectrum of scheduling methods for multitasking among graphics processing unit (GPU) computing tasks. We then design several schedulers and explain in detail the selected methods we have developed to implement our scheduling strategies. Next, we compare the performance of these schedulers on various workloads running on the Fermi and Kepler architectures and arrive at the following major conclusions: (1) Small kernels benefit from running kernels concurrently. (2) The combination of small kernels, high-priority kernels with longer runtimes, and lower-priority kernels with shorter runtimes benefits from a CPU scheduler that dynamically changes kernel order on the Fermi architecture. (3) Because of limitations of existing GPU architectures, CPU schedulers currently outperform their GPU counterparts. We also provide results and observations obtained from implementing and evaluating our schedulers on the NVIDIA Jetson TX1 system-on-chip architecture. We observe that although TX1 has the newer Maxwell architecture, the mechanism used for scheduler timings behaves differently on TX1 than on Kepler, leading to incorrect timings. In this paper, we describe the methods that allow us to report correct timings for CPU schedulers running on TX1. Lastly, we propose new research directions involving the investigation of additional scheduling strategies.},
doi = {10.1002/cpe.4118},
journal = {Concurrency and Computation. Practice and Experience},
number = 15,
volume = 29,
place = {United States},
year = {2017},
month = {6}
}

Citation Metrics:
Cited by: 1 work