Methods for multitasking among real-time embedded compute tasks running on the GPU: Methods for Multitasking Real-time Embedded GPU Computing Tasks

Muyan-Özçelik, Pınar; Owens, John D.

doi:10.1002/cpe.4118

Journal Article · June 5, 2017 · Concurrency and Computation. Practice and Experience

DOI: https://doi.org/10.1002/cpe.4118 · OSTI ID: 1528898

Muyan-Özçelik, Pınar [1]; Owens, John D. [2]

[1] California State Univ., Sacramento, CA (United States)
[2] Univ. of California, Davis, CA (United States)

Here, we provide an extensive survey of a wide spectrum of scheduling methods for multitasking among graphics processing unit (GPU) computing tasks. We then design several schedulers and explain in detail the selected methods we have developed to implement our scheduling strategies. Next, we compare the performance of these schedulers on various workloads running on the Fermi and Kepler architectures and arrive at the following major conclusions: (1) Small kernels benefit from running concurrently. (2) The combination of small kernels, high-priority kernels with longer runtimes, and lower-priority kernels with shorter runtimes benefits from a CPU scheduler that dynamically changes kernel order on the Fermi architecture. (3) Because of limitations of existing GPU architectures, CPU schedulers currently outperform their GPU counterparts. We also provide results and observations obtained from implementing and evaluating our schedulers on the NVIDIA Jetson TX1 system-on-chip architecture. We observe that although TX1 has the newer Maxwell architecture, the mechanism used for scheduler timings behaves differently on TX1 than on Kepler, leading to incorrect timings. In this paper, we describe our methods that allow us to report correct timings for CPU schedulers running on TX1. Lastly, we propose new research directions involving the investigation of additional scheduling strategies.

Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC)

Grant/Contract Number: AC02-05CH11231

OSTI ID: 1528898

Journal Information: Concurrency and Computation. Practice and Experience, Vol. 29, Issue 15; ISSN 1532-0626

Publisher: Wiley

Country of Publication: United States

Language: English

Citation Metrics:

Cited by: 1 work

Citation information provided by
Web of Science

Related Subjects

97 MATHEMATICS AND COMPUTING
GPU computing
multitasking
real‐time embedded tasks
