OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: SST_GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model

Abstract

Programmable accelerators have become commonplace in modern computing systems. Advances in programming models and the availability of massive amounts of data have created a space for massively parallel acceleration, where the contexts for thousands of concurrent threads are resident on-chip. These threads are grouped and interleaved on a cycle-by-cycle basis among several massively parallel computing cores. The design of future supercomputers relies on an ability to model the performance of these massively parallel cores at scale. To address the need for a scalable, decentralized GPU model that can model large GPUs, chiplet-based GPUs, and multi-node GPUs, this report details the first steps in integrating the open-source, execution-driven GPGPU-Sim into the SST framework. The first stage of this project creates two elements: a kernel scheduler SST element that accepts work from SST CPU models and schedules it to an SM-collection element that performs cycle-by-cycle timing using SST's MemHierarchy to model a flexible memory system.

Authors:
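Khairy, Mahmoud; Zhang, Mengchi; Green, Roland; Hammond, Simon David; Hoekstra, Robert J.; Rogers, Timothy; Hughes, Clayton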
Publication Date:
February 2019
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1497416
Report Number(s):
SAND2019-1967
672807
DOE Contract Number:  
AC04-94AL85000
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English

Citation Formats

Khairy, Mahmoud, Zhang, Mengchi, Green, Roland, Hammond, Simon David, Hoekstra, Robert J., Rogers, Timothy, and Hughes, Clayton. SST_GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model. United States: N. p., 2019. Web. doi:10.2172/1497416.
Khairy, Mahmoud, Zhang, Mengchi, Green, Roland, Hammond, Simon David, Hoekstra, Robert J., Rogers, Timothy, & Hughes, Clayton. SST_GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model. United States. https://doi.org/10.2172/1497416
Khairy, Mahmoud, Zhang, Mengchi, Green, Roland, Hammond, Simon David, Hoekstra, Robert J., Rogers, Timothy, and Hughes, Clayton. 2019. "SST_GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model." United States. https://doi.org/10.2172/1497416. https://www.osti.gov/servlets/purl/1497416.
@techreport{osti_1497416,
title = {{SST\_GPU}: An Execution-Driven {CUDA} Kernel Scheduler and Streaming-Multiprocessor Compute Model},
author = {Khairy, Mahmoud and Zhang, Mengchi and Green, Roland and Hammond, Simon David and Hoekstra, Robert J. and Rogers, Timothy and Hughes, Clayton},
abstractNote = {Programmable accelerators have become commonplace in modern computing systems. Advances in programming models and the availability of massive amounts of data have created a space for massively parallel acceleration, where the contexts for thousands of concurrent threads are resident on-chip. These threads are grouped and interleaved on a cycle-by-cycle basis among several massively parallel computing cores. The design of future supercomputers relies on an ability to model the performance of these massively parallel cores at scale. To address the need for a scalable, decentralized GPU model that can model large GPUs, chiplet-based GPUs, and multi-node GPUs, this report details the first steps in integrating the open-source, execution-driven GPGPU-Sim into the SST framework. The first stage of this project creates two elements: a kernel scheduler SST element that accepts work from SST CPU models and schedules it to an SM-collection element that performs cycle-by-cycle timing using SST's MemHierarchy to model a flexible memory system.},
doi = {10.2172/1497416},
url = {https://www.osti.gov/biblio/1497416},
institution = {Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)},
number = {SAND2019-1967},
place = {United States},
year = {2019},
month = {2}
}