U.S. Department of Energy
Office of Scientific and Technical Information

Exploring the Use of Novel Spatial Accelerators in Scientific Applications

Conference
Driven by the need to find alternative accelerators that can viably replace GPUs in next-generation supercomputing systems, this paper proposes a methodology for agile application/hardware co-design. The application-first methodology enables accelerator designs to be developed while working with real-world workloads, available accelerators, and system software. The iterative design process targets a set of kernels in a workload to produce performance estimates that can prune the design space before later phases of detailed architectural evaluation. To this end, the paper introduces a novel data-parallel device model that uses multi-core CPUs to simulate the latency of performance-sensitive accelerator operations, including data transfers and kernel computation. The use of off-the-shelf simulators, such as the pre-RTL simulator Aladdin or tools for exploring deep neural network accelerator designs (e.g., Timeloop), is demonstrated for evaluating various accelerator designs using applications with realistic inputs. Multiple device configurations that are instantiable in a system are explored to evaluate the performance benefit of deploying novel accelerators. The proposed device is integrated with a programming model and system software so that the impacts of high-level programming languages/compilers and of low-level effects such as task scheduling across multiple accelerators can be explored. We analyze our methodology on a set of applications representative of high-performance computing (HPC) and graph analytics: a computational chemistry kernel realized using tensor contractions, triangle counting, GraphSAGE, and breadth-first search. These applications include kernels such as dense matrix-dense matrix multiplication, sparse matrix-sparse matrix multiplication, and sparse matrix-dense vector multiplication.
Our results indicate potential performance benefits and provide insights for system design when accelerators realizing these kernels are deployed alongside general-purpose accelerators.
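To make the device model concrete, the sketch below illustrates the kind of analytical latency estimate the abstract describes: per-kernel latency modeled as data-transfer time plus kernel-computation time. This is a minimal illustration only; the function name, the bandwidth and throughput parameters, and their default values are assumptions for the example, not figures or an API from the paper.

```python
def estimate_latency_s(bytes_moved, flops,
                       link_bw_gbps=32.0,     # host<->device link bandwidth, GB/s (assumed)
                       peak_gflops=512.0):    # device peak compute throughput, GFLOP/s (assumed)
    """Estimate kernel latency as transfer time plus compute time.

    A first-order model of the performance-sensitive operations the
    data-parallel device model simulates: moving operands to the
    accelerator, then executing the kernel at peak throughput.
    """
    transfer_s = bytes_moved / (link_bw_gbps * 1e9)
    compute_s = flops / (peak_gflops * 1e9)
    return transfer_s + compute_s

# Example: a 1024x1024 FP64 dense matrix-dense matrix multiplication,
# one of the kernels named in the abstract.
n = 1024
bytes_moved = 3 * n * n * 8   # two input matrices plus one output, 8 bytes/element
flops = 2 * n ** 3            # multiply-add count for dense GEMM
latency = estimate_latency_s(bytes_moved, flops)
```

Estimates like this are cheap to compute for every candidate device configuration, which is how an early pruning phase can discard uncompetitive designs before detailed architectural simulation.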
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1868037
Report Number(s):
PNNL-SA-169884
Country of Publication:
United States
Language:
English

Similar Records

Batched Sparse Linear Algebra (Final Report for Subcontract B648960)
Technical Report · December 3, 2023 · OSTI ID: 2228565

Modeling Analog Tile-Based Accelerators Using SST
Technical Report · September 1, 2022 · OSTI ID: 1891950

A design methodology for domain-optimized power-efficient supercomputing
Conference · December 31, 2008 · OSTI ID: 1407081