U.S. Department of Energy
Office of Scientific and Technical Information

Exploring the Use of Novel Spatial Accelerators in Scientific Applications

Conference
Driven by the need to find alternative accelerators that can viably replace GPUs in next-generation supercomputing systems, this paper proposes a methodology for agile application/hardware co-design. The application-first methodology enables accelerator designs to be developed while working with real-world workloads, available accelerators, and system software. The iterative design process targets a set of kernels in a workload to produce performance estimates that can prune the design space before later phases of detailed architectural evaluation. To this end, the paper introduces a novel data-parallel device model that uses multi-core CPUs to simulate the latency of performance-sensitive accelerator operations, including data transfers and kernel computation. The use of off-the-shelf simulators, such as the pre-RTL simulator Aladdin or tools for exploring deep neural network accelerator designs (e.g., Timeloop), is demonstrated for evaluating various accelerator designs using applications with realistic inputs. Multiple device configurations that are instantiable in a system are explored to evaluate the performance benefit of deploying novel accelerators. The proposed device is integrated with a programming model and system software so that the impacts of high-level programming languages/compilers and of low-level effects such as task scheduling across multiple accelerators can be explored. We analyze our methodology on a set of applications representative of high-performance computing (HPC) and graph analytics: a computational chemistry kernel realized using tensor contractions, triangle counting, GraphSAGE, and breadth-first search. These applications include kernels such as dense matrix-dense matrix multiplication, sparse matrix-sparse matrix multiplication, and sparse matrix-dense vector multiplication.
Our results indicate potential performance benefits and provide insights for system design when accelerators realizing these kernels are deployed alongside general-purpose accelerators.
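To make the device model concrete, the sketch below illustrates the kind of analytical latency estimate the abstract describes: per-kernel latency modeled as data-transfer time plus kernel-computation time. This is a minimal illustration only; the function name, the bandwidth and throughput parameters, and their default values are assumptions for the example, not figures or an API from the paper.

```python
def estimate_latency_s(bytes_moved, flops,
                       link_bw_gbps=32.0,     # host<->device link bandwidth, GB/s (assumed)
                       peak_gflops=512.0):    # device peak compute throughput, GFLOP/s (assumed)
    """Estimate kernel latency as transfer time plus compute time.

    A first-order model of the performance-sensitive operations the
    data-parallel device model simulates: moving operands to the
    accelerator, then executing the kernel at peak throughput.
    """
    transfer_s = bytes_moved / (link_bw_gbps * 1e9)
    compute_s = flops / (peak_gflops * 1e9)
    return transfer_s + compute_s

# Example: a 1024x1024 FP64 dense matrix-dense matrix multiplication,
# one of the kernels named in the abstract.
n = 1024
bytes_moved = 3 * n * n * 8   # two input matrices plus one output, 8 bytes/element
flops = 2 * n ** 3            # multiply-add count for dense GEMM
latency = estimate_latency_s(bytes_moved, flops)
```

Estimates like this are cheap to compute for every candidate device configuration, which is how an early pruning phase can discard uncompetitive designs before detailed architectural simulation.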
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1868037
Report Number(s):
PNNL-SA-169884
Country of Publication:
United States
Language:
English

Similar Records

Batched Sparse Linear Algebra (Final Report for Subcontract B648960)
Technical Report · December 3, 2023 · OSTI ID: 2228565

Modeling Analog Tile-Based Accelerators Using SST
Technical Report · September 1, 2022 · OSTI ID: 1891950

A design methodology for domain-optimized power-efficient supercomputing
Conference · December 31, 2008 · OSTI ID: 1407081