U.S. Department of Energy
Office of Scientific and Technical Information

Acceleration of Streamed Tensor Contraction Expressions on GPGPU-based Clusters

Conference ·

Tensor contractions are generalized multidimensional matrix multiplication operations that occur widely in quantum chemistry. Efficient execution of tensor contractions on GPUs requires addressing several challenges, including index permutation and small dimension sizes that reduce thread-block utilization. In this paper, we present our approach to automatically generating CUDA code that executes tensor contractions on GPUs, including management of data movement between CPU and GPU. GPU-enabled code is generated for the most expensive contractions in CCSD(T) and incorporated into NWChem, a popular computational chemistry suite. We demonstrate a speedup of more than 8.4x when using one core per node, and of more than 2.6x when utilizing the entire system with a hybrid CPU+GPU solution that uses 2 GPUs and 5 cores. Finally, we analyze the behavior of the application on future GPU systems.
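To make the phrase "generalized multidimensional matrix multiplication" concrete, the following is a minimal sketch in plain Python, not the CUDA code generated in the paper. It assumes a hypothetical contraction C[a][b][c] = sum over d of A[a][d][c] * B[d][b]; the function names, index order, and sample values are illustrative only. The second function shows why index permutation matters: the operand is first rearranged so that the contracted index is innermost, an ordinary matrix product is applied, and the result is permuted back.

```python
from itertools import product


def contract_direct(A, B):
    """Compute C[a][b][c] = sum_d A[a][d][c] * B[d][b] with nested loops."""
    na, nd, nc = len(A), len(A[0]), len(A[0][0])
    nb = len(B[0])
    C = [[[0] * nc for _ in range(nb)] for _ in range(na)]
    for a, b, c, d in product(range(na), range(nb), range(nc), range(nd)):
        C[a][b][c] += A[a][d][c] * B[d][b]
    return C


def contract_via_matmul(A, B):
    """Same contraction, expressed as permute -> matrix multiply -> permute."""
    na, nd, nc = len(A), len(A[0]), len(A[0][0])
    nb = len(B[0])
    # Permute A[a][d][c] into a matrix whose rows are the fused index
    # (a, c), stored at row a * nc + c, and whose columns are d.
    A_mat = [[A[a][d][c] for d in range(nd)]
             for a in range(na) for c in range(nc)]
    # Ordinary matrix product: M[(a, c)][b] = sum_d A_mat[(a, c)][d] * B[d][b].
    M = [[sum(row[d] * B[d][b] for d in range(nd)) for b in range(nb)]
         for row in A_mat]
    # Permute the result from M[(a, c)][b] back to C[a][b][c].
    return [[[M[a * nc + c][b] for c in range(nc)] for b in range(nb)]
            for a in range(na)]


# Hypothetical sample data: A has shape 2 x 3 x 2, B has shape 3 x 2.
A = [[[1, 2], [3, 4], [5, 6]],
     [[7, 8], [9, 10], [11, 12]]]
B = [[1, 0],
     [0, 1],
     [2, 3]]

C_direct = contract_direct(A, B)
C_matmul = contract_via_matmul(A, B)
```

Both functions produce the same tensor. The permutation steps add data movement but no arithmetic, and with extents as small as those in this example there is little independent work per output element, which illustrates the two challenges the abstract names.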

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
992816
Report Number(s):
PNNL-SA-73012
Country of Publication:
United States
Language:
English

Similar Records

Optimizing Tensor Contraction Expressions for Hybrid CPU-GPU Execution
Journal Article · Thu Feb 28 23:00:00 EST 2013 · Cluster Computing, 16(1):131-155 · OSTI ID:1076684

Optimizing Tensor Contractions in CCSD(T) for Efficient Execution on GPUs
Conference · Fri Jun 15 00:00:00 EDT 2018 · OSTI ID:1572874

A Code Generator for High-Performance Tensor Contractions on GPUs
Conference · Tue Feb 19 23:00:00 EST 2019 · OSTI ID:1617877