A Communication-Optimal Framework for Contracting Distributed Tensors
Tensor contractions are extremely compute-intensive generalized matrix multiplication operations encountered in many fields of computational science, such as quantum chemistry and nuclear physics. Unlike distributed matrix multiplication, which has been studied extensively, limited work has been done on understanding distributed tensor contractions. In this paper, we characterize distributed tensor contraction algorithms on torus networks. We develop a framework with three fundamental communication operators that generates communication-efficient contraction algorithms for arbitrary tensor contractions. We show that, for a given amount of memory per processor, our framework is communication optimal for all tensor contractions. We demonstrate the performance and scalability of our framework on up to 262,144 cores of a BG/Q supercomputer using five tensor contraction examples.
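As an illustrative aside (not part of the record itself): a tensor contraction generalizes matrix multiplication by summing over one or more shared indices of higher-order operands. A minimal sketch in Python using NumPy's einsum, with hypothetical index labels and shapes chosen only for demonstration:

```python
import numpy as np

# Hypothetical 4-index contraction C[a,b,i,j] = sum_{p,q} A[a,p,b,q] * B[p,i,q,j],
# a pattern typical of coupled-cluster tensor expressions in quantum chemistry.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6, 5, 7))   # indices a, p, b, q
B = rng.standard_normal((6, 3, 7, 2))   # indices p, i, q, j

# einsum sums over the repeated (contracted) indices p and q.
C = np.einsum('apbq,piqj->abij', A, B)  # result has indices a, b, i, j

# Ordinary matrix multiplication is the special case with one contracted index.
M = rng.standard_normal((4, 6))
N = rng.standard_normal((6, 5))
assert np.allclose(np.einsum('ik,kj->ij', M, N), M @ N)
```

The distributed setting studied in the paper partitions such tensors across a torus network of processors, so the choice of contraction algorithm determines how much data must move between processors.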
- Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 1178515
- Report Number(s): PNNL-SA-103670; KJ0402000
- Resource Relation: Conference: International Conference for High Performance Computing, Storage and Analysis (SC14), November 16-21, 2014, New Orleans, Louisiana, 375-386
- Country of Publication: United States
- Language: English