DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends

Abstract

Coupled-cluster methods provide highly accurate models of molecular structure through explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix–matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, and IBM Blue Gene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from being compute-bound DGEMM's to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load-imbalance, tasking and bulk synchronous models. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.
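
The abstract notes that tensor contractions are based on matrix–matrix multiplication. As a rough illustration only, the following Python sketch shows that idea on a small dense example: the contracted indices are fused into one index so that the contraction becomes a single matrix product. This is not Libtensor's or Cyclops's API; the function names, tensor shapes, and test values are all hypothetical, and the symmetry and block-sparsity handling described in the abstract is omitted.

```python
from itertools import product


def matmul(X, Y):
    """Plain dense matrix product, standing in for a DGEMM call."""
    inner = len(Y)
    cols = len(Y[0])
    return [[sum(row[p] * Y[p][c] for p in range(inner)) for c in range(cols)]
            for row in X]


def contract_direct(A, B, ni, nk, nl, na):
    """C[i][a] = sum over k, l of A[i][k][l] * B[k][l][a], by explicit loops."""
    C = [[0.0] * na for _ in range(ni)]
    for i, k, l, a in product(range(ni), range(nk), range(nl), range(na)):
        C[i][a] += A[i][k][l] * B[k][l][a]
    return C


def contract_via_matmul(A, B, ni, nk, nl, na):
    """Same contraction: fuse (k, l) into one index, then multiply matrices."""
    # A becomes an ni x (nk*nl) matrix; B becomes an (nk*nl) x na matrix.
    A_mat = [[A[i][k][l] for k in range(nk) for l in range(nl)]
             for i in range(ni)]
    B_mat = [[B[k][l][a] for a in range(na)]
             for k in range(nk) for l in range(nl)]
    return matmul(A_mat, B_mat)


# Hypothetical small tensors with integer-valued entries, so both paths
# produce exactly equal floating-point results.
ni, nk, nl, na = 2, 3, 2, 2
A = [[[float(i + 2 * k - l) for l in range(nl)] for k in range(nk)]
     for i in range(ni)]
B = [[[float(k * l + a) for a in range(na)] for l in range(nl)]
     for k in range(nk)]

C_direct = contract_direct(A, B, ni, nk, nl, na)
C_matmul = contract_via_matmul(A, B, ni, nk, nl, na)
```

In a symmetry-aware library the same reduction would presumably be applied block by block, skipping blocks known to be zero or redundant, which is one source of the irregularity the abstract mentions.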

Authors:
 Ibrahim, Khaled Z. [1];  Epifanovsky, Evgeny [2];  Williams, Samuel [1];  Krylov, Anna I. [3]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Q-Chem, Inc., Pleasanton, CA (United States)
  3. Univ. of Southern California, Los Angeles, CA (United States)
Publication Date:
2017-03-08
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
OSTI Identifier:
1379911
Alternate Identifier(s):
OSTI ID: 1396630
Grant/Contract Number:  
AC02-05CH11231; AC05-00OR22725; AC02-06CH11357
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Parallel and Distributed Computing
Additional Journal Information:
Journal Volume: 106; Journal Issue: C; Journal ID: ISSN 0743-7315
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Tensor Contraction Engines; Quantum Chemistry; Libtensor; Cyclops; High Performance Computing; Distributed Memory Programming Models; Energy Efficiency

Citation Formats

Ibrahim, Khaled Z., Epifanovsky, Evgeny, Williams, Samuel, and Krylov, Anna I. Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends. United States: N. p., 2017. Web. doi:10.1016/j.jpdc.2017.02.010.
Ibrahim, Khaled Z., Epifanovsky, Evgeny, Williams, Samuel, & Krylov, Anna I. Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends. United States. https://doi.org/10.1016/j.jpdc.2017.02.010
Ibrahim, Khaled Z., Epifanovsky, Evgeny, Williams, Samuel, and Krylov, Anna I. 2017. "Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends". United States. https://doi.org/10.1016/j.jpdc.2017.02.010. https://www.osti.gov/servlets/purl/1379911.
@article{osti_1379911,
title = {Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends},
author = {Ibrahim, Khaled Z. and Epifanovsky, Evgeny and Williams, Samuel and Krylov, Anna I.},
abstractNote = {Coupled-cluster methods provide highly accurate models of molecular structure through explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix–matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, and IBM Blue Gene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from being compute-bound DGEMM's to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load-imbalance, tasking and bulk synchronous models. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.},
doi = {10.1016/j.jpdc.2017.02.010},
journal = {Journal of Parallel and Distributed Computing},
number = {C},
volume = 106,
place = {United States},
year = {2017},
month = mar
}

Citation Metrics:
Cited by: 6 works
Citation information provided by
Web of Science