A brief summary on formalizing parallel tensor distributions redistributions and algorithm derivations.
Large-scale datasets in computational chemistry typically require distributed-memory parallel methods to perform a special operation known as tensor contraction. Tensors are multidimensional arrays, and a tensor contraction is akin to matrix multiplication with special types of permutations. Creating an efficient algorithm and optimized im- plementation in this domain is complex, tedious, and error-prone. To address this, we develop a notation to express data distributions so that we can apply use automated methods to find optimized implementations for tensor contractions. We consider the spin-adapted coupled cluster singles and doubles method from computational chemistry and use our methodology to produce an efficient implementation. Experiments per- formed on the IBM Blue Gene/Q and Cray XC30 demonstrate impact both improved performance and reduced memory consumption.
- Publication Date:
- OSTI Identifier:
- Report Number(s):
- DOE Contract Number:
- Resource Type:
- Technical Report
- Research Org:
- Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Org:
- USDOE National Nuclear Security Administration (NNSA)
- Country of Publication:
- United States
Enter terms in the toolbar above to search the full text of this document for pages containing specific keywords.