
DOE PAGES

Title: An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported for up to 7,680 cores on Intel Xeon Phi coprocessors.
Authors:
 Mironov, Vladimir [1]; Moskovsky, Alexander [2]; D’Mello, Michael [3]; Alexeev, Yuri [4]
  1. Lomonosov Moscow State Univ., Moscow (Russian Federation)
  2. RSC Technologies, Moscow (Russian Federation)
  3. Intel Corporation, Schaumburg, IL (United States)
  4. Argonne National Lab. (ANL), Argonne, IL (United States)
Publication Date:
October 2017
Grant/Contract Number:
AC02-06CH11357
Type:
Accepted Manuscript
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Volume: 2017; Journal ID: ISSN 1094-3420
Publisher:
SAGE
Research Org:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22), Scientific User Facilities Division; Intel Corporation
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; GAMESS; Intel Xeon Phi; MPI; OpenMP; Parallel Hartree-Fock-Roothaan; integral computation; irregular computation; quantum chemistry
OSTI Identifier:
1401981
Alternate Identifier(s):
OSTI ID: 1402492

Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, and Alexeev, Yuri. 2017. "An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture". United States. doi:10.1177/1094342017732628. https://www.osti.gov/servlets/purl/1401981.
@article{osti_1401981,
title = {An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture},
author = {Mironov, Vladimir and Moskovsky, Alexander and D’Mello, Michael and Alexeev, Yuri},
abstractNote = {The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported for up to 7,680 cores on Intel Xeon Phi coprocessors.},
doi = {10.1177/1094342017732628},
journal = {International Journal of High Performance Computing Applications},
volume = {2017},
place = {United States},
year = {2017},
month = {10}
}

Works referenced in this record:

General atomic and molecular electronic structure system
journal, November 1993
  • Schmidt, Michael W.; Baldridge, Kim K.; Boatz, Jerry A.
  • Journal of Computational Chemistry, Vol. 14, Issue 11, p. 1347-1363
  • DOI: 10.1002/jcc.540141112

NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations
journal, September 2010
  • Valiev, M.; Bylaska, E. J.; Govind, N.
  • Computer Physics Communications, Vol. 181, Issue 9, p. 1477-1489
  • DOI: 10.1016/j.cpc.2010.04.018