OSTI.GOV, U.S. Department of Energy, Office of Scientific and Technical Information

Title: An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

Abstract

The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.
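
The Fock build referred to above follows the standard Hartree-Fock-Roothaan scheme: each SCF cycle solves the Roothaan equations FC = SC\varepsilon and rebuilds the Fock matrix from the density matrix P and the ERIs,

    F_{\mu\nu} = H^{\mathrm{core}}_{\mu\nu} + \sum_{\lambda\sigma} P_{\lambda\sigma} \left[ (\mu\nu|\lambda\sigma) - \tfrac{1}{2} (\mu\lambda|\nu\sigma) \right].

The hybrid MPI/OpenMP strategy described in the abstract (OpenMP threads inside each MPI rank) can be pictured with the minimal C sketch below. It is an illustrative sketch only, not the GAMESS implementation (GAMESS itself is Fortran); the names build_fock, eri_shell_pair, nbf, and n_shells are hypothetical, and the round-robin work distribution is just one common scheme.

    /*
     * Hybrid MPI/OpenMP Fock build, schematic. ERI work is distributed over
     * MPI ranks; each rank threads the irregular integral loop with OpenMP,
     * and every thread accumulates into a private partial Fock matrix that
     * is reduced at the end. All identifiers here are hypothetical.
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical stand-in for the integral kernel: a real code would
     * evaluate the two-electron integrals for this bra shell pair and
     * scatter their Coulomb/exchange contributions into f_local. */
    static void eri_shell_pair(int i, int j, const double *density,
                               double *f_local, int nbf)
    {
        (void)i; (void)j; (void)density; (void)f_local; (void)nbf;
    }

    void build_fock(int nbf, int n_shells, const double *density, double *fock)
    {
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        memset(fock, 0, (size_t)nbf * nbf * sizeof(double));

        #pragma omp parallel
        {
            /* Thread-private partial Fock matrix: avoids locking on updates. */
            double *f_local = calloc((size_t)nbf * nbf, sizeof(double));

            /* Shell pairs go round-robin to MPI ranks; OpenMP then spreads a
             * rank's pairs over its threads with dynamic scheduling to balance
             * the very uneven cost of individual integral batches. */
            #pragma omp for schedule(dynamic)
            for (long q = rank; q < (long)n_shells * n_shells; q += nranks)
                eri_shell_pair((int)(q / n_shells), (int)(q % n_shells),
                               density, f_local, nbf);

            /* Fold the thread-private matrices into the rank-local Fock matrix. */
            #pragma omp critical
            for (long p = 0; p < (long)nbf * nbf; ++p)
                fock[p] += f_local[p];

            free(f_local);
        }

        /* Sum the rank-local partial Fock matrices across all MPI processes. */
        MPI_Allreduce(MPI_IN_PLACE, fock, nbf * nbf,
                      MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }

In a scheme like this, T OpenMP threads share one rank-level copy of the Fock and density matrices rather than T separate MPI processes each holding their own, which is a plausible source of the memory-footprint reduction the abstract reports.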

Authors:
 Mironov, Vladimir [1]; Moskovsky, Alexander [2]; D’Mello, Michael [3]; Alexeev, Yuri [4]
  1. Lomonosov Moscow State Univ., Moscow (Russian Federation)
  2. RSC Technologies, Moscow (Russian Federation)
  3. Intel Corporation, Schaumburg, IL (United States)
  4. Argonne National Lab. (ANL), Argonne, IL (United States)
Publication Date:
October 4, 2017
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22), Scientific User Facilities Division; Intel Corporation
OSTI Identifier:
1401981
Alternate Identifier(s):
OSTI ID: 1402492
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Volume: 2017; Journal ID: ISSN 1094-3420
Publisher:
SAGE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; GAMESS; Intel Xeon Phi; MPI; OpenMP; Parallel Hartree-Fock-Roothaan; integral computation; irregular computation; quantum chemistry

Citation Formats

Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, and Alexeev, Yuri. An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture. United States: N. p., 2017. Web. doi:10.1177/1094342017732628.
Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, & Alexeev, Yuri. An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture. United States. doi:10.1177/1094342017732628.
Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, and Alexeev, Yuri. Wed Oct 04, 2017. "An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture". United States. doi:10.1177/1094342017732628. https://www.osti.gov/servlets/purl/1401981.
@article{osti_1401981,
title = {An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture},
author = {Mironov, Vladimir and Moskovsky, Alexander and D’Mello, Michael and Alexeev, Yuri},
abstractNote = {The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.},
doi = {10.1177/1094342017732628},
journal = {International Journal of High Performance Computing Applications},
volume = {2017},
place = {United States},
year = {2017},
month = oct
}


Works referenced in this record:

General atomic and molecular electronic structure system
journal, November 1993

  • Schmidt, Michael W.; Baldridge, Kim K.; Boatz, Jerry A.
  • Journal of Computational Chemistry, Vol. 14, Issue 11, p. 1347-1363
  • DOI: 10.1002/jcc.540141112

NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations
journal, September 2010

  • Valiev, M.; Bylaska, E. J.; Govind, N.
  • Computer Physics Communications, Vol. 181, Issue 9, p. 1477-1489
  • DOI: 10.1016/j.cpc.2010.04.018