OSTI.GOV, U.S. Department of Energy, Office of Scientific and Technical Information

Title: An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

Abstract

The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.
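
The Fock build referred to above follows the standard Hartree-Fock-Roothaan scheme: each SCF cycle solves the Roothaan equations FC = SC\varepsilon and rebuilds the Fock matrix from the density matrix P and the ERIs,

    F_{\mu\nu} = H^{\mathrm{core}}_{\mu\nu} + \sum_{\lambda\sigma} P_{\lambda\sigma} \left[ (\mu\nu|\lambda\sigma) - \tfrac{1}{2} (\mu\lambda|\nu\sigma) \right].

The hybrid MPI/OpenMP strategy described in the abstract (OpenMP threads inside each MPI rank) can be pictured with the minimal C sketch below. It is an illustrative sketch only, not the GAMESS implementation (GAMESS itself is Fortran); the names build_fock, eri_shell_pair, nbf, and n_shells are hypothetical, and the round-robin work distribution is just one common scheme.

    /*
     * Hybrid MPI/OpenMP Fock build, schematic. ERI work is distributed over
     * MPI ranks; each rank threads the irregular integral loop with OpenMP,
     * and every thread accumulates into a private partial Fock matrix that
     * is reduced at the end. All identifiers here are hypothetical.
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical stand-in for the integral kernel: a real code would
     * evaluate the two-electron integrals for this bra shell pair and
     * scatter their Coulomb/exchange contributions into f_local. */
    static void eri_shell_pair(int i, int j, const double *density,
                               double *f_local, int nbf)
    {
        (void)i; (void)j; (void)density; (void)f_local; (void)nbf;
    }

    void build_fock(int nbf, int n_shells, const double *density, double *fock)
    {
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        memset(fock, 0, (size_t)nbf * nbf * sizeof(double));

        #pragma omp parallel
        {
            /* Thread-private partial Fock matrix: avoids locking on updates. */
            double *f_local = calloc((size_t)nbf * nbf, sizeof(double));

            /* Shell pairs go round-robin to MPI ranks; OpenMP then spreads a
             * rank's pairs over its threads with dynamic scheduling to balance
             * the very uneven cost of individual integral batches. */
            #pragma omp for schedule(dynamic)
            for (long q = rank; q < (long)n_shells * n_shells; q += nranks)
                eri_shell_pair((int)(q / n_shells), (int)(q % n_shells),
                               density, f_local, nbf);

            /* Fold the thread-private matrices into the rank-local Fock matrix. */
            #pragma omp critical
            for (long p = 0; p < (long)nbf * nbf; ++p)
                fock[p] += f_local[p];

            free(f_local);
        }

        /* Sum the rank-local partial Fock matrices across all MPI processes. */
        MPI_Allreduce(MPI_IN_PLACE, fock, nbf * nbf,
                      MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }

In a scheme like this, T OpenMP threads share one rank-level copy of the Fock and density matrices rather than T separate MPI processes each holding their own, which is a plausible source of the memory-footprint reduction the abstract reports.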

Authors:
 Mironov, Vladimir [1]; Moskovsky, Alexander [2]; D’Mello, Michael [3]; Alexeev, Yuri [4]
  1. Lomonosov Moscow State Univ., Moscow (Russian Federation)
  2. RSC Technologies, Moscow (Russian Federation)
  3. Intel Corporation, Schaumburg, IL (United States)
  4. Argonne National Lab. (ANL), Argonne, IL (United States)
Publication Date:
October 4, 2017
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22), Scientific User Facilities Division; Intel Corporation
OSTI Identifier:
1401981
Alternate Identifier(s):
OSTI ID: 1402492
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Volume: 2017; Journal ID: ISSN 1094-3420
Publisher:
SAGE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; GAMESS; Intel Xeon Phi; MPI; OpenMP; Parallel Hartree-Fock-Roothaan; integral computation; irregular computation; quantum chemistry

Citation Formats

Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, and Alexeev, Yuri. An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture. United States: N. p., 2017. Web. doi:10.1177/1094342017732628.
Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, & Alexeev, Yuri. An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture. United States. doi:10.1177/1094342017732628.
Mironov, Vladimir, Moskovsky, Alexander, D’Mello, Michael, and Alexeev, Yuri. Wed Oct 04, 2017. "An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture". United States. doi:10.1177/1094342017732628. https://www.osti.gov/servlets/purl/1401981.
@article{osti_1401981,
title = {An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture},
author = {Mironov, Vladimir and Moskovsky, Alexander and D’Mello, Michael and Alexeev, Yuri},
abstractNote = {The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.},
doi = {10.1177/1094342017732628},
journal = {International Journal of High Performance Computing Applications},
volume = {2017},
place = {United States},
year = {2017},
month = oct
}


Works referenced in this record:

General atomic and molecular electronic structure system
journal, November 1993

  • Schmidt, Michael W.; Baldridge, Kim K.; Boatz, Jerry A.
  • Journal of Computational Chemistry, Vol. 14, Issue 11, p. 1347-1363
  • DOI: 10.1002/jcc.540141112

NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations
journal, September 2010

  • Valiev, M.; Bylaska, E. J.; Govind, N.
  • Computer Physics Communications, Vol. 181, Issue 9, p. 1477-1489
  • DOI: 10.1016/j.cpc.2010.04.018