DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Runtime performance of a GAMESS quantum chemistry application offloaded to GPUs

Journal Article · Concurrency and Computation. Practice and Experience
DOI: https://doi.org/10.1002/cpe.8244 · OSTI ID:2427011
[1];  [2];  [3];  [3];  [3];  [3];  [4]
  1. Department of Electrical and Computer Engineering Old Dominion University Norfolk Virginia USA
  2. Association for Computing Machinery Zurich Switzerland
  3. Department of Chemistry and Ames Laboratory Iowa State University Ames Iowa USA
  4. EP Analytics, Inc San Diego California USA

Summary:
Computational chemistry is at the forefront of solving urgent societal problems, such as polymer upcycling and carbon capture. The complexity of modeling these processes at appropriate length and time scales is mainly manifested in the number and types of chemical species involved in the reactions. Accurately capturing the chemical complexity and heterogeneity of the physical and chemical processes may require models of several thousand atoms and large basis sets. The quantum chemistry package General Atomic and Molecular Electronic Structure System (GAMESS) has a wide array of methods that can efficiently and accurately treat complex chemical systems. In this work, we used the GAMESS Effective Fragment Molecular Orbital (EFMO) method for the electronic structure calculation of a challenging mesoporous silica nanoparticle (MSN) model surrounded by about 4700 water molecules to investigate strong scaling and GPU offloading on hybrid CPU‐GPU nodes. Experiments were performed on the Perlmutter platform at the National Energy Research Scientific Computing Center. For the calculation considered here, good strong scaling and load balancing were observed on up to 88 hybrid nodes for different settings of the execution parameters. When GPUs were oversubscribed by offloading work from multiple CPU processes, using the NVIDIA multi‐process service (MPS) consistently reduced time to solution and energy consumed. Additionally, for some configuration parameter settings, oversubscription with MPS improved performance by up to 5.8% over the case without oversubscription.

Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-00OR22725; AC02-05CH11231
Journal Information:
Concurrency and Computation. Practice and Experience, Vol. 36, Issue 23; ISSN 1532-0626
Publisher:
Wiley Blackwell (John Wiley & Sons)
Country of Publication:
United Kingdom
Language:
English
