U.S. Department of Energy
Office of Scientific and Technical Information

Accelerating DCA++ (Dynamical Cluster Approximation) Scientific Application on the Summit supercomputer

Conference

Optimizing scientific applications on today’s accelerator-based high performance computing systems can be challenging, especially when multiple GPUs and CPUs with heterogeneous memories and persistent non-volatile memories are present. An example is Summit, an accelerator-based system at the Oak Ridge Leadership Computing Facility (OLCF) that is rated as the world’s fastest supercomputer to date. New strategies are thus needed to expose the parallelism in legacy applications while remaining amenable to efficient mapping onto the underlying architecture. In this paper we discuss our experiences and strategies in porting a scientific application, DCA++, to Summit. DCA++ is a high-performance research application that solves quantum many-body problems with a cutting-edge quantum cluster algorithm, the dynamical cluster approximation. Our strategies aim to synergize the strengths of the different programming models in the code. These include: (a) streamlining the interactions between the CPU threads and the GPUs, (b) implementing computing kernels on the GPUs and decreasing CPU-GPU memory transfers, (c) allowing asynchronous GPU communications, and (d) increasing compute intensity by combining linear algebraic operations. Full-scale production runs using all 4600 Summit nodes attained a peak performance of 73.5 PFLOPS with a mixed-precision implementation. We observed perfect strong and weak scaling for the quantum Monte Carlo solver in DCA++, while encountering about 2× input/output (I/O) and MPI communication overhead on the time-to-solution for the full machine run. Our hardware-agnostic optimizations are designed to alleviate the communication and I/O challenges observed, while improving the compute intensity and obtaining optimal performance on a complex, hybrid architecture like Summit.
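To illustrate why strategy (d) can help, the sketch below compares the arithmetic intensity (floating-point operations per byte of memory traffic) of many separate matrix-vector products against a single combined matrix-matrix product. It is a simplified model and is not taken from the DCA++ code: the function names, the double-precision value size, and the assumption of no cache reuse between separate calls are all illustrative choices, and the abstract does not say which operations the authors actually combined.

```python
# Illustrative memory-traffic model; NOT taken from the DCA++ source code.
# Assumptions: double precision (8 bytes per value) and no data reuse
# between separate calls. All names here are hypothetical.

BYTES_PER_VALUE = 8


def intensity_separate(n: int, k: int) -> float:
    """FLOPs per byte for k independent (n x n matrix) * (n vector) products."""
    flops = 2 * n * n * k
    # Each call reads the matrix and the input vector and writes the output vector.
    values_moved = k * (n * n + 2 * n)
    return flops / (values_moved * BYTES_PER_VALUE)


def intensity_combined(n: int, k: int) -> float:
    """FLOPs per byte when the k vectors are stacked into one n x k matrix."""
    flops = 2 * n * n * k
    # The n x n matrix is read once; the n x k input is read and the n x k output written.
    values_moved = n * n + 2 * n * k
    return flops / (values_moved * BYTES_PER_VALUE)


if __name__ == "__main__":
    n, k = 1024, 64
    print(f"separate: {intensity_separate(n, k):.3f} FLOPs/byte")
    print(f"combined: {intensity_combined(n, k):.3f} FLOPs/byte")
```

Under these assumptions the work is identical in both cases, but the combined form reads the large matrix only once, so its intensity grows with the number of stacked vectors while the separate form stays memory bound.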

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1607140
Country of Publication:
United States
Language:
English
