A Locality-Based Threading Algorithm for the Configuration-Interaction Method

Shan, Hongzhang; Williams, Samuel; Johnson, Calvin; McElvain, Kenneth

doi:10.1109/IPDPSW.2017.15

Journal Article · Mon Jul 03 2017 · IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum

DOI: https://doi.org/10.1109/IPDPSW.2017.15 · OSTI ID: 1393243

Shan, Hongzhang [1]; Williams, Samuel [1]; Johnson, Calvin [2]; McElvain, Kenneth [3]

[1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
[2] San Diego State Univ., San Diego, CA (United States). Dept. of Physics
[3] Univ. of California, Berkeley, CA (United States). Dept. of Physics

The Configuration Interaction (CI) method has been widely used to solve the non-relativistic many-body Schrödinger equation. One great challenge to implementing it efficiently on manycore architectures is its immense memory and data movement requirements. To address this issue, within each node, we exploit a hybrid MPI+OpenMP programming model in lieu of the traditional flat MPI programming model. In this paper, we develop optimizations that partition the workloads among OpenMP threads based on data locality, which is essential in ensuring that applications with complex data access patterns scale well on manycore architectures. The new algorithm scales to 256 threads on the 64-core Intel Knights Landing (KNL) manycore processor and 24 threads on dual-socket Ivy Bridge (Xeon) nodes. Compared with the original implementation, performance improves by up to 7× on the Knights Landing processor and 3× on the dual-socket Ivy Bridge node.
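
The abstract does not spell out the partitioning scheme, so the following is only a minimal sketch of one generic way to divide work among threads by data locality, not the authors' algorithm. It assumes the work can be described as a list of per-row costs (the hypothetical `work_per_row`), and it gives each thread one contiguous row range of roughly equal total cost, so that each thread touches a compact, disjoint slice of the output. The function name and parameters are illustrative assumptions.

```python
from bisect import bisect_left
from itertools import accumulate


def partition_by_locality(work_per_row, num_threads):
    """Split rows into contiguous ranges of roughly equal total work.

    Hypothetical helper, not taken from the paper. Returns one half-open
    (start, end) row range per thread; the ranges are disjoint, ordered,
    and together cover every row exactly once.
    """
    # prefix[i] holds the total work of rows 0..i inclusive.
    prefix = list(accumulate(work_per_row))
    total = prefix[-1] if prefix else 0
    num_rows = len(work_per_row)

    bounds = [0]
    for t in range(1, num_threads):
        target = total * t / num_threads
        # Index of the first row whose cumulative work reaches the target;
        # the cut is placed just after that row.
        cut = bisect_left(prefix, target) + 1
        # Keep boundaries monotonic and inside the valid row range.
        cut = min(max(cut, bounds[-1]), num_rows)
        bounds.append(cut)
    bounds.append(num_rows)

    return [(bounds[i], bounds[i + 1]) for i in range(num_threads)]


if __name__ == "__main__":
    # Six rows with uneven cost, divided among three threads.
    print(partition_by_locality([4, 1, 1, 2, 4, 4], 3))
```

Contiguous ranges trade some load balance for locality: a thread never shares an output region with another thread, which avoids write contention at the cost of occasional imbalance when a single row dominates the total work.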

Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

Grant/Contract Number: AC02-05CH11231

OSTI ID: 1393243

Journal Information: IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum, Vol. 2017; Conference: 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lake Buena Vista, FL (United States), 29 May-2 Jun 2017; ISSN 2164-7062

Publisher: IEEE

Country of Publication: United States

Language: English

Similar Records

An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes

Journal Article · Mon Sep 19 2016 · Computer Physics Communications · OSTI ID:1393243

Vincenti, H.; Lobet, M.; Lehe, R.; +2 more

Performance and Energy Usage of Workloads on KNL and Haswell Architectures. In: High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation

Conference · Mon Jan 01 2018 · Lecture Notes in Computer Science · OSTI ID:1393243

Allen, Tyler; Daley, Christopher S.; Doerfler, Douglas; +2 more

MILC staggered conjugate gradient performance on Intel KNL

Conference · Thu Nov 03 2016 · Proceedings of Science (POS) · OSTI ID:1393243

Li, Ruiz; Detar, Carleton; Doerfler, Douglas W.; +4 more

Related Subjects

97 MATHEMATICS AND COMPUTING
Manycore
locality-based threading algorithm
BIGSTICK
configuration-interaction method
Knights Landing
Ivy Bridge
MPI
OpenMP
multithreading
hybrid programming model