A Locality-Based Threading Algorithm for the Configuration-Interaction Method

Shan, Hongzhang; Williams, Samuel; Johnson, Calvin; McElvain, Kenneth

doi:10.1109/IPDPSW.2017.15

Journal Article · Mon Jul 03 00:00:00 UTC 2017 · IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum

DOI: https://doi.org/10.1109/IPDPSW.2017.15 · OSTI ID:1393243

Shan, Hongzhang ^[1]; Williams, Samuel ^[1]; Johnson, Calvin ^[2]; McElvain, Kenneth ^[3]

[1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
[2] San Diego State Univ., San Diego, CA (United States). Dept. of Physics
[3] Univ. of California, Berkeley, CA (United States). Dept. of Physics

The Configuration Interaction (CI) method has been widely used to solve the non-relativistic many-body Schrödinger equation. One great challenge to implementing it efficiently on manycore architectures is its immense memory and data movement requirements. To address this issue, within each node, we exploit a hybrid MPI+OpenMP programming model in lieu of the traditional flat MPI programming model. In this paper, we develop optimizations that partition the workloads among OpenMP threads based on data locality, which is essential in ensuring that applications with complex data access patterns scale well on manycore architectures. The new algorithm scales to 256 threads on the 64-core Intel Knights Landing (KNL) manycore processor and 24 threads on dual-socket Ivy Bridge (Xeon) nodes. Compared with the original implementation, performance improves by up to 7× on the Knights Landing processor and 3× on the dual-socket Ivy Bridge node.
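
The abstract does not describe the partitioning scheme in detail, so the following is only a minimal sketch of the general idea of locality-based work partitioning, not the authors' algorithm. It assumes a sparse matrix-vector product in CSR form as a stand-in workload: rows are split into contiguous blocks with roughly equal nonzero counts, and each thread writes only to its own slice of the output vector. The names `partition_rows`, `spmv_block`, and `threaded_spmv` are hypothetical, and Python threads are used purely to illustrate the partitioning (they will not reproduce OpenMP performance).

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def partition_rows(row_nnz, n_threads):
    """Split rows into contiguous blocks with roughly equal nonzero counts.

    Returns a list of half-open (start, end) row ranges, one per thread.
    """
    prefix = list(accumulate(row_nnz))  # prefix[i] = nonzeros in rows 0..i
    total = prefix[-1] if prefix else 0
    bounds = [0]
    for t in range(1, n_threads):
        target = total * t / n_threads
        # Cut after the first row whose cumulative count reaches the target.
        cut = bisect_left(prefix, target) + 1
        bounds.append(max(bounds[-1], min(cut, len(row_nnz))))
    bounds.append(len(row_nnz))
    return list(zip(bounds[:-1], bounds[1:]))


def spmv_block(indptr, indices, data, x, y, start, end):
    """Compute y[start:end] for a CSR matrix; writes stay inside the block."""
    for i in range(start, end):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc


def threaded_spmv(indptr, indices, data, x, n_threads):
    """Run one spmv_block task per row block; blocks own disjoint parts of y."""
    n_rows = len(indptr) - 1
    row_nnz = [indptr[i + 1] - indptr[i] for i in range(n_rows)]
    y = [0.0] * n_rows
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [
            pool.submit(spmv_block, indptr, indices, data, x, y, s, e)
            for s, e in partition_rows(row_nnz, n_threads)
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return y
```

Because every thread owns a disjoint, contiguous range of the output, no locks or atomic updates are needed on `y`, which is the property a locality-based partition is meant to provide.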

Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)

Grant/Contract Number: AC02-05CH11231

OSTI ID: 1393243

Journal Information: IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, Vol. 2017; ISSN 2164-7062

Publisher: IEEE

Country of Publication: United States

Language: English

Similar Records

MILC staggered conjugate gradient performance on Intel KNL

Conference · Thu Nov 03 04:00:00 UTC 2016 · Proceedings of Science (POS) · OSTI ID:1398438

Li, Ruiz; Detar, Carleton; Doerfler, Douglas W.; +4 more

Performance and Energy Usage of Workloads on KNL and Haswell Architectures. In: High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation

Conference · Mon Jan 01 04:00:00 UTC 2018 · Lecture Notes in Computer Science · OSTI ID:1546612

Allen, Tyler; Daley, Christopher S.; Doerfler, Douglas; +2 more

Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC

Conference · Tue Sep 12 04:00:00 UTC 2017 · Concurrency and Computation. Practice and Experience · OSTI ID:1398460

Doerfler, Douglas; Austin, Brian; Cook, Brandon; +3 more

Related Subjects

97 MATHEMATICS AND COMPUTING
Ivy bridge
MPI
Manycore
OpenMP
bigstick
configuration-interaction method
hybrid programming model
knights landing
locality-based threading algorithm
multithreading
