A Massively Parallel Adaptive Fast Multipole Method on Heterogeneous Architectures

Lashuk, Ilya; Chandramowlishwaran, Aparna; Langston, Harper; Nguyen, Tuan-Anh; Sampath, Rahul S; Shringarpure, Aashay; Vuduc, Richard; Ying, Lexing; Zorin, Denis; Biros, George

doi:10.1145/2160718.2160740

Journal Article · Jan 1, 2012 · Communications of the ACM

DOI: https://doi.org/10.1145/2160718.2160740 · OSTI ID: 1039644

Lashuk, Ilya [1]; Chandramowlishwaran, Aparna [2]; Langston, Harper [2]; Nguyen, Tuan-Anh [2]; Sampath, Rahul S [3]; Shringarpure, Aashay [2]; Vuduc, Richard [2]; Ying, Lexing [4]; Zorin, Denis [5]; Biros, George [4]

[1] Lawrence Livermore National Laboratory (LLNL)
[2] Georgia Institute of Technology
[3] Oak Ridge National Laboratory (ORNL)
[4] University of Texas, Austin
[5] New York University

We describe a parallel fast multipole method (FMM) for highly nonuniform distributions of particles. We employ both distributed-memory parallelism (via MPI) and shared-memory parallelism (via OpenMP and GPU acceleration) to rapidly evaluate two-body nonoscillatory potentials in three dimensions on heterogeneous high-performance computing architectures. We have performed scalability tests with up to 30 billion particles on 196,608 cores on the AMD/CRAY-based Jaguar system at ORNL. On a GPU-enabled system (NSF's Keeneland at Georgia Tech/ORNL), we observed a 30x speedup over a single-core CPU implementation and a 7x speedup over a multicore CPU implementation. By combining GPUs with MPI, we achieve less than 10 ns/particle and six digits of accuracy for a run with 48 million nonuniformly distributed particles on 192 GPUs.
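
For orientation, the computation that an FMM accelerates can be sketched as a direct pairwise sum. The snippet below is a minimal illustration and not the authors' code: it assumes the 3D Laplace kernel 1/(4πr) as one example of a two-body nonoscillatory potential (the abstract does not name a specific kernel), and the function name `direct_potential` is hypothetical. It costs O(N²) operations, which is the cost an FMM is designed to avoid.

```python
import math

def direct_potential(points, charges):
    """Direct O(N^2) evaluation of phi_i = sum_{j != i} q_j / (4*pi*|x_i - x_j|).

    Assumption: the Laplace kernel stands in for a generic two-body
    nonoscillatory potential; the source abstract names no kernel.
    """
    n = len(points)
    phi = [0.0] * n
    for i in range(n):
        xi, yi, zi = points[i]
        total = 0.0
        for j in range(n):
            if i == j:
                continue  # skip the singular self-interaction
            xj, yj, zj = points[j]
            r = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
            total += charges[j] / (4.0 * math.pi * r)
        phi[i] = total
    return phi

# Three particles at hand-checkable distances.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, 2.0, 4.0]
phi = direct_potential(points, charges)
```

For the first particle, the sum is (2/1 + 4/2)/(4π) = 1/π. A direct sum of this kind is typically used only as a reference for checking the accuracy of an approximate method on small inputs.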

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)

Sponsoring Organization: USDOE Office of Science (SC)

DOE Contract Number: DE-AC05-00OR22725

OSTI ID: 1039644

Journal Information: Communications of the ACM, Vol. 55, Issue 5; ISSN 0001-0782

Country of Publication: United States

Language: English

Similar Records

A Massively Parallel Adaptive Fast-Multipole Method on Heterogeneous Architectures

Conference · Jan 1, 2009 · OSTI ID: 1039644

Lashuk, Ilya; Chandramowlishwaran, Aparna; Langston, Harper; +7 more

Quantum Monte Carlo Endstation for Petascale Computing

Technical Report · Mar 2, 2011 · OSTI ID: 1039644

Ceperley, David

Radiation modeling using the Uintah heterogeneous CPU/GPU runtime system. In: XSEDE '12 Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond, Article No. 4

Conference · Jan 1, 2012 · OSTI ID: 1039644

Humphrey, Alan; Meng, Qingyu; Berzins, Martin; +1 more

Related Subjects

99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
ACCELERATION
ACCURACY
COMPUTERS
COMPUTER CODES
DIMENSIONS
IMPLEMENTATION
MULTIPOLES
ORNL
PERFORMANCE
PARALLEL PROCESSING
