Automatic Thread-Level Parallelization in the Chombo AMR Library

Christen, Matthias; Keen, Noel; Ligocki, Terry; Oliker, Leonid; Shalf, John; Van Straalen, Brian; Williams, Samuel

doi:10.2172/1051285

Technical Report · May 26, 2011

DOI: https://doi.org/10.2172/1051285 · OSTI ID: 1051285

Increasing on-chip parallelism has substantial implications for HPC applications. Currently, hybrid programming models (typically MPI+OpenMP) are employed to map software onto the hardware and leverage the hardware's architectural features. In this paper, we present an approach that automatically introduces thread-level parallelism into Chombo, a parallel adaptive mesh refinement framework for finite-difference-type PDE solvers. In Chombo, core algorithms are specified in ChomboFortran, a macro language extension to F77 that is part of the Chombo framework. Because this domain-specific language is already in use, it serves as the target language for automatically migrating the large number of existing algorithms to a hybrid MPI+OpenMP implementation. It also provides access to an auto-tuning methodology that enables tuning certain aspects of an algorithm to hardware characteristics. Performance measurements obtained with this technique are presented for a few of the most relevant kernels of a specific application benchmark, along with benchmark results for the entire application. The kernel benchmarks show that, with auto-tuning, a speedup of up to a factor of 11 over the serial reference implementation was achieved with 4 threads.

Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Sponsoring Organization: Computational Research Division

DOE Contract Number: DE-AC02-05CH11231

OSTI ID: 1051285

Report Number(s): LBNL-5109E

Country of Publication: United States

Language: English

Similar Records

Performance Analysis of a High-Level Abstractions-Based Hydrocode on Future Computing Systems

Journal Article · Lecture Notes in Computer Science · OSTI ID:1051285

Automatic generation of executable communication specifications from parallel applications

Journal Article · January 19, 2011 · OSTI ID:1051285

Pakin, Scott; Wu, Xing; Mueller, Frank

MPI + OpenACC: Accelerating radiation transport mini-application, minisweep, on heterogeneous systems

Journal Article · March 1, 2019 · Computer Physics Communications · OSTI ID:1051285

Searles, Robert; Chandrasekaran, Sunita; Joubert, Wayne; +1 more

Related Subjects

97 MATHEMATICS AND COMPUTING
auto-tuning
Chombo
ChomboFortran
HPC
OpenMP
hybrid
AMR
