U.S. Department of Energy
Office of Scientific and Technical Information

Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4

Conference
OSTI ID:962937

We apply auto-tuning to a hybrid MPI/pthreads lattice Boltzmann computation running on the Cray XT4 at the National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4 when running on dual- and quad-core Opteron dual-socket SMPs. We extend these studies to the distributed-memory arena via a hybrid MPI/pthreads implementation. In addition to conventional auto-tuning at the local SMP node, we tune at the message-passing level to determine the optimal aspect ratio as well as the correct balance between MPI tasks and threads per MPI task. Our study presents a detailed performance analysis when moving along an isocurve of constant hardware usage: fixed total memory, total cores, and total nodes. Overall, our work points to approaches for improving intra- and inter-node efficiency on large-scale multicore systems for demanding scientific applications.
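
As a rough illustration of the search space the abstract describes, the sketch below enumerates hybrid configurations that keep total nodes and total cores fixed while trading MPI tasks against threads per task, and lists candidate 3D process grids (one possible reading of "aspect ratio") for a given task count. It is not the authors' tuner: the function names, the 16-node example, and the 4 cores per node are assumptions chosen for illustration, and the record does not say how the aspect ratio is parameterized.

```python
def hybrid_configs(nodes, cores_per_node):
    """Return (total_mpi_tasks, threads_per_task) pairs that use every core once.

    Hypothetical helper: each pair keeps nodes * cores_per_node constant.
    """
    configs = []
    for threads in range(1, cores_per_node + 1):
        if cores_per_node % threads == 0:
            tasks_per_node = cores_per_node // threads
            configs.append((nodes * tasks_per_node, threads))
    return configs


def process_grids(mpi_tasks):
    """Return every 3D process grid (px, py, pz) with px * py * pz == mpi_tasks.

    Hypothetical helper: treats the grid shape as the tunable aspect ratio.
    """
    grids = []
    for px in range(1, mpi_tasks + 1):
        if mpi_tasks % px != 0:
            continue
        rest = mpi_tasks // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                grids.append((px, py, rest // py))
    return grids


if __name__ == "__main__":
    # Assumed example sizes: 16 nodes with 4 cores each (64 cores in total).
    for tasks, threads in hybrid_configs(nodes=16, cores_per_node=4):
        print(tasks, "MPI tasks x", threads, "threads:",
              len(process_grids(tasks)), "candidate grids")
```

An auto-tuner in this spirit would time the application for each candidate and keep the fastest; the sketch only generates the candidates.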

Research Organization:
Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Organization:
Computational Research Division
DOE Contract Number:
AC02-05CH11231
Report Number(s):
LBNL-2088E
Country of Publication:
United States
Language:
English

Similar Records

Optimization of a Lattice Boltzmann Computation on State-of-the-Art Multicore Platforms
Journal Article · Fri Apr 10 00:00:00 EDT 2009 · Journal of Parallel and Distributed Computing · OSTI ID:963653

Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms
Conference · Thu Jan 31 23:00:00 EST 2008 · OSTI ID:964372

Cray XT4: An Early Evaluation for Petascale Scientific Simulation
Conference · Sun Dec 31 23:00:00 EST 2006 · OSTI ID:958817