
Title: Auto-tuning Stencil Computations on Multicore and Accelerators

Book · DOI: https://doi.org/10.1201/b10376-18 · OSTI ID: 1407093
 [1];  [2];  [1];  [2];  [2];  [2];  [2]
  1. Univ. of California, Berkeley, CA (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

The recent shift from an environment where gains in computational performance came from increasing clock frequency and other hardware engineering innovations to one where gains are realized by deploying ever-increasing numbers of modest-performance cores has profoundly changed the landscape of scientific application programming. This exponential increase in core count represents both an opportunity and a challenge: access to petascale simulation capabilities and beyond will require that this concurrency be efficiently exploited.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1407093
Resource Relation:
Journal Volume: 20102756; Related Information: Book Title: Scientific Computing with Multicore and Accelerators
Country of Publication:
United States
Language:
English
