Auto-tuning Stencil Computations on Multicore and Accelerators

Datta, Kaushik; Williams, Samuel; Volkov, Vasily; Carter, Jonathan; Oliker, Leonid; Shalf, John; Yelick, Katherine

doi:10.1201/b10376-18

Book · December 7, 2010

DOI: https://doi.org/10.1201/b10376-18 · OSTI ID: 1407093

Datta, Kaushik [1]; Williams, Samuel [2]; Volkov, Vasily [1]; Carter, Jonathan [2]; Oliker, Leonid [2]; Shalf, John [2]; Yelick, Katherine [2]

[1] Univ. of California, Berkeley, CA (United States)
[2] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

The recent shift from an environment in which gains in computational performance came from increasing clock frequency and other hardware engineering innovations to one in which gains are realized by deploying ever-increasing numbers of modest-performance cores has profoundly changed the landscape of scientific application programming. This exponential increase in core count represents both an opportunity and a challenge: access to petascale simulation capabilities and beyond will require that this concurrency be efficiently exploited.

Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

DOE Contract Number: AC02-05CH11231

OSTI ID: 1407093

Resource Relation: Journal Volume: 20102756; Related Information: Book Title: Scientific Computing with Multicore and Accelerators

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
43 PARTICLE ACCELERATORS
