OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Targeting Atmospheric Simulation Algorithms for Large Distributed Memory GPU Accelerated Computers

Abstract

Computing platforms are increasingly moving to accelerated architectures, and here we deal particularly with GPUs. In [15], a method was developed for atmospheric simulation to improve efficiency on large distributed memory machines by reducing communication demand and increasing the time step. Here, we improve upon this method to further target GPU accelerated platforms by reducing GPU memory accesses, removing a synchronization point, and better clustering computations. The modification ran over two times faster in some cases even though more computations were required, demonstrating the merit of improving memory handling on the GPU. Furthermore, we discover that the modification also has a near 100% hit rate in fast on-chip L1 cache and discuss the reasons for this. In concluding, we remark on further potential improvements to GPU efficiency.

Authors:
Norman, Matthew R. [1]
  1. ORNL
Publication Date:
2013
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1156699
DOE Contract Number:  
DE-AC05-00OR22725
Resource Type:
Book
Country of Publication:
United States
Language:
English
Subject:
atmospheric simulation; GPU; CUDA; profiling

Citation Formats

Norman, Matthew R. Targeting Atmospheric Simulation Algorithms for Large Distributed Memory GPU Accelerated Computers. United States: N. p., 2013. Web.
Norman, Matthew R. Targeting Atmospheric Simulation Algorithms for Large Distributed Memory GPU Accelerated Computers. United States.
Norman, Matthew R. 2013. "Targeting Atmospheric Simulation Algorithms for Large Distributed Memory GPU Accelerated Computers". United States.
@article{osti_1156699,
title = {Targeting Atmospheric Simulation Algorithms for Large Distributed Memory GPU Accelerated Computers},
author = {Norman, Matthew R},
abstractNote = {Computing platforms are increasingly moving to accelerated architectures, and here we deal particularly with GPUs. In [15], a method was developed for atmospheric simulation to improve efficiency on large distributed memory machines by reducing communication demand and increasing the time step. Here, we improve upon this method to further target GPU accelerated platforms by reducing GPU memory accesses, removing a synchronization point, and better clustering computations. The modification ran over two times faster in some cases even though more computations were required, demonstrating the merit of improving memory handling on the GPU. Furthermore, we discover that the modification also has a near 100% hit rate in fast on-chip L1 cache and discuss the reasons for this. In concluding, we remark on further potential improvements to GPU efficiency.},
url = {https://www.osti.gov/biblio/1156699},
place = {United States},
year = {2013},
month = jan
}

Book:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this book.