Targeting Atmospheric Simulation Algorithms for Large Distributed Memory GPU Accelerated Computers
- ORNL
Computing platforms are increasingly moving to accelerated architectures, and here we deal particularly with GPUs. In [15], a method was developed for atmospheric simulation to improve efficiency on large distributed memory machines by reducing communication demand and increasing the time step. Here, we improve upon this method to further target GPU accelerated platforms by reducing GPU memory accesses, removing a synchronization point, and better clustering computations. The modification ran over two times faster in some cases even though more computations were required, demonstrating the merit of improving memory handling on the GPU. Furthermore, we discover that the modification also has a near 100% hit rate in fast on-chip L1 cache and discuss the reasons for this. In concluding, we remark on further potential improvements to GPU efficiency.
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- DOE Contract Number:
- DE-AC05-00OR22725
- OSTI ID:
- 1156699
- Country of Publication:
- United States
- Language:
- English
Similar Records
GPU Data Access on Complex Geometries for D3Q19 Lattice Boltzmann Method
Scaling Out a Combinatorial Algorithm for Discovering Carcinogenic Gene Combinations to Thousands of GPUs