OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Optimization of Forward Wave Modeling on Contemporary HPC Architectures

Abstract

Reverse Time Migration (RTM) is one of the main approaches in the seismic processing industry for imaging the subsurface structure of the Earth. While RTM provides qualitative advantages over its predecessors, it has a high computational cost warranting implementation on HPC architectures. We focus on three progressively more complex kernels extracted from RTM: for isotropic (ISO), vertical transverse isotropic (VTI) and tilted transverse isotropic (TTI) media. In this work, we examine performance optimization of forward wave modeling, which describes the computational kernels used in RTM, on emerging multi- and manycore processors and introduce a novel common subexpression elimination optimization for TTI kernels. We compare attained performance and energy efficiency in both the single-node and distributed memory environments in order to satisfy industry’s demands for fidelity, performance, and energy efficiency. Moreover, we discuss the interplay between architecture (chip and system) and optimizations (both on-node computation) highlighting the importance of NUMA-aware approaches to MPI communication. Ultimately, our results show we can improve CPU energy efficiency by more than 10× on Magny Cours nodes while acceleration via multiple GPUs can surpass the energy-efficient Intel Sandy Bridge by as much as 3.6×.

Authors:
Krueger, Jens [1]; Micikevicius, Paulius [2]; Williams, Samuel [3]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. NVIDIA, Santa Clara, CA (United States)
  3. Fraunhofer ITWM, Kaiserslautern (Germany)
Publication Date:
July 2012
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1223018
Report Number(s):
LBNL-5751E
DOE Contract Number:  
AC02-05CH11231
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Reverse Time Migration; forward wave modeling; multicore; GPU

Citation Formats

Krueger, Jens, Micikevicius, Paulius, and Williams, Samuel. Optimization of Forward Wave Modeling on Contemporary HPC Architectures. United States: N. p., 2012. Web. doi:10.2172/1223018.
Krueger, Jens, Micikevicius, Paulius, & Williams, Samuel. (2012). Optimization of Forward Wave Modeling on Contemporary HPC Architectures. United States. doi:10.2172/1223018.
Krueger, Jens, Micikevicius, Paulius, and Williams, Samuel. 2012. "Optimization of Forward Wave Modeling on Contemporary HPC Architectures". United States. doi:10.2172/1223018. https://www.osti.gov/servlets/purl/1223018.
@article{osti_1223018,
title = {Optimization of Forward Wave Modeling on Contemporary HPC Architectures},
author = {Krueger, Jens and Micikevicius, Paulius and Williams, Samuel},
abstractNote = {Reverse Time Migration (RTM) is one of the main approaches in the seismic processing industry for imaging the subsurface structure of the Earth. While RTM provides qualitative advantages over its predecessors, it has a high computational cost warranting implementation on HPC architectures. We focus on three progressively more complex kernels extracted from RTM: for isotropic (ISO), vertical transverse isotropic (VTI) and tilted transverse isotropic (TTI) media. In this work, we examine performance optimization of forward wave modeling, which describes the computational kernels used in RTM, on emerging multi- and manycore processors and introduce a novel common subexpression elimination optimization for TTI kernels. We compare attained performance and energy efficiency in both the single-node and distributed memory environments in order to satisfy industry’s demands for fidelity, performance, and energy efficiency. Moreover, we discuss the interplay between architecture (chip and system) and optimizations (both on-node computation) highlighting the importance of NUMA-aware approaches to MPI communication. Ultimately, our results show we can improve CPU energy efficiency by more than 10× on Magny Cours nodes while acceleration via multiple GPUs can surpass the energy-efficient Intel Sandy Bridge by as much as 3.6×.},
doi = {10.2172/1223018},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2012},
month = {7}
}