
Title: Abstractions and Directives for Adapting Wavefront Algorithms to Future Architectures

Abstract

Architectures are rapidly evolving, and exascale machines are expected to offer billion-way concurrency. We need to rethink algorithms, languages, and programming models, among other components, in order to migrate large-scale applications and explore parallelism on these machines. Although directive-based programming models allow programmers to worry less about programming and more about science, expressing complex parallel patterns in these models can be a daunting task, especially when the goal is to match the performance that the hardware platforms can offer. One such pattern is the wavefront. This paper extensively studies a wavefront-based miniapplication for Denovo, a production code for nuclear reactor modeling. We parallelize the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm in the main kernel of Minisweep (the miniapplication) using CUDA, OpenMP, and OpenACC. Our OpenACC implementation running on NVIDIA's next-generation Volta GPU achieves an 85.06x speedup over serial code, which is larger than CUDA's 83.72x speedup over the same serial implementation. Our experimental platform includes SummitDev, an ORNL system representative of the architecture of the upcoming Summit supercomputer. Our parallelization effort across platforms also motivated us to define an abstract parallelism model that is architecture-independent, with the goal of creating software abstractions that can be used by applications employing the wavefront sweep motif.

Authors:
Searles, Robert [1]; Chandrasekaran, Sunita [1]; Joubert, Wayne [2]; Hernandez, Oscar R. [2]
  1. University of Delaware
  2. ORNL
Publication Date:
July 2018
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1471933
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: Platform for Advanced Scientific Computing (PASC18), Basel, Switzerland, July 2-4, 2018
Country of Publication:
United States
Language:
English

Citation Formats

Searles, Robert, Chandrasekaran, Sunita, Joubert, Wayne, and Hernandez, Oscar R. Abstractions and Directives for Adapting Wavefront Algorithms to Future Architectures. United States: N. p., 2018. Web. doi:10.1145/3218176.3218228.
Searles, Robert, Chandrasekaran, Sunita, Joubert, Wayne, & Hernandez, Oscar R. Abstractions and Directives for Adapting Wavefront Algorithms to Future Architectures. United States. doi:10.1145/3218176.3218228.
Searles, Robert, Chandrasekaran, Sunita, Joubert, Wayne, and Hernandez, Oscar R. 2018. "Abstractions and Directives for Adapting Wavefront Algorithms to Future Architectures". United States. doi:10.1145/3218176.3218228. https://www.osti.gov/servlets/purl/1471933.
@article{osti_1471933,
title = {Abstractions and Directives for Adapting Wavefront Algorithms to Future Architectures},
author = {Searles, Robert and Chandrasekaran, Sunita and Joubert, Wayne and Hernandez, Oscar R.},
abstractNote = {Architectures are rapidly evolving, and exascale machines are expected to offer billion-way concurrency. We need to rethink algorithms, languages, and programming models, among other components, in order to migrate large-scale applications and explore parallelism on these machines. Although directive-based programming models allow programmers to worry less about programming and more about science, expressing complex parallel patterns in these models can be a daunting task, especially when the goal is to match the performance that the hardware platforms can offer. One such pattern is the wavefront. This paper extensively studies a wavefront-based miniapplication for Denovo, a production code for nuclear reactor modeling. We parallelize the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm in the main kernel of Minisweep (the miniapplication) using CUDA, OpenMP, and OpenACC. Our OpenACC implementation running on NVIDIA's next-generation Volta GPU achieves an 85.06x speedup over serial code, which is larger than CUDA's 83.72x speedup over the same serial implementation. Our experimental platform includes SummitDev, an ORNL system representative of the architecture of the upcoming Summit supercomputer. Our parallelization effort across platforms also motivated us to define an abstract parallelism model that is architecture-independent, with the goal of creating software abstractions that can be used by applications employing the wavefront sweep motif.},
doi = {10.1145/3218176.3218228},
place = {United States},
year = {2018},
month = {7}
}
