OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Optimizing fusion PIC code performance at scale on Cori Phase 2

Abstract

In this paper we present the results of optimizing the performance of the gyrokinetic full-f fusion PIC code XGC1 on the Cori Phase Two Knights Landing system. The code has undergone substantial development to enable the use of vector instructions in its most expensive kernels within the NERSC Exascale Science Applications Program. We study the single-node performance of the code on an absolute scale using the roofline methodology to guide optimization efforts. We have obtained 2x speedups in single-node performance by enabling vectorization and performing memory layout optimizations. On multiple nodes, the code is shown to scale well up to 4000 nodes, nearly half the size of the machine. We discuss some communication bottlenecks that were identified and resolved during the work.

Authors:
Koskela, T. S.; Deslippe, J.
Publication Date:
2017-07-23
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1398507
DOE Contract Number:
AC02-05CH11231
Resource Type:
Conference
Resource Relation:
Conference: To be determined
Country of Publication:
United States
Language:
English
Subject:
70 PLASMA PHYSICS AND FUSION TECHNOLOGY; 97 MATHEMATICS AND COMPUTING

Citation Formats

Koskela, T. S., and Deslippe, J. Optimizing fusion PIC code performance at scale on Cori Phase 2. United States: N. p., 2017. Web. doi:10.1007/978-3-319-67630-2_32.
Koskela, T. S., & Deslippe, J. (2017). Optimizing fusion PIC code performance at scale on Cori Phase 2. United States. doi:10.1007/978-3-319-67630-2_32.
Koskela, T. S., and Deslippe, J. 2017. "Optimizing fusion PIC code performance at scale on Cori Phase 2". United States. doi:10.1007/978-3-319-67630-2_32. https://www.osti.gov/servlets/purl/1398507.
@article{osti_1398507,
title = {Optimizing fusion PIC code performance at scale on Cori Phase 2},
author = {Koskela, T. S. and Deslippe, J.},
abstractNote = {In this paper we present the results of optimizing the performance of the gyrokinetic full-f fusion PIC code XGC1 on the Cori Phase Two Knights Landing system. The code has undergone substantial development to enable the use of vector instructions in its most expensive kernels within the NERSC Exascale Science Applications Program. We study the single-node performance of the code on an absolute scale using the roofline methodology to guide optimization efforts. We have obtained 2x speedups in single-node performance by enabling vectorization and performing memory layout optimizations. On multiple nodes, the code is shown to scale well up to 4000 nodes, nearly half the size of the machine. We discuss some communication bottlenecks that were identified and resolved during the work.},
doi = {10.1007/978-3-319-67630-2_32},
place = {United States},
year = {2017},
month = {7}
}

