Preparing NERSC users for Cori, a Cray XC40 system with Intel many integrated cores
Abstract
The newest NERSC supercomputer, Cori, is a Cray XC40 system consisting of 2,388 Intel Xeon Haswell nodes and 9,688 Intel Xeon Phi “Knights Landing” (KNL) nodes. Compared to the Xeon-based clusters NERSC users are familiar with, optimal performance on Cori requires consideration of KNL mode settings; process, thread, and memory affinity; fine-grain parallelization; vectorization; and use of the high-bandwidth MCDRAM memory. This paper describes our efforts to prepare NERSC users for KNL through the NERSC Exascale Science Application Program (NESAP), Web documentation, and user training. We discuss how we configured the Cori system for usability and productivity, addressing programming concerns, batch system configurations, and default KNL cluster and memory modes. We also present system usage data, job completion analysis, issues encountered in programming and running jobs, and a few successful user stories on KNL.
- Authors:
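- He, Yun; Cook, Brandon; Deslippe, Jack; Friesen, Brian; Gerber, Richard; Hartman-Baker, Rebecca; Koniges, Alice; Kurth, Thorsten; Leak, Stephen; Yang, Woo-Sun; Zhao, Zhengji; Baron, Eddie; Hauschildt, Peter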
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Univ. of Oklahoma, Norman, OK (United States)
- Hamburger Sternwarte, Hamburg (Germany)
- Publication Date:
- 2017-08-25
- Research Org.:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1459400
- Grant/Contract Number:
- AC02-05CH11231
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Concurrency and Computation. Practice and Experience
- Additional Journal Information:
- Journal Volume: 30; Journal Issue: 1; Related Information: Copyright © 2017 John Wiley & Sons, Ltd.; Journal ID: ISSN 1532-0626
- Publisher:
- Wiley
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; cross compilation; Intel Xeon Phi; KNL; performance optimization; process and thread affinity; user support; training; web documentation; heterogeneous; cluster and memory modes; NESAP
Citation Formats
He, Yun, Cook, Brandon, Deslippe, Jack, Friesen, Brian, Gerber, Richard, Hartman-Baker, Rebecca, Koniges, Alice, Kurth, Thorsten, Leak, Stephen, Yang, Woo-Sun, Zhao, Zhengji, Baron, Eddie, and Hauschildt, Peter. Preparing NERSC users for Cori, a Cray XC40 system with Intel many integrated cores. United States: N. p., 2017. Web. doi:10.1002/cpe.4291.
He, Yun, Cook, Brandon, Deslippe, Jack, Friesen, Brian, Gerber, Richard, Hartman-Baker, Rebecca, Koniges, Alice, Kurth, Thorsten, Leak, Stephen, Yang, Woo-Sun, Zhao, Zhengji, Baron, Eddie, & Hauschildt, Peter. Preparing NERSC users for Cori, a Cray XC40 system with Intel many integrated cores. United States. https://doi.org/10.1002/cpe.4291
He, Yun, Cook, Brandon, Deslippe, Jack, Friesen, Brian, Gerber, Richard, Hartman-Baker, Rebecca, Koniges, Alice, Kurth, Thorsten, Leak, Stephen, Yang, Woo-Sun, Zhao, Zhengji, Baron, Eddie, and Hauschildt, Peter. 2017. "Preparing NERSC users for Cori, a Cray XC40 system with Intel many integrated cores". United States. https://doi.org/10.1002/cpe.4291. https://www.osti.gov/servlets/purl/1459400.
@article{osti_1459400,
title = {Preparing NERSC users for Cori, a Cray XC40 system with Intel many integrated cores},
author = {He, Yun and Cook, Brandon and Deslippe, Jack and Friesen, Brian and Gerber, Richard and Hartman-Baker, Rebecca and Koniges, Alice and Kurth, Thorsten and Leak, Stephen and Yang, Woo-Sun and Zhao, Zhengji and Baron, Eddie and Hauschildt, Peter},
doi = {10.1002/cpe.4291},
journal = {Concurrency and Computation. Practice and Experience},
number = 1,
volume = 30,
place = {United States},
year = {2017},
month = aug
}