Optimizing Performance of Combustion Chemistry Solvers on Intel's Many Integrated Core (MIC) Architectures
- National Renewable Energy Laboratory (NREL), Golden, CO (United States)
This work investigates novel algorithm designs and optimization techniques for restructuring chemistry integrators in zero- and multidimensional combustion solvers so that they can be used effectively on the emerging generation of Intel's Many Integrated Core (MIC)/Xeon Phi processors. These processors offer increased computing performance through a large number of lightweight cores running at lower clock speeds than the traditional processors (e.g., Intel Sandy Bridge/Ivy Bridge) used in current supercomputers. Despite the lower clock speeds, this style of processor can be used productively for chemistry integrators, which form a costly part of computational combustion codes. Performance commensurate with traditional processors is achieved here through a combination of careful memory layout, exposure of multiple levels of fine-grained parallelism, and extensive use of vendor-supported libraries (Cilk Plus and the Math Kernel Library). Important optimization techniques for efficient memory usage and vectorization have been identified and quantified. For large chemical mechanisms, these optimizations yielded a speed-up of ~3x with the Intel 2013 compiler and ~1.5x with the Intel 2017 compiler relative to the unoptimized version on the Intel Xeon Phi. The strategies, especially those concerning memory usage and vectorization, should also benefit general-purpose computational fluid dynamics codes.
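The abstract does not describe the memory layout the authors used, so the following is only a generic sketch of the kind of layout that tends to help vectorization in chemistry kernels. It assumes Arrhenius-form rate constants, k = A * T^b * exp(-Ea / (R * T)), evaluated for a batch of cells. The names `Arrhenius` and `rate_constants` are hypothetical and are not taken from the paper. Results are stored reaction-major, so the inner loop over cells reads and writes contiguous memory with unit stride, which is the access pattern compilers can usually vectorize.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical Arrhenius parameters for one reaction (illustrative, not from the paper):
// k = A * T^b * exp(-Ea / (R * T)), with A > 0.
struct Arrhenius {
    double A;   // pre-exponential factor
    double b;   // temperature exponent
    double Ea;  // activation energy, J/mol
};

constexpr double kGasConstant = 8.314462618;  // J/(mol K)

// Evaluate rate constants for every reaction over a batch of cells.
// Output layout is reaction-major: k[r * ncells + c].
std::vector<double> rate_constants(const std::vector<Arrhenius>& reactions,
                                   const std::vector<double>& temperature) {
    const std::size_t ncells = temperature.size();
    std::vector<double> k(reactions.size() * ncells);

    // Per-cell quantities shared by all reactions are computed once
    // and stored in contiguous arrays.
    std::vector<double> log_t(ncells), inv_rt(ncells);
    for (std::size_t c = 0; c < ncells; ++c) {
        log_t[c] = std::log(temperature[c]);
        inv_rt[c] = 1.0 / (kGasConstant * temperature[c]);
    }

    for (std::size_t r = 0; r < reactions.size(); ++r) {
        const double log_a = std::log(reactions[r].A);
        const double b = reactions[r].b;
        const double ea = reactions[r].Ea;
        double* out = k.data() + r * ncells;
        // Unit-stride loop over cells with no loop-carried dependence.
        for (std::size_t c = 0; c < ncells; ++c) {
            out[c] = std::exp(log_a + b * log_t[c] - ea * inv_rt[c]);
        }
    }
    return k;
}
```

Folding the three factors into a single exponential leaves one transcendental call per cell in the inner loop. Whether that call is actually vectorized depends on the compiler and on the math library in use.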
- Research Organization:
- National Renewable Energy Laboratory (NREL), Golden, CO (United States)
- Sponsoring Organization:
- USDOE Office of Energy Efficiency and Renewable Energy (EERE)
- DOE Contract Number:
- AC36-08GO28308
- OSTI ID:
- 1373668
- Report Number(s):
- NREL/CP-2C00-68445
- Resource Relation:
- Conference: Presented at the 23rd AIAA Computational Fluid Dynamics Conference - AIAA AVIATION Forum, 5-9 June 2017, Denver, Colorado
- Country of Publication:
- United States
- Language:
- English