Parallel performance optimizations on unstructured mesh-based simulations
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Univ. of Maryland, College Park (United States)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Univ. of Oregon, Eugene, OR (United States)
This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh and develops methods to generate mesh partitions with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data from runs on thousands of cores of the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Office of Science (SC), Basic Energy Sciences (BES)
- Grant/Contract Number:
- AC02-05CH11231; SC0006723
- OSTI ID:
- 1202396
- Journal Information:
- Procedia Computer Science, Vol. 51, Issue C; Conference: International Conference On Computational Science (ICCS 2015), Reykjavík (Iceland), 1-3 Jun 2015; ISSN 1877-0509
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Web of Science
A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake | journal | January 2016
Progress in Fast, Accurate Multi-scale Climate Simulations | journal | January 2015
Similar Records
Improving Unstructured Mesh Partitions for Multiple Criteria Using Mesh Adjacencies
Ordering unstructured meshes for sparse matrix computations on leading parallel systems