Title: A massively parallel and memory-efficient FEM toolbox with a hybrid total FETI solver with accelerator support

Abstract

In this article, we present the ExaScale PaRallel finite element tearing and interconnecting SOlver (ESPRESO) finite element method (FEM) library, which includes an FEM toolbox with interfaces to professional and open-source simulation tools, and a massively parallel hybrid total finite element tearing and interconnecting (HTFETI) solver that can fully utilize the Oak Ridge Leadership Computing Facility Titan supercomputer and achieve superlinear scaling. This article presents several new techniques for finite element tearing and interconnecting (FETI) solvers designed for efficient utilization of supercomputers, with a focus on (i) performance, where we present a fivefold reduction of solver runtime for the Laplace equation by redesigning the FETI solver and offloading the key workload to the accelerator, and compare Intel Xeon Phi 7120p and Tesla K80 and P100 accelerators to Intel Xeon E5-2680v3 and Xeon Phi 7210 central processing units; and (ii) memory efficiency, where we present two techniques that increase the efficiency of the HTFETI solver 1.8 times and raise the largest problem that ESPRESO can solve from 124 to 223 billion unknowns for problems with unstructured meshes. Finally, we show that by dynamically tuning hardware parameters, we can reduce energy consumption by up to 33%.

Authors:
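Riha, Lubomir [1]; Merta, Michal [2]; Vavrik, Radim [1]; Brzobohaty, Tomas [1]; Markopoulos, Alexandros [1]; Meca, Ondrej [1]; Vysocky, Ondrej [1]; Kozubek, Tomas [1]; Vondrak, Vit [1]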
  1. IT4Innovations, VSB-Technical University of Ostrava, Ostrava, Czech Republic
  2. IT4Innovations, VSB-Technical University of Ostrava, Ostrava, Czech Republic; Department of Applied Mathematics, VSB-Technical University of Ostrava, Ostrava, Czech Republic
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); UT-Battelle LLC/ORNL, Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1565782
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Journal Article
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Volume: 33; Journal Issue: 4; Journal ID: ISSN 1094-3420
Publisher:
SAGE
Country of Publication:
United States
Language:
English
Subject:
Computer Science

Citation Formats

Riha, Lubomir, Merta, Michal, Vavrik, Radim, Brzobohaty, Tomas, Markopoulos, Alexandros, Meca, Ondrej, Vysocky, Ondrej, Kozubek, Tomas, and Vondrak, Vit. A massively parallel and memory-efficient FEM toolbox with a hybrid total FETI solver with accelerator support. United States: N. p., 2018. Web. doi:10.1177/1094342018798452.
Riha, Lubomir, Merta, Michal, Vavrik, Radim, Brzobohaty, Tomas, Markopoulos, Alexandros, Meca, Ondrej, Vysocky, Ondrej, Kozubek, Tomas, & Vondrak, Vit. A massively parallel and memory-efficient FEM toolbox with a hybrid total FETI solver with accelerator support. United States. doi:10.1177/1094342018798452.
Riha, Lubomir, Merta, Michal, Vavrik, Radim, Brzobohaty, Tomas, Markopoulos, Alexandros, Meca, Ondrej, Vysocky, Ondrej, Kozubek, Tomas, and Vondrak, Vit. Wed . "A massively parallel and memory-efficient FEM toolbox with a hybrid total FETI solver with accelerator support". United States. doi:10.1177/1094342018798452.
@article{osti_1565782,
title = {A massively parallel and memory-efficient FEM toolbox with a hybrid total FETI solver with accelerator support},
author = {Riha, Lubomir and Merta, Michal and Vavrik, Radim and Brzobohaty, Tomas and Markopoulos, Alexandros and Meca, Ondrej and Vysocky, Ondrej and Kozubek, Tomas and Vondrak, Vit},
abstractNote = {In this article, we present the ExaScale PaRallel finite element tearing and interconnecting SOlver (ESPRESO) finite element method (FEM) library, which includes an FEM toolbox with interfaces to professional and open-source simulation tools, and a massively parallel hybrid total finite element tearing and interconnecting (HTFETI) solver that can fully utilize the Oak Ridge Leadership Computing Facility Titan supercomputer and achieve superlinear scaling. This article presents several new techniques for finite element tearing and interconnecting (FETI) solvers designed for efficient utilization of supercomputers, with a focus on (i) performance, where we present a fivefold reduction of solver runtime for the Laplace equation by redesigning the FETI solver and offloading the key workload to the accelerator, and compare Intel Xeon Phi 7120p and Tesla K80 and P100 accelerators to Intel Xeon E5-2680v3 and Xeon Phi 7210 central processing units; and (ii) memory efficiency, where we present two techniques that increase the efficiency of the HTFETI solver 1.8 times and raise the largest problem that ESPRESO can solve from 124 to 223 billion unknowns for problems with unstructured meshes. Finally, we show that by dynamically tuning hardware parameters, we can reduce energy consumption by up to 33\%.},
doi = {10.1177/1094342018798452},
journal = {International Journal of High Performance Computing Applications},
issn = {1094-3420},
number = 4,
volume = 33,
place = {United States},
year = {2018},
month = {9}
}
