Performance Models for the Spike Banded Linear System Solver

Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; Grama, Ananth

doi:10.1155/2011/426421


Abstract

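With the availability of large-scale parallel platforms comprising tens of thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing the performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing the performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners compared to the state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, and (iii) we show excellent prediction capabilities of our model, based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated on diverse heterogeneous multiclusters, platforms for which performance prediction is particularly challenging. Finally, we predict the scalability of the Spike algorithm using up to 65,536 cores with our model. In this paper we extend the results presented in the Ninth International Symposium on Parallel and Distributed Computing.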
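
As a rough illustration of what "pseudo-analytical" can mean in practice, the sketch below fits the coefficients of a simple analytical cost expression to measured runtimes and then extrapolates to a larger processor count. Everything in it is an assumption made for illustration: the two cost terms (a perfectly divisible work term and a logarithmic communication term), the function names `fit_two_term_model` and `predict`, and the synthetic timings. It is not the model derived in the paper, which characterizes each phase of the Truncated Spike solver separately.

```python
import math


def fit_two_term_model(samples):
    """Least-squares fit of T(p) = a * (1/p) + b * log2(p).

    samples: list of (p, measured_seconds) pairs.
    The two basis terms are illustrative assumptions (divisible work plus a
    logarithmic communication term), not the cost model from the paper.
    """
    # Accumulate the 2x2 normal equations (X^T X) c = X^T y.
    s11 = s12 = s22 = r1 = r2 = 0.0
    for p, t in samples:
        x1, x2 = 1.0 / p, math.log2(p)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * t
        r2 += x2 * t
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("need measurements at two or more distinct processor counts")
    # Solve the 2x2 system with Cramer's rule.
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def predict(a, b, p):
    """Evaluate the fitted model at processor count p."""
    return a / p + b * math.log2(p)


# Synthetic "measurements" generated from made-up coefficients a = 64.0, b = 0.05.
measured = [(p, 64.0 / p + 0.05 * math.log2(p)) for p in (2, 4, 8, 16, 32)]
a, b = fit_two_term_model(measured)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"predicted T(65536) = {predict(a, b, 65536):.4f} s")
```

The design point is that the functional form comes from analysis while the constants come from measurement on the target platform, so a handful of small runs can be used to estimate behavior at core counts that were never run.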

Authors:

Manguoglu, Murat [1]; Saied, Faisal [2]; Sameh, Ahmed [2]; Grama, Ananth [2]

[1] Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
[2] Department of Computer Science, Purdue University, West Lafayette, IN, USA

Publication Date: 2011-01-01

Sponsoring Org.: USDOE

OSTI Identifier: 1243136

Grant/Contract Number: FC52-08NA28617

Resource Type: Published Article

Journal Name: Scientific Programming

Additional Journal Information: Journal Name: Scientific Programming; Journal Volume: 19; Journal Issue: 1; Journal ID: ISSN 1058-9244

Publisher: Hindawi Publishing Corporation

Country of Publication: Egypt

Language: English

Citation Formats


Manguoglu, Murat, Saied, Faisal, Sameh, Ahmed, and Grama, Ananth. Performance Models for the Spike Banded Linear System Solver. Egypt: N. p., 2011. Web. doi:10.1155/2011/426421.



Manguoglu, Murat, Saied, Faisal, Sameh, Ahmed, & Grama, Ananth. Performance Models for the Spike Banded Linear System Solver. Egypt. https://doi.org/10.1155/2011/426421



Manguoglu, Murat, Saied, Faisal, Sameh, Ahmed, and Grama, Ananth. 2011. "Performance Models for the Spike Banded Linear System Solver". Egypt. https://doi.org/10.1155/2011/426421.



                    
@article{osti_1243136,
  title        = {Performance Models for the Spike Banded Linear System Solver},
  author       = {Manguoglu, Murat and Saied, Faisal and Sameh, Ahmed and Grama, Ananth},
  abstractNote = {With the availability of large-scale parallel platforms comprising tens of thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing the performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing the performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners compared to the state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, and (iii) we show excellent prediction capabilities of our model, based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated on diverse heterogeneous multiclusters, platforms for which performance prediction is particularly challenging. Finally, we predict the scalability of the Spike algorithm using up to 65,536 cores with our model. In this paper we extend the results presented in the Ninth International Symposium on Parallel and Distributed Computing.},
  doi          = {10.1155/2011/426421},
  journal      = {Scientific Programming},
  number       = 1,
  volume       = 19,
  place        = {Egypt},
  year         = {2011},
  month        = {1}
}


Journal Article:

Free Publicly Available Full Text

Publisher's Version of Record
https://doi.org/10.1155/2011/426421


Citation Metrics:

Cited by: 4 works

Citation information provided by Web of Science


Similar Records in DOE PAGES and OSTI.GOV collections:

Linear and Nonlinear Solvers for Simulating Multiphase Flow within Large-Scale Engineered Subsurface Systems

Journal Article: Park, Heeho D.; Hammond, Glenn E.; Valocchi, Albert J.; ... - Advances in Water Resources

Simulation of multiphase flow in the subsurface is well known to be computationally challenging. While there have been many studies that have explored approaches to overcoming these challenges, they often utilize relatively simple case studies. In this paper, we focus on the unique numerical challenges posed by modeling large-scale engineered subsurface systems, characterized by discrete features embedded in a heterogeneous natural subsurface setting. The man-made features such as shafts, tunnels, and barriers often cause multiple challenges in modeling the domain for multiphase porous media flow. This flow scenario can have a wide range of applications such as nuclear waste repositories, enhanced ...
https://doi.org/10.1016/j.advwatres.2021.104029

Hierarchical Petascale Simulation Framework For Stress Corrosion Cracking

Technical Report: Grama, Ananth

A number of major accomplishments resulted from the project. These include:
• Data Structures, Algorithms, and Numerical Methods for Reactive Molecular Dynamics. We have developed a range of novel data structures, algorithms, and solvers (amortized ILU, Spike) for use with ReaxFF and charge equilibration.
• Parallel Formulations of Reactive MD (Purdue Reactive Molecular Dynamics Package, PuReMD, PuReMD-GPU, and PG-PuReMD) for Messaging, GPU, and GPU Cluster Platforms. We have developed efficient serial, parallel (MPI), GPU (CUDA), and GPU Cluster (MPI/CUDA) implementations. Our implementations have been demonstrated to be significantly better than the state of the art, both in terms of performance and scalability. ...
https://doi.org/10.2172/1111099

An overlapping Domain Decomposition preconditioning method for monolithic solution of shear bands

Journal Article: Berger-Vergiat, Luc; Waisman, Haim - Computer Methods in Applied Mechanics and Engineering

Metals subjected to high strain rate impact often exhibit a sudden and profound drop in the material’s load bearing capability, a ductile failure phenomenon known as shear banding. Shear bands, characterized as material instabilities, are driven by shear heating and described as narrow regions that have sustained intense plastic deformation and high temperature rise. This coupled thermo-mechanical localization problem can be formulated as a nonlinear system with two balance equations, momentum and energy, and two constitutive laws for elasticity and plasticity. Here in our formulation, mixed finite elements are used to discretize the equations in space and an implicit finite ...
Cited by 5
https://doi.org/10.1016/j.cma.2016.12.029

Preconditioned least-squares Petrov–Galerkin reduced order models

Journal Article: Lindsay, Payton; Fike, Jeffrey; Tezaur, Irina; ... - International Journal for Numerical Methods in Engineering

In this article, we introduce a methodology for improving the accuracy and efficiency of reduced order models (ROMs) constructed using the least-squares Petrov–Galerkin (LSPG) projection method through the introduction of preconditioning. Unlike prior related work, which focuses on preconditioning the linear systems arising within the ROM numerical solution procedure to improve linear solver performance, our approach leverages a preconditioning matrix directly within the minimization problem underlying the LSPG formulation. Applying preconditioning in this way has the potential to improve ROM accuracy for several reasons. First, preconditioning the LSPG formulation changes the norm defining the residual minimization, which can improve ...
https://doi.org/10.1002/nme.7056

Single-Bubble Flow Boiling Phenomena - Interface Tracking Simulation

Journal Article: Li, Mengnan; Bolotnov, Igor A. - Transactions of the American Nuclear Society

Boiling, as one of the most efficient heat transfer mechanisms, is widely used in various engineering systems. Better understanding and modeling of this process remains a major challenge in multiphase flow research. The distribution of vapor in a boiling system affects the heat transfer rate and may cause unfavorable conditions, such as heater burn-out. A number of flow regimes have been observed experimentally. In general, there are three regimes under boiling conditions: nucleate boiling, transition boiling, and film boiling. The nucleate boiling regime is categorized into partial boiling regime and fully developed nucleate boiling, according to the behavior of bubble dynamics and ...
