On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws

Loffeld, J.; Hittinger, J. A. F.

doi:10.1177/1094342017691876

Abstract

It has been conjectured that higher-order discretizations for partial differential equations will have advantages over the lower-order counterparts commonly used today. The reasoning is that the increase in arithmetic operations will be more than offset by the reduction in data transfers and the increase in concurrent floating-point units. To evaluate this conjecture, the arithmetic intensity of a class of high-order finite-volume discretizations for hyperbolic systems of conservation laws is theoretically analyzed for spatial discretizations from orders three through eight in arbitrary dimensions. Additionally, three cache models are considered: the limiting cases of no cache and infinite cache as well as a finite-sized cache model. Models are validated experimentally by measuring floating-point operations and data transfers on an IBM Blue Gene/Q node. Theory and experiments demonstrate that high-order finite-volume methods will be able to provide increases in arithmetic intensity that will be necessary to make better utilization of on-node floating-point capability.
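The quantity at the center of the abstract, arithmetic intensity, is the ratio of floating-point operations performed to bytes moved between memory and the processor. The sketch below illustrates why raising that ratio matters, using a simple roofline-style bound (the Roofline model appears among the works referenced in this record). The function names, the operation and byte counts, and the machine figures are all hypothetical placeholders chosen for illustration. They are not values from the paper and not measurements from the Blue Gene/Q experiments.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Return floating-point operations per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(intensity: float, peak_gflops: float, bandwidth_gbs: float) -> float:
    """Roofline-style bound: limited by either peak compute or memory bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gbs)


if __name__ == "__main__":
    # Hypothetical machine: 100 GFLOP/s peak, 25 GB/s memory bandwidth.
    peak, bandwidth = 100.0, 25.0

    # Hypothetical per-cell costs for a low-order and a high-order scheme.
    low = arithmetic_intensity(flops=50, bytes_moved=100)    # 0.5 flop/byte
    high = arithmetic_intensity(flops=400, bytes_moved=200)  # 2.0 flop/byte

    for name, ai in (("low-order", low), ("high-order", high)):
        bound = attainable_gflops(ai, peak, bandwidth)
        print(f"{name}: intensity={ai} flop/byte, bound={bound} GFLOP/s")
```

Under these assumed numbers the low-order kernel is bandwidth-bound at 12.5 GFLOP/s, while the higher-intensity kernel could reach 50 GFLOP/s on the same machine. This is the kind of gain the abstract's conjecture refers to.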

Authors:

Loffeld, J. [1]; Hittinger, J. A. F. [1]

[1] Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

Publication Date: February 1, 2017

Research Org.: Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)

Sponsoring Org.: USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

OSTI Identifier: 1860892

Report Number(s): LLNL-JRNL-716535
Journal ID: ISSN 1094-3420; 859004

Grant/Contract Number: AC52-07NA27344

Resource Type: Accepted Manuscript

Journal Name: International Journal of High Performance Computing Applications

Additional Journal Information: Journal Volume: 33; Journal Issue: 1; Journal ID: ISSN 1094-3420

Publisher: SAGE

Country of Publication: United States

Language: English

Subject: 97 MATHEMATICS AND COMPUTING; arithmetic intensity; high-order finite-volume methods; hyperbolic systems of conservation laws; processor-memory performance gap; algorithmic balance

Citation Formats


Loffeld, J., and Hittinger, J. A. F. On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws. United States: N. p., 2017. Web. doi:10.1177/1094342017691876.

Loffeld, J., & Hittinger, J. A. F. On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws. United States. https://doi.org/10.1177/1094342017691876

Loffeld, J., and Hittinger, J. A. F. 2017. "On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws". United States. https://doi.org/10.1177/1094342017691876. https://www.osti.gov/servlets/purl/1860892.

@article{osti_1860892,

  title        = {On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws},

  author       = {Loffeld, J. and Hittinger, J. A. F.},

  abstractNote = {It has been conjectured that higher-order discretizations for partial differential equations will have advantages over the lower-order counterparts commonly used today. The reasoning is that the increase in arithmetic operations will be more than offset by the reduction in data transfers and the increase in concurrent floating-point units. To evaluate this conjecture, the arithmetic intensity of a class of high-order finite-volume discretizations for hyperbolic systems of conservation laws is theoretically analyzed for spatial discretizations from orders three through eight in arbitrary dimensions. Additionally, three cache models are considered: the limiting cases of no cache and infinite cache as well as a finite-sized cache model. Models are validated experimentally by measuring floating-point operations and data transfers on an IBM Blue Gene/Q node. Theory and experiments demonstrate that high-order finite-volume methods will be able to provide increases in arithmetic intensity that will be necessary to make better utilization of on-node floating-point capability.},

  doi          = {10.1177/1094342017691876},

  journal      = {International Journal of High Performance Computing Applications},

  number       = 1,

  volume       = 33,

  place        = {United States},

  year         = {2017},

  month        = {2}

}

Journal Article:

Free Publicly Available Full Text

Accepted Manuscript (DOE)

Publisher's Version of Record

https://doi.org/10.1177/1094342017691876


Works referenced in this record:

Improving the ratio of memory operations to floating-point operations in loops
journal, November 1994

Carr, Steve; Kennedy, Ken
ACM Transactions on Programming Languages and Systems, Vol. 16, Issue 6
DOI: 10.1145/197320.197366

A framework for hybrid parallel flow simulations with a trillion cells in complex geometries
conference, January 2013

Godenschwager, Christian; Schornbaum, Florian; Bauer, Martin
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13
DOI: 10.1145/2503210.2503273

11 PFLOP/s simulations of cloud cavitation collapse
conference, January 2013

Rossinelli, Diego; Koumoutsakos, Petros; Hejazialhosseini, Babak
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13
DOI: 10.1145/2503210.2504565

Understanding Application Performance via Micro-benchmarks on Three Large Supercomputers: Intrepid, Ranger and Jaguar
journal, May 2010

Bhatelé, Abhinav; Wesolowski, Lukasz; Bohm, Eric
The International Journal of High Performance Computing Applications, Vol. 24, Issue 4
DOI: 10.1177/1094342010370603

Estimating interlock and improving balance for pipelined architectures
journal, August 1988

Callahan, David; Cocke, John; Kennedy, Ken
Journal of Parallel and Distributed Computing, Vol. 5, Issue 4
DOI: 10.1016/0743-7315(88)90002-0

Multicore/Multi-GPU Accelerated Simulations of Multiphase Compressible Flows Using Wavelet Adapted Grids
journal, January 2011

Rossinelli, Diego; Hejazialhosseini, Babak; Spampinato, Daniele G.
SIAM Journal on Scientific Computing, Vol. 33, Issue 2
DOI: 10.1137/100795930

I/O complexity: The red-blue pebble game
conference, January 1981

Jia-Wei, Hong; Kung, H. T.
Proceedings of the thirteenth annual ACM symposium on Theory of computing - STOC '81
DOI: 10.1145/800076.802486

A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers
conference, November 2014

Olschanowsky, Catherine; Strout, Michelle Mills; Guzik, Stephen
SC14: International Conference for High Performance Computing, Networking, Storage and Analysis
DOI: 10.1109/SC.2014.70

Performance evaluations of gyrokinetic Eulerian code GT5D on massively parallel multi-core platforms
conference, January 2011

Idomura, Yasuhiro; Jolliet, Sébastien
State of the Practice Reports on - SC '11
DOI: 10.1145/2063348.2063354

An Optimizing Code Generator for a Class of Lattice-Boltzmann Computations
journal, July 2015

Pananilath, Irshad; Acharya, Aravind; Vasista, Vinay
ACM Transactions on Architecture and Code Optimization, Vol. 12, Issue 2
DOI: 10.1145/2739047

Comparison of accurate methods for the integration of hyperbolic equations
journal, January 1972

Kreiss, Heinz-Otto; Oliger, Joseph
Tellus, Vol. 24, Issue 3
DOI: 10.3402/tellusa.v24i3.10634

Quantifying Performance Bottlenecks of Stencil Computations Using the Execution-Cache-Memory Model
conference, January 2015

Stengel, Holger; Treibig, Jan; Hager, Georg
Proceedings of the 29th ACM on International Conference on Supercomputing - ICS '15
DOI: 10.1145/2751205.2751240

Essentially non-oscillatory and weighted essentially non-oscillatory schemes for hyperbolic conservation laws
book, January 1998

Shu, Chi-Wang
Lecture Notes in Mathematics
DOI: 10.1007/BFb0096355

Compiler-Directed Transformation for Higher-Order Stencils
conference, May 2015

Basu, Protonu; Hall, Mary; Williams, Samuel
2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/IPDPS.2015.103

Communication lower bounds and optimal algorithms for numerical linear algebra
journal, May 2014

Ballard, G.; Carson, E.; Demmel, J.
Acta Numerica, Vol. 23
DOI: 10.1017/S0962492914000038

High-order finite-volume methods for hyperbolic conservation laws on mapped multiblock grids
journal, May 2015

McCorquodale, P.; Dorr, M. R.; Hittinger, J. A. F.
Journal of Computational Physics, Vol. 288
DOI: 10.1016/j.jcp.2015.01.006

On the GPU performance of cell-centered finite volume method over unstructured tetrahedral meshes
conference, January 2013

Langguth, Johannes; Wu, Nan; Chai, Jun
Proceedings of the 3rd Workshop on Irregular Applications Architectures and Algorithms - IA^3 '13
DOI: 10.1145/2535753.2535765

Weighted Essentially Non-oscillatory Schemes
journal, November 1994

Liu, Xu-Dong; Osher, Stanley; Chan, Tony
Journal of Computational Physics, Vol. 115, Issue 1
DOI: 10.1006/jcph.1994.1187

Fully multidimensional flux-corrected transport algorithms for fluids
journal, June 1979

Zalesak, Steven T.
Journal of Computational Physics, Vol. 31, Issue 3
DOI: 10.1016/0021-9991(79)90051-2

Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms
journal, September 2009

Williams, Samuel; Carter, Jonathan; Oliker, Leonid
Journal of Parallel and Distributed Computing, Vol. 69, Issue 9
DOI: 10.1016/j.jpdc.2009.04.002

Solving the compressible Navier-Stokes equations on up to 1.97 million cores and 4.1 trillion grid points
conference, November 2013

Bermejo-Moreno, Iván; Bodart, Julien; Larsson, Johan
SC13: International Conference for High Performance Computing, Networking, Storage and Analysis, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
DOI: 10.1145/2503210.2503265

A 30 Year Retrospective on Dennard's MOSFET Scaling Paper
journal, January 2007

Bohr, Mark
IEEE Solid-State Circuits Newsletter, Vol. 12, Issue 1
DOI: 10.1109/N-SSC.2007.4785534

A high-order finite-volume method for conservation laws on locally refined grids
journal, January 2011

McCorquodale, Peter; Colella, Phillip
Communications in Applied Mathematics and Computational Science, Vol. 6, Issue 1
DOI: 10.2140/camcos.2011.6.1

The Piecewise Parabolic Method (PPM) for gas-dynamical simulations
journal, April 1984

Colella, Phillip; Woodward, Paul R.
Journal of Computational Physics, Vol. 54, Issue 1
DOI: 10.1016/0021-9991(84)90143-8

Strong Stability-Preserving High-Order Time Discretization Methods
journal, January 2001

Gottlieb, Sigal; Shu, Chi-Wang; Tadmor, Eitan
SIAM Review, Vol. 43, Issue 1
DOI: 10.1137/S003614450036757X

Hierarchical N-body Simulations with Autotuning for Heterogeneous Systems
journal, May 2012

Yokota, Rio; Barba, Lorena
Computing in Science & Engineering, Vol. 14, Issue 3
DOI: 10.1109/MCSE.2012.1

A performance analysis framework for identifying potential benefits in GPGPU applications
journal, September 2012

Sim, Jaewoong; Dasgupta, Aniruddha; Kim, Hyesoon
ACM SIGPLAN Notices, Vol. 47, Issue 8
DOI: 10.1145/2370036.2145819

Performance modeling of serial and parallel implementations of the fractional Adams-Bashforth-Moulton method
journal, June 2014

Zhang, Wei; Wei, Wenjie; Cai, Xing
Fractional Calculus and Applied Analysis, Vol. 17, Issue 3
DOI: 10.2478/s13540-014-0189-x

Managing application complexity in the SAMRAI object-oriented framework
journal, January 2002

Hornung, Richard D.; Kohn, Scott R.
Concurrency and Computation: Practice and Experience, Vol. 14, Issue 5
DOI: 10.1002/cpe.652

Efficient Implementation of Weighted ENO Schemes
journal, June 1996

Jiang, Guang-Shan; Shu, Chi-Wang
Journal of Computational Physics, Vol. 126, Issue 1
DOI: 10.1006/jcph.1996.0130

High-order, finite-volume methods in mapped coordinates
journal, April 2011

Colella, P.; Dorr, M. R.; Hittinger, J. A. F.
Journal of Computational Physics, Vol. 230, Issue 8
DOI: 10.1016/j.jcp.2010.12.044

High throughput software for direct numerical simulations of compressible two-phase flows
conference, November 2012

Hejazialhosseini, Babak; Rossinelli, Diego; Conti, Christian
2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis
DOI: 10.1109/SC.2012.66

Roofline: an insightful visual performance model for multicore architectures
journal, April 2009

Williams, Samuel; Waterman, Andrew; Patterson, David
Communications of the ACM, Vol. 52, Issue 4
DOI: 10.1145/1498765.1498785

An efficient mixed-precision, hybrid CPU–GPU implementation of a nonlinearly implicit one-dimensional particle-in-cell algorithm
journal, June 2012

Chen, G.; Chacón, L.; Barnes, D. C.
Journal of Computational Physics, Vol. 231, Issue 16
DOI: 10.1016/j.jcp.2012.04.040

Flux-corrected transport. I. SHASTA, a fluid transport algorithm that works
journal, January 1973

Boris, Jay P.; Book, David L.
Journal of Computational Physics, Vol. 11, Issue 1
DOI: 10.1016/0021-9991(73)90147-2

Similar Records in DOE PAGES and OSTI.GOV collections:

A high-order finite-volume method for hyperbolic conservation laws on locally-refined grids

Journal Article: McCorquodale, Peter; Colella, Phillip - Communications in Applied Mathematics and Computational Science

We present a fourth-order accurate finite-volume method for solving time-dependent hyperbolic systems of conservation laws on Cartesian grids with multiple levels of refinement. The underlying method is a generalization of that in [5] to nonlinear systems, and is based on using fourth-order accurate quadratures for computing fluxes on faces, combined with fourth-order accurate Runge-Kutta discretization in time. To interpolate boundary conditions at refinement boundaries, we interpolate in time in a manner consistent with the individual stages of the Runge-Kutta method, and interpolate in space by solving a least-squares problem over a neighborhood of each target cell for the coefficients of ...

An assessment of semi-discrete central schemes for hyperbolic conservation laws.

Technical Report: Christon, Mark Allen; Robinson, Allen Conrad; Ketcheson, David Isaac

High-resolution finite volume methods for solving systems of conservation laws have been widely embraced in research areas ranging from astrophysics to geophysics and aero-thermodynamics. These methods are typically at least second-order accurate in space and time, deliver non-oscillatory solutions in the presence of near discontinuities, e.g., shocks, and introduce minimal dispersive and diffusive effects. High-resolution methods promise to provide greatly enhanced solution methods for Sandia's mainstream shock hydrodynamics and compressible flow applications, and they admit the possibility of a generalized framework for treating multi-physics problems such as the coupled hydrodynamics, electro-magnetics and radiative transport found in Z pinch physics. In ...
https://doi.org/10.2172/918357

Data Locality Enhancement of Dynamic Simulations for Exascale Computing (Final Report)

Technical Report: Shen, Xipeng

The development of modern processors exhibits two trends that complicate the optimizations of modern software. The first is the increasing sensitivity of processors' throughput to irregularities in computation. With more processors produced through a massive integration of simple cores, future systems will increasingly favor regular data-level parallel computations, but deviate from the needs of applications with complex patterns. Some evidences are already shown on Graphic Processing Units (GPU): Irregular data accesses (e.g., indirect references A[D[i]]) and conditional branches are limiting many GPU applications' performance at a level an order of magnitude lower than the peak of GPU. The second hardware ...
https://doi.org/10.2172/1576175

Matrix-free preconditioning for high-order H(curl) discretizations

Journal Article: Barker, Andrew T.; Kolev, Tzanio - Numerical Linear Algebra with Applications

The greater arithmetic intensity of high‐order finite element discretizations makes them attractive for implementation on next‐generation hardware, but assembly of high‐order finite element operators as matrices is prohibitively expensive. As a result, the development of general algebraic solvers for such operators has been an open research challenge. Fast matrix‐free application of high‐order operators has received significant attention in the literature in the context of Poisson‐type problems, but preconditioners and solvers for inverting more general operators are not very well‐developed. In this paper, we consider the problem of preconditioning a definite Maxwell operator at high polynomial order without assembling a ...
https://doi.org/10.1002/nla.2348

A cartesian grid embedded boundary method for hyperbolic conservation laws

Journal Article: Colella, Phillip; Graves, Daniel T; Keen, Benjamin J; ... - Journal of Computational Physics

We present a second-order Godunov algorithm to solve time-dependent hyperbolic systems of conservation laws on irregular domains. Our approach is based on a formally consistent discretization of the conservation laws on a finite-volume grid obtained from intersecting the domain with a Cartesian grid. We address the small-cell stability problem associated with such methods by hybridizing our conservative discretization with a stable, nonconservative discretization at irregular control volumes, and redistributing the difference in the mass increments to nearby cells in a way that preserves stability and local conservation. The resulting method is second-order accurate in L^1 for smooth problems, and ...
