Title: Statistical and machine learning models for optimizing energy in parallel applications

Abstract

Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload and their effect on performance and energy efficiency are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that only require a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), Large-scale Atomic Molecular Massively Parallel Simulator, and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Here, we demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
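
The Pareto-optimal trade-off selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes each candidate configuration has already been assigned a runtime and an energy value (measured or model-predicted), and it keeps only the configurations that no other configuration beats on both measures. The configuration labels and numbers below are hypothetical placeholders, not results from the paper, and `pareto_front` is an illustrative helper rather than the authors' implementation.

```python
from typing import List, Tuple

# (label, runtime in seconds, energy in joules); lower is better for both.
Config = Tuple[str, float, float]


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations that are not dominated by any other one.

    A configuration is dominated when another configuration is no worse in
    both runtime and energy, and strictly better in at least one of them.
    """
    front = []
    for name, t, e in configs:
        dominated = any(
            (t2 <= t and e2 <= e) and (t2 < t or e2 < e)
            for _, t2, e2 in configs
        )
        if not dominated:
            front.append((name, t, e))
    # Sort by runtime so the trade-off curve reads from fastest to slowest.
    return sorted(front, key=lambda c: c[1])


# Hypothetical values for illustration only.
runs = [
    ("32 threads, 2.6 GHz", 100.0, 20000.0),
    ("32 threads, 2.0 GHz", 110.0, 13800.0),
    ("16 threads, 2.6 GHz", 150.0, 21000.0),
    ("16 threads, 2.0 GHz", 160.0, 13500.0),
]

front = pareto_front(runs)
for name, t, e in front:
    print(f"{name}: {t:.0f} s, {e:.0f} J")
```

In this toy data the third configuration is dropped because the first one is both faster and cheaper in energy; the remaining three form the trade-off options a user would choose among.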

Authors:
Endrei, Mark [1]; Jin, Chao [1]; Dinh, Minh Ngoc [1]; Abramson, David [1]; Poxon, Heidi [2]; DeRose, Luiz [2]; de Supinski, Bronis R. [3]
  1. The Univ. of Queensland, Brisbane (Australia)
  2. Cray Inc., Bloomington, MN (United States)
  3. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Publication Date:
2019-04-25
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1573175
Report Number(s):
LLNL-JRNL-769611
Journal ID: ISSN 1094-3420; 960894
Grant/Contract Number:  
AC52-07NA27344
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Volume: 33; Journal Issue: 6; Journal ID: ISSN 1094-3420
Publisher:
SAGE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; energy efficiency; performance; regression modeling; machine learning; high performance computing

Citation Formats

Endrei, Mark, Jin, Chao, Dinh, Minh Ngoc, Abramson, David, Poxon, Heidi, DeRose, Luiz, and de Supinski, Bronis R. Statistical and machine learning models for optimizing energy in parallel applications. United States: N. p., 2019. Web. doi:10.1177/1094342019842915.
Endrei, Mark, Jin, Chao, Dinh, Minh Ngoc, Abramson, David, Poxon, Heidi, DeRose, Luiz, & de Supinski, Bronis R. Statistical and machine learning models for optimizing energy in parallel applications. United States. https://doi.org/10.1177/1094342019842915
Endrei, Mark, Jin, Chao, Dinh, Minh Ngoc, Abramson, David, Poxon, Heidi, DeRose, Luiz, and de Supinski, Bronis R. 2019. "Statistical and machine learning models for optimizing energy in parallel applications". United States. https://doi.org/10.1177/1094342019842915. https://www.osti.gov/servlets/purl/1573175.
@article{osti_1573175,
title = {Statistical and machine learning models for optimizing energy in parallel applications},
author = {Endrei, Mark and Jin, Chao and Dinh, Minh Ngoc and Abramson, David and Poxon, Heidi and DeRose, Luiz and de Supinski, Bronis R.},
abstractNote = {Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. The unique characteristics of a particular system and workload and their effect on performance and energy efficiency are typically difficult for application users to assess and to control. Settings for optimum performance and energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that only require a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), Large-scale Atomic Molecular Massively Parallel Simulator, and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Here, we demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.},
doi = {10.1177/1094342019842915},
url = {https://www.osti.gov/biblio/1573175},
journal = {International Journal of High Performance Computing Applications},
issn = {1094-3420},
number = 6,
volume = 33,
place = {United States},
year = {2019},
month = {4}
}
