Statistical and machine learning models for optimizing energy in parallel applications
Abstract
Rising power costs and constraints are driving a growing focus on the energy efficiency of high performance computing systems. Application users typically find it difficult to assess and control how the unique characteristics of a particular system and workload affect performance and energy efficiency. The settings that optimize performance and those that optimize energy efficiency can also diverge, so we need to identify trade-off options that guide a suitable balance between energy use and performance. We present statistical and machine learning models that require only a small number of runs to make accurate Pareto-optimal trade-off predictions using parameters that users can control. We study model training and validation using several parallel kernels and more complex workloads, including Algebraic Multigrid (AMG), Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH). We demonstrate that we can train the models using as few as 12 runs, with prediction error of less than 10%. Our AMG results identify trade-off options that provide up to 45% improvement in energy efficiency for around 10% performance loss. We reduce the sample measurement time required for AMG by 90%, from 13 h to 74 min.
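To make the notion of a Pareto-optimal trade-off concrete, the following is a minimal sketch, not the authors' models or code. It assumes that some model has already predicted a runtime and an energy value for each user-controllable configuration; the configuration labels and all numbers are hypothetical, chosen only for illustration. A configuration is kept if no other configuration is at least as good in both runtime and energy and strictly better in one of them.

```python
from typing import List, Tuple

# (label, predicted runtime in seconds, predicted energy in joules)
# The tuple layout is an assumption made for this sketch.
Config = Tuple[str, float, float]


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations that are not dominated.

    Lower is better for both runtime and energy. A configuration is
    dominated if another one is no worse in both metrics and strictly
    better in at least one.
    """
    front = []
    for label, t, e in configs:
        dominated = any(
            (t2 <= t and e2 <= e) and (t2 < t or e2 < e)
            for _, t2, e2 in configs
        )
        if not dominated:
            front.append((label, t, e))
    # Sort by runtime so the trade-off curve reads from fastest to slowest.
    return sorted(front, key=lambda c: c[1])


# Hypothetical predictions; these are NOT results from the paper.
predicted = [
    ("A", 100.0, 5000.0),
    ("B", 110.0, 2750.0),
    ("C", 120.0, 2900.0),
    ("D", 105.0, 5200.0),
    ("E", 130.0, 2600.0),
]

front = pareto_front(predicted)
for label, t, e in front:
    print(f"{label}: runtime={t:.0f} s, energy={e:.0f} J")
```

In this made-up data set, configurations C and D are discarded because B and A, respectively, are better in both metrics. The remaining points form the trade-off options from which a user would pick a balance between energy use and performance.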
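- Authors:
- Endrei, Mark; Jin, Chao; Dinh, Minh Ngoc; Abramson, David; Poxon, Heidi; DeRose, Luiz; de Supinski, Bronis R.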
- The Univ. of Queensland, Brisbane (Australia)
- Cray Inc., Bloomington, MN (United States)
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Publication Date:
- 2019-04-25
- Research Org.:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1573175
- Report Number(s):
- LLNL-JRNL-769611; Journal ID: ISSN 1094-3420; 960894
- Grant/Contract Number:
- AC52-07NA27344
- Resource Type:
- Journal Article: Accepted Manuscript
- Journal Name:
- International Journal of High Performance Computing Applications
- Additional Journal Information:
- Journal Volume: 33; Journal Issue: 6; Journal ID: ISSN 1094-3420
- Publisher:
- SAGE
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; energy efficiency; performance; regression modeling; machine learning; high performance computing
Citation Formats
Endrei, Mark, Jin, Chao, Dinh, Minh Ngoc, Abramson, David, Poxon, Heidi, DeRose, Luiz, and de Supinski, Bronis R. Statistical and machine learning models for optimizing energy in parallel applications. United States: N. p., 2019. Web. doi:10.1177/1094342019842915.
Endrei, Mark, Jin, Chao, Dinh, Minh Ngoc, Abramson, David, Poxon, Heidi, DeRose, Luiz, & de Supinski, Bronis R. Statistical and machine learning models for optimizing energy in parallel applications. United States. https://doi.org/10.1177/1094342019842915
Endrei, Mark, Jin, Chao, Dinh, Minh Ngoc, Abramson, David, Poxon, Heidi, DeRose, Luiz, and de Supinski, Bronis R. 2019. "Statistical and machine learning models for optimizing energy in parallel applications". United States. https://doi.org/10.1177/1094342019842915. https://www.osti.gov/servlets/purl/1573175.
@article{osti_1573175,
title = {Statistical and machine learning models for optimizing energy in parallel applications},
author = {Endrei, Mark and Jin, Chao and Dinh, Minh Ngoc and Abramson, David and Poxon, Heidi and DeRose, Luiz and de Supinski, Bronis R.},
doi = {10.1177/1094342019842915},
url = {https://www.osti.gov/biblio/1573175},
journal = {International Journal of High Performance Computing Applications},
issn = {1094-3420},
number = 6,
volume = 33,
place = {United States},
year = {2019},
month = {4}
}