ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Hanyang Univ., Seoul (Korea, Republic of)
- Intel Corporation, Hillsboro, OR (United States)
- Univ. of Utah, Salt Lake City, UT (United States)
As we enter the exascale computing era, using power efficiently and optimizing the performance of scientific applications under power and energy constraints have become both critical and challenging. We propose a low-overhead autotuning framework that tunes the performance and energy of hybrid MPI/OpenMP scientific applications at large scales and explores the tradeoffs between application runtime and power/energy for energy-efficient execution. We use this framework to autotune four ECP proxy applications: XSBench, AMG, SWFFT, and SW4lite. Our approach uses Bayesian optimization with a Random Forest surrogate model to search parameter spaces with up to 6 million different configurations on two large-scale HPC production systems, Theta at Argonne National Laboratory and Summit at Oak Ridge National Laboratory. The experimental results show that the framework has low overhead and scales well. Using it to identify the best configurations, we achieve up to 91.59% performance improvement, up to 21.2% energy savings, and up to 37.84% EDP (energy delay product) improvement on up to 4096 nodes.
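The three metrics reported in the abstract (runtime, energy, and EDP) can be illustrated with a minimal sketch. EDP is the product of energy and runtime. The sketch assumes that "improvement" means the relative reduction of a lower-is-better metric against a baseline run; the abstract does not spell out the exact formula, so treat that as an assumption. The `Measurement` class, the `improvement_pct` helper, and all numbers are hypothetical and are not taken from the paper or from ytopt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One application run: wall-clock time and total energy consumed."""
    runtime_s: float  # seconds
    energy_j: float   # joules

    @property
    def edp(self) -> float:
        # Energy delay product: energy multiplied by runtime (joule-seconds).
        return self.energy_j * self.runtime_s


def improvement_pct(baseline: float, tuned: float) -> float:
    """Relative reduction of a lower-is-better metric, in percent.

    Assumption: improvement is measured against the baseline value.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - tuned) / baseline * 100.0


# Hypothetical measurements for a baseline and a tuned configuration.
baseline = Measurement(runtime_s=100.0, energy_j=20000.0)
tuned = Measurement(runtime_s=80.0, energy_j=18000.0)

runtime_gain = improvement_pct(baseline.runtime_s, tuned.runtime_s)
energy_gain = improvement_pct(baseline.energy_j, tuned.energy_j)
edp_gain = improvement_pct(baseline.edp, tuned.edp)

print(f"runtime improvement: {runtime_gain:.2f}%")
print(f"energy savings:      {energy_gain:.2f}%")
print(f"EDP improvement:     {edp_gain:.2f}%")
```

Because EDP multiplies the two quantities, a configuration that saves a little energy and a little time can show a larger EDP gain than either metric alone, which is why the three figures are reported separately.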
- Research Organization:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Organization:
- National Science Foundation (NSF); USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR). Scientific Discovery through Advanced Computing (SciDAC)
- Grant/Contract Number:
- AC02-06CH11357
- OSTI ID:
- 2475676
- Journal Information:
- Concurrency and Computation. Practice and Experience, Vol. 37, Issue 1; ISSN 1532-0626; ISSN 1532-0634
- Publisher:
- Wiley
- Country of Publication:
- United States
- Language:
- English