DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Understanding GPU Power. A Survey of Profiling, Modeling, and Simulation Methods

Abstract

Modern graphics processing units (GPUs) have complex architectures that admit exceptional performance and energy efficiency for high-throughput applications. Though GPUs consume large amounts of power, their use for high-throughput applications facilitates state-of-the-art energy efficiency and performance. Consequently, continued development relies on understanding their power consumption. Our work is a survey of GPU power modeling and profiling methods, with increased detail on noteworthy efforts. Moreover, as direct measurement of GPU power is necessary for model evaluation and parameter initiation, internal and external power sensors are discussed. Hardware counters, which are low-level tallies of hardware events, correlate strongly with power use and performance. Statistical correlation between power and performance counters has yielded worthwhile GPU power models, yet the complexity inherent to GPU architectures presents new hurdles for power modeling. Developments and challenges of counter-based GPU power modeling are discussed. Often building on the counter-based models, research efforts for GPU power simulation, which make power predictions from input code and hardware knowledge, provide opportunities for optimization in programming or architectural design. Noteworthy strides in power simulations for GPUs are included along with their performance or functional simulator counterparts when appropriate. Lastly, possible directions for future research are discussed.
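
The counter-based modeling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the surveyed papers: it assumes the simplest possible form, a linear model fitted by ordinary least squares, and every name in it (`fit_linear_power_model`, `predict_power`, the two counter rates, and the wattage values) is hypothetical. The sample data are synthetic, generated from the made-up relation 40 + 30*compute + 50*memory, and are not measurements.

```python
# Minimal sketch of a counter-based power model (illustrative assumptions only).
# Model form: power = b0 + b1 * counter_1 + ... + bk * counter_k,
# fitted by ordinary least squares using only the standard library.


def solve_linear_system(a, c):
    """Solve a * b = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    m = [row[:] + [ci] for row, ci in zip(a, c)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: counters are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    b = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][k] * b[k] for k in range(i + 1, n))
        b[i] = (m[i][n] - tail) / m[i][i]
    return b


def fit_linear_power_model(counter_rows, measured_watts):
    """Return [b0, b1, ..., bk] minimizing squared error on the samples."""
    x = [[1.0] + list(row) for row in counter_rows]  # intercept column first
    n = len(x[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(x, measured_watts)) for i in range(n)]
    return solve_linear_system(xtx, xty)


def predict_power(coeffs, counters):
    """Evaluate the fitted model on one vector of counter rates."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], counters))


# Synthetic samples: (compute rate, memory rate) per kernel, hypothetical units.
SAMPLE_COUNTERS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2)]
# Synthetic "measured" power in watts, built from 40 + 30*compute + 50*memory.
SAMPLE_WATTS = [40.0, 70.0, 90.0, 120.0, 65.0]

if __name__ == "__main__":
    coeffs = fit_linear_power_model(SAMPLE_COUNTERS, SAMPLE_WATTS)
    print("fitted coefficients:", coeffs)
    print("predicted watts for (2.0, 0.5):", predict_power(coeffs, (2.0, 0.5)))
```

In practice the power readings would come from an internal or external sensor and the counter values from a profiler; real GPU models typically need many more counters and often nonlinear methods, which is part of the difficulty the abstract alludes to.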

Authors:
 Bridges, Robert A. [1]; Imam, Neena [1]; Mintz, Tiffany M. [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
2016-09-01
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE
OSTI Identifier:
1326472
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
ACM Computing Surveys
Additional Journal Information:
Journal Volume: 49; Journal Issue: 3; Journal ID: ISSN 0360-0300
Publisher:
Association for Computing Machinery (ACM)
Country of Publication:
United States
Language:
English
Subject:
29 ENERGY PLANNING, POLICY, AND ECONOMY; 32 ENERGY CONSERVATION, CONSUMPTION, AND UTILIZATION; gpu; power; energy; profiling; simulator

Citation Formats

Bridges, Robert A., Imam, Neena, and Mintz, Tiffany M. Understanding GPU Power. A Survey of Profiling, Modeling, and Simulation Methods. United States: N. p., 2016. Web. doi:10.1145/2962131.
Bridges, Robert A., Imam, Neena, & Mintz, Tiffany M. Understanding GPU Power. A Survey of Profiling, Modeling, and Simulation Methods. United States. https://doi.org/10.1145/2962131
Bridges, Robert A., Imam, Neena, and Mintz, Tiffany M. 2016. "Understanding GPU Power. A Survey of Profiling, Modeling, and Simulation Methods". United States. https://doi.org/10.1145/2962131. https://www.osti.gov/servlets/purl/1326472.
@article{osti_1326472,
title = {Understanding GPU Power. A Survey of Profiling, Modeling, and Simulation Methods},
author = {Bridges, Robert A. and Imam, Neena and Mintz, Tiffany M.},
doi = {10.1145/2962131},
journal = {ACM Computing Surveys},
number = 3,
volume = 49,
place = {United States},
year = {2016},
month = {9}
}

Citation Metrics:
Cited by: 62 works
Citation information provided by
Web of Science
