GPU cache management based on locality type detection
Abstract
Wavefront loading in a processor is managed by monitoring a selected wavefront of a set of wavefronts. Reuse of memory access requests for the selected wavefront is counted, and a cache hit rate in one or more caches of the processor is determined from the counted reuse. Based on the cache hit rate, subsequent memory access requests of the other wavefronts in the set are modified to include a type of reuse of cache lines in requests to the caches. The caches then store data according to the type of reuse indicated by those requests: reused cache lines are protected by preventing their contents from being replaced by another cache line for the duration of processing the set of wavefronts, and the caches are bypassed for streaming access requests.
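For readers who want a concrete picture of the mechanism the abstract describes, the sketch below models it in plain C++. It is only an illustrative approximation, not the patented hardware design: the `ReuseProfiler`, `MemRequest`, and `ToyCache` types, the 0.25 hit-rate threshold, and the fully associative cache model are all assumptions introduced here. The profiler counts reuse for a single monitored wavefront and classifies the locality type; that type is then attached to later requests, and the cache either protects reused lines from replacement or lets streaming requests bypass it.

```cpp
// Minimal C++ sketch of the locality-type detection idea summarized in the
// abstract. All names, thresholds, and data structures here are hypothetical
// illustrations, not the patented hardware implementation.
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

enum class ReuseType { Unknown, Streaming, Reused };

// Profiles the one monitored wavefront: counts repeated accesses to the same
// cache-line address and derives an estimated hit rate from that reuse.
class ReuseProfiler {
public:
    void observe(uint64_t lineAddr) {
        ++accesses_;
        if (!seen_.insert(lineAddr).second) ++reuses_;  // address seen before
    }
    double estimatedHitRate() const {
        return accesses_ ? static_cast<double>(reuses_) / accesses_ : 0.0;
    }
    // Classify locality for the whole wavefront set from the monitored
    // wavefront's hit rate (the 0.25 threshold is an arbitrary placeholder).
    ReuseType classify(double threshold = 0.25) const {
        return estimatedHitRate() >= threshold ? ReuseType::Reused
                                               : ReuseType::Streaming;
    }
private:
    std::unordered_set<uint64_t> seen_;
    uint64_t accesses_ = 0;
    uint64_t reuses_ = 0;
};

// A memory request tagged with the reuse type detected for its wavefront set.
struct MemRequest {
    uint64_t lineAddr;
    ReuseType reuse = ReuseType::Unknown;
};

// Toy fully associative cache: lines filled by "Reused" requests are
// protected from replacement until the wavefront set finishes, while
// "Streaming" requests bypass the cache entirely.
class ToyCache {
public:
    explicit ToyCache(std::size_t lines) : capacity_(lines) {}

    bool access(const MemRequest& req) {
        if (req.reuse == ReuseType::Streaming) return false;  // bypass
        if (lines_.count(req.lineAddr)) return true;          // hit
        insert(req);                                          // miss: fill
        return false;
    }
    // Called when the wavefront set completes: lift all protections.
    void endOfWavefrontSet() {
        for (auto& entry : lines_) entry.second = false;
    }

private:
    void insert(const MemRequest& req) {
        if (lines_.size() >= capacity_) {
            // Evict any unprotected line; protected (reused) lines stay put.
            for (auto it = lines_.begin(); it != lines_.end(); ++it) {
                if (!it->second) { lines_.erase(it); break; }
            }
            if (lines_.size() >= capacity_) return;  // everything protected
        }
        lines_[req.lineAddr] = (req.reuse == ReuseType::Reused);
    }
    std::size_t capacity_;
    std::unordered_map<uint64_t, bool> lines_;  // line address -> protected?
};
```

In this model the remaining wavefronts of the set would tag each MemRequest with the ReuseType returned by classify(), mirroring how the abstract describes subsequent requests being modified before they reach the caches.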
- Inventors:
- Zhang, Xianwei; Kalamatianos, John; Beckmann, Bradford
- Issue Date:
- Research Org.:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States); Advanced Micro Devices, Inc., Santa Clara, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1986767
- Patent Number(s):
- 11487671
- Application Number:
- 16/446,119
- Assignee:
- Advanced Micro Devices, Inc. (Santa Clara, CA)
- DOE Contract Number:
- AC52-07NA27344; B620717
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 06/19/2019
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Zhang, Xianwei, Kalamatianos, John, and Beckmann, Bradford. GPU cache management based on locality type detection. United States: N. p., 2022. Web.
Zhang, Xianwei, Kalamatianos, John, & Beckmann, Bradford. GPU cache management based on locality type detection. United States.
Zhang, Xianwei, Kalamatianos, John, and Beckmann, Bradford. 2022. "GPU cache management based on locality type detection". United States. https://www.osti.gov/servlets/purl/1986767.
@article{osti_1986767,
title = {GPU cache management based on locality type detection},
author = {Zhang, Xianwei and Kalamatianos, John and Beckmann, Bradford},
abstractNote = {Wavefront loading in a processor is managed and includes monitoring a selected wavefront of a set of wavefronts. Reuse of memory access requests for the selected wavefront is counted. A cache hit rate in one or more caches of the processor is determined based on the counted reuse. Based on the cache hit rate, subsequent memory requests of other wavefronts of the set of wavefronts are modified by including a type of reuse of cache lines in requests to the caches. In the caches, storage of data in the caches is based on the type of reuse indicated by the subsequent memory access requests. Reused cache lines are protected by preventing cache line contents from being replaced by another cache line for a duration of processing the set of wavefronts. Caches are bypassed when streaming access requests are made.},
place = {United States},
year = {2022},
month = {11}
}
Works referenced in this record:
An efficient compiler framework for cache bypassing on GPUs
conference, November 2013
- Xie, Xiaolong; Liang, Yun; Sun, Guangyu
- 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
Issue Control for Multithreaded Processing
patent-application, April 2016
- Sethia, Ankit; Mahlke, Scott
- US Patent Application 14/510482; 20160103715
Priority-based cache allocation in throughput processors
conference, February 2015
- Li, Dong; Rhu, Minsoo; Johnson, Daniel R.
- 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)
Cache utilization and eviction based on allocated priority tokens
patent, October 2016
- Johnson, Daniel R.; Rhu, Minsoo; O'Connor, James M.
- US Patent Document 9,477,526
Access Pattern-Aware Cache Management for Improving Data Utilization in GPU
conference, June 2017
- Koo, Gunjae; Oh, Yunho; Ro, Won Woo
- Proceedings of the 44th Annual International Symposium on Computer Architecture
Adaptive Cache Management for Energy-Efficient GPU Computing
conference, December 2014
- Chen, Xuhao; Chang, Li-Wen; Rodrigues, Christopher I.
- 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture
Issue control for multithreaded processing
patent, February 2018
- Sethia, Ankit; Mahlke, Scott
- US Patent Document 9,898,409
Scheduling Method and Processing Device Using the Same
patent-application, May 2017
- Chen, Heng-Yi; Chen, Chung-Ho; Wang, Chen-Chieh
- US Patent Application 14/983086; 20170139751
Adaptive GPU cache bypassing
conference, February 2015
- Tian, Yingying; Puthoor, Sooraj; Greathouse, Joseph L.
- Proceedings of the 8th Workshop on General Purpose Processing using GPUs
Systems and Methods for Provisioning of Storage for Virtualized Applications
patent-application, May 2014
- Guha, Aloke
- US Patent Application 13/767829; 20140130055
Computing System and Method for Processing Operations Thereof
patent-application, March 2017
- Choi, Yoonseo
- US Patent Application 15/067494; 20170060588
Techniques for Accessing Content-Addressable Memory
patent-application, June 2014
- Fahs, Brian; Anderson, Eric T.; Barrow-Williams, Nick
- US Patent Application 13/720755; 20140173193
Cache access detection and prediction
patent, July 2020
- Ma, Lei; Hornung, Alexander Alfred; Caulfield, Ian Michael
- US Patent Document 10,725,923
Space/time trade-offs in hash coding with allowable errors
journal, July 1970
- Bloom, Burton H.
- Communications of the ACM, Vol. 13, Issue 7, p. 422-426