RACB: Resource Aware Cache Bypass on GPUs
Conference · 2014 International Symposium on Computer Architecture and High Performance Computing Workshop; 22-24 Oct. 2014; Paris, France
Caches are universally used in computing systems to hide long off-chip memory access latencies. Unlike on CPUs, the massive number of threads running simultaneously on GPUs puts tremendous pressure on the memory hierarchy. As a result, limited cache resources become a bottleneck that keeps a GPU from exploiting thread-level parallelism (TLP) and memory-level parallelism (MLP) to achieve high performance. In this paper, we propose a mechanism that bypasses the L1D and L2 caches based on the availability of cache resources. The mechanism is based on the observation that a huge number of stalls caused by limited cache resources prevents GPUs from delivering higher throughput. We therefore propose Resource Aware Cache Bypass (RACB), which eliminates such stalls with minor hardware changes to improve performance. We examine the effectiveness of this approach when applied to the L1D and L2 caches separately as well as together. Evaluation results with the NVIDIA Computing SDK show that RACB generally improves performance the most when applied to both the L1D and L2 caches: by up to 88.05% and by 16.73% on average. In addition, energy consumption is reduced by up to 22.35% and by 5.88% on average, with minor hardware overheads.
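The idea in the abstract, bypassing the cache when its resources are unavailable instead of stalling, can be illustrated with a toy model. The sketch below is not the paper's design: it assumes the scarce resource is a fixed pool of miss-tracking entries (called `mshr_entries` here), that every request is a miss, and that one request is offered per cycle. The function name `simulate` and all of its parameters are hypothetical.

```python
def simulate(num_requests, mshr_entries, miss_latency, bypass):
    """Toy issue model; returns (cycles, stall_cycles, bypassed_requests).

    Assumptions (not from the paper): every request misses, each miss
    holds one tracking entry for `miss_latency` cycles, and at most one
    request is issued per cycle.
    """
    in_flight = []  # completion cycle of each miss currently holding an entry
    cycle = stalls = bypassed = issued = 0
    while issued < num_requests:
        # Release entries whose miss has returned by this cycle.
        in_flight = [done for done in in_flight if done > cycle]
        if len(in_flight) < mshr_entries:
            # Resource available: take the normal cached path.
            in_flight.append(cycle + miss_latency)
            issued += 1
        elif bypass:
            # Resource exhausted: skip the cache and forward the request
            # to the next level without allocating an entry.
            bypassed += 1
            issued += 1
        else:
            # Resource exhausted and no bypass: the request waits a cycle.
            stalls += 1
        cycle += 1
    return cycle, stalls, bypassed


if __name__ == "__main__":
    print("no bypass:", simulate(8, 2, 4, bypass=False))
    print("bypass:   ", simulate(8, 2, 4, bypass=True))
```

With 8 requests, 2 entries, and a 4-cycle miss latency, the run without bypass spends 6 of its 14 issue cycles stalled, while the run with bypass issues all 8 requests in 8 cycles and sends 4 of them around the cache. The model counts only issue cycles and ignores hit rates, next-level bandwidth, and energy, so it illustrates only where the stalls come from, not the reported speedups.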
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization: USDOE Office of Science; USDOE
- OSTI ID: 1567596
- Conference Information: 2014 International Symposium on Computer Architecture and High Performance Computing Workshop; 22-24 Oct. 2014; Paris, France
- Country of Publication: United States
- Language: English
Similar Records
Locality-Driven Dynamic GPU Cache Bypassing
Conference · Sun Jun 07 00:00:00 EDT 2015 · OSTI ID: 1194296
Dynamic cache bypassing
Patent · Tue Mar 24 00:00:00 EDT 2020 · OSTI ID: 1637804
A performance model for GPUs with caches
Journal Article · Mon Jun 23 20:00:00 EDT 2014 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID: 1333005