Extreme-bandwidth scalable performance-per-watt GPU architecture
Abstract
A technique for accessing memory in an accelerated processing device coupled to stacked memory dies is provided herein. The technique includes receiving a memory access request from an execution unit and identifying whether the memory access request corresponds to memory cells of the stacked dies that are considered local to the execution unit or non-local. For local accesses, the access is made “directly”, that is, without using a bus. A control die coordinates operations for such local accesses, activating particular through-silicon-vias associated with the memory cells that include the data for the access. Non-local accesses are made via a distributed cache fabric and an interconnect bus in the control die. Various other features and details are provided below.
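The routing decision described in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented implementation: the address layout (one contiguous region per execution unit), the constants `REGION_SIZE` and `TSV_GROUP_SPAN`, and the names `MemoryRequest`, `local_unit`, `tsv_group`, and `dispatch` are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

# Assumptions for illustration only; the record does not specify any layout.
REGION_SIZE = 1 << 20     # bytes of stacked memory treated as local to one execution unit
TSV_GROUP_SPAN = 1 << 12  # bytes of memory cells served by one group of through-silicon vias


class Path(Enum):
    DIRECT_TSV = "direct access, no bus"
    FABRIC_AND_BUS = "distributed cache fabric and interconnect bus"


@dataclass(frozen=True)
class MemoryRequest:
    execution_unit: int  # hypothetical ID of the requesting execution unit
    address: int         # hypothetical flat address into the stacked dies


def local_unit(address: int) -> int:
    """Return the execution unit whose local region contains this address."""
    return address // REGION_SIZE


def tsv_group(address: int) -> int:
    """Return the TSV group the control die would activate for this address."""
    return (address % REGION_SIZE) // TSV_GROUP_SPAN


def dispatch(req: MemoryRequest) -> Tuple[Path, Optional[int]]:
    """Classify a request as local or non-local and pick its access path."""
    if local_unit(req.address) == req.execution_unit:
        # Local: the control die activates the matching TSVs; no bus is used.
        return Path.DIRECT_TSV, tsv_group(req.address)
    # Non-local: go through the cache fabric and the control die's interconnect bus.
    return Path.FABRIC_AND_BUS, None
```

Under these assumptions, `dispatch(MemoryRequest(0, 0x2000))` takes the direct path through TSV group 2, while the same offset in another unit's region, `dispatch(MemoryRequest(0, REGION_SIZE + 0x2000))`, is sent over the fabric and bus.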
- Inventors:
- Yudanov, Dmitri; Chen, Jiasheng
- Issue Date:
- Research Org.:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1600412
- Patent Number(s):
- 10509596
- Application Number:
- 15/851,476
- Assignee:
- Advanced Micro Devices, Inc. (Santa Clara, CA)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- G - PHYSICS; G06 - COMPUTING; G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- DOE Contract Number:
- AC52-07NA27344; B620717
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 12/21/2017
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Yudanov, Dmitri, and Chen, Jiasheng. Extreme-bandwidth scalable performance-per-watt GPU architecture. United States: N. p., 2019. Web.
Yudanov, Dmitri, & Chen, Jiasheng. Extreme-bandwidth scalable performance-per-watt GPU architecture. United States.
Yudanov, Dmitri, and Chen, Jiasheng. 2019. "Extreme-bandwidth scalable performance-per-watt GPU architecture". United States. https://www.osti.gov/servlets/purl/1600412.
@article{osti_1600412,
title = {Extreme-bandwidth scalable performance-per-watt GPU architecture},
author = {Yudanov, Dmitri and Chen, Jiasheng},
abstractNote = {A technique for accessing memory in an accelerated processing device coupled to stacked memory dies is provided herein. The technique includes receiving a memory access request from an execution unit and identifying whether the memory access request corresponds to memory cells of the stacked dies that are considered local to the execution unit or non-local. For local accesses, the access is made “directly”, that is, without using a bus. A control die coordinates operations for such local accesses, activating particular through-silicon-vias associated with the memory cells that include the data for the access. Non-local accesses are made via a distributed cache fabric and an interconnect bus in the control die. Various other features and details are provided below.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2019},
month = {12}
}