Collective Memory Transfers for Multi-Core Chips

Michelogiannakis, George; Williams, Alexander; Shalf, John

doi:10.2172/1164908

Technical Report · November 13, 2013

DOI: https://doi.org/10.2172/1164908 · OSTI ID: 1164908

Future performance improvements for microprocessors have shifted from clock frequency scaling towards increases in on-chip parallelism. Performance improvements for a wide variety of parallel applications require domain decomposition of data arrays from a contiguous arrangement in memory to a tiled layout for on-chip L1 data caches and scratchpads. However, DRAM performance suffers under the non-streaming access patterns generated by many independent cores. We propose collective memory scheduling (CMS) that actively takes control of collective memory transfers such that requests arrive in a sequential and predictable fashion to the memory controller. CMS uses the hierarchically tiled arrays formalism to compactly express collective operations, which greatly improves programmability over conventional prefetch or list-DMA approaches. CMS reduces application execution time by up to 32% and DRAM read power by 2.2×, compared to a baseline DMA architecture such as STI Cell.
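The access-pattern problem described in the abstract can be illustrated with a small sketch. It is not the report's CMS implementation: the function names, the round-robin model of independent cores, and the `sequential_fraction` metric are all assumptions chosen for illustration. The sketch tiles a row-major array across cores, then compares the address stream seen at memory when cores issue requests independently with the stream produced by a single address-ordered collective pass.

```python
# Illustrative sketch only; all names and the round-robin model are assumptions,
# not the CMS design from the report.

def tile_requests(rows, cols, tile_rows, tile_cols):
    """Map each core to the row-major addresses of the tile it needs."""
    requests = {}
    tiles_per_row = cols // tile_cols
    for tr in range(rows // tile_rows):
        for tc in range(tiles_per_row):
            core = tr * tiles_per_row + tc
            requests[core] = [
                r * cols + c
                for r in range(tr * tile_rows, (tr + 1) * tile_rows)
                for c in range(tc * tile_cols, (tc + 1) * tile_cols)
            ]
    return requests


def independent_stream(requests):
    """Round-robin interleaving: models cores issuing requests on their own."""
    queues = [list(addrs) for addrs in requests.values()]
    stream = []
    while any(queues):
        for q in queues:
            if q:
                stream.append(q.pop(0))
    return stream


def collective_stream(requests):
    """One address-ordered pass over memory, tagged with the destination core."""
    return sorted(
        (addr, core) for core, addrs in requests.items() for addr in addrs
    )


def sequential_fraction(addresses):
    """Fraction of consecutive requests that target the next address."""
    hits = sum(1 for a, b in zip(addresses, addresses[1:]) if b == a + 1)
    return hits / (len(addresses) - 1)


if __name__ == "__main__":
    reqs = tile_requests(rows=4, cols=4, tile_rows=2, tile_cols=2)
    indep = independent_stream(reqs)
    coll = [addr for addr, _core in collective_stream(reqs)]
    print("independent:", indep, sequential_fraction(indep))
    print("collective: ", coll, sequential_fraction(coll))
```

In this toy 4×4 case with 2×2 tiles, the interleaved stream contains no back-to-back sequential addresses, while the collective stream is fully sequential. Real DRAM behavior depends on row-buffer size, banking, and controller policy, none of which this sketch models.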

Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC)

DOE Contract Number: DE-AC02-05CH11231

OSTI ID: 1164908

Report Number(s): LBNL-6485E

Country of Publication: United States

Language: English

Similar Records

Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture

Conference · January 12, 2009 · OSTI ID: 1164908

Gebis, Joseph; Oliker, Leonid; Shalf, John; +2 more

Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation

Journal Article · January 3, 2017 · Journal of Parallel and Distributed Computing · OSTI ID: 1164908

Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.; +7 more

The SPUR instruction unit: An on-chip instruction cache memory for a high performance VLSI multiprocessor

Book · January 1, 1987 · OSTI ID: 1164908

Duncombe, R R

Related Subjects

97 MATHEMATICS AND COMPUTING
DRAM
access stream
stencils
memory bandwidth
collective transfers
