U.S. Department of Energy
Office of Scientific and Technical Information

Sunder: A programmable hardware prefetch architecture for numerical loops

Conference
OSTI ID:87653
 [1]
  1. State Univ. of New York, Stony Brook, NY (United States). Computer Science Dept.

Beyond data caching, data prefetching is by far the most effective way to address the memory access bottleneck associated with high-performance processors. This is particularly true for scientific programs whose working sets cannot easily fit into the on-chip data cache. This paper proposes a new data prefetching architecture called Sunder, which combines the flexibility and accuracy of software prefetching with the transparency and low overhead of hardware prefetching. The heart of the design is a dedicated Prefetch Engine that software programs at run time. An important design decision is to keep the Prefetch Engine completely isolated from the normal instruction execution pipeline, except for a loop counter that keeps the two synchronized at the boundaries of loop iterations. A detailed simulation study shows that Sunder achieves an average relative performance advantage over cache-only architectures ranging from 28% to 46%, with smaller cache block sizes leading to greater performance improvement.
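
The division of labor described in the abstract can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the design from the paper: the names StreamDescriptor, Cache, PrefetchEngine and run_loop, the fully associative LRU cache, and the fixed prefetch distance are all hypothetical. The model counts misses only; it has no notion of memory latency, so a prefetched block is treated as arriving instantly. What it does show is the one coupling the abstract mentions: software programs the engine once before the loop, and after that the engine sees nothing but the loop counter at each iteration boundary.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class StreamDescriptor:
    """Hypothetical per-array entry that software writes into the engine."""
    base: int    # byte address of the element used in iteration 0
    stride: int  # bytes advanced per loop iteration


class Cache:
    """Toy fully associative LRU cache; tracks block residency, not timing."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()
        self.misses = 0

    def _touch(self, addr):
        """Bring the block holding addr into the cache; return True on a hit."""
        block = addr // self.block_size
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.num_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used
        return False

    def load(self, addr):
        """Demand access issued by the processor; misses are counted."""
        if not self._touch(addr):
            self.misses += 1

    def prefetch(self, addr):
        """Access issued by the engine; never counted as a miss."""
        self._touch(addr)


class PrefetchEngine:
    """Assumed engine: programmed once, then driven only by the loop counter."""

    def __init__(self, cache, distance):
        self.cache = cache
        self.distance = distance  # how many iterations ahead to fetch
        self.streams = []

    def program(self, streams):
        """Run-time programming step performed by software before the loop."""
        self.streams = list(streams)

    def on_iteration(self, loop_counter):
        """Called at each iteration boundary with the current loop counter."""
        target = loop_counter + self.distance
        for s in self.streams:
            self.cache.prefetch(s.base + target * s.stride)


def run_loop(n, streams, cache, engine=None):
    """Model a numerical loop that reads one element per stream per iteration."""
    if engine is not None:
        engine.program(streams)
    for i in range(n):
        if engine is not None:
            engine.on_iteration(i)  # the only synchronization point
        for s in streams:
            cache.load(s.base + i * s.stride)
    return cache.misses
```

As a usage example with made-up parameters, take two streams of 8-byte elements at bases 0 and 1 << 20, 1024 iterations, 32-byte blocks and a 64-block cache. In this toy model the loop incurs 512 demand misses without the engine and 2 with an engine running 4 iterations ahead. These counts are properties of the model only and say nothing about the simulation results reported in the paper.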

OSTI ID:
87653
Report Number(s):
CONF-941118--; ISBN 0-8186-6605-6
Country of Publication:
United States
Language:
English

Similar Records

Compiler optimization technique for data cache prefetching using a small CAM array
Conference · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:100196

Data prefetching in shared memory multiprocessors
Conference · Wed Dec 31 23:00:00 EST 1986 · OSTI ID:5703538

Programmable stream prefetch with resource optimization
Patent · Mon Jan 07 23:00:00 EST 2013 · OSTI ID:1082909