Hardware Transactional Memory for GPU Architectures

Summary:
Wilson W. L. Fung, Inderpreet Singh, Andrew Brownsword, Tor M. Aamodt
Department of Electrical and Computer Engineering, University of British Columbia
wwlfung@ece.ubc.ca, isingh@ece.ubc.ca, andrew@brownsword.ca, aamodt@ece.ubc.ca
Graphics processor units (GPUs) are designed to efficiently exploit thread level parallelism (TLP), multiplexing execution of 1000s of concurrent threads on a relatively smaller set of single-instruction, multiple-thread (SIMT) cores to hide various long latency operations. While threads within a CUDA block/OpenCL workgroup can communicate efficiently through an intra-core scratchpad memory, threads in different blocks can only communicate via global memory accesses. Programmers wishing to exploit such communication have to consider data-races that may occur when multiple threads modify the same memory location. Recent GPUs provide a form
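As a minimal illustration of the data-race hazard the abstract describes (a sketch, not code from the paper; the histogram kernels and their names are hypothetical), threads in different blocks updating the same global-memory location with a plain read-modify-write can lose updates, whereas an atomic operation serializes the conflicting accesses:

```cuda
// Hypothetical example: threads across blocks increment shared histogram bins
// in global memory.
__global__ void histogram_racy(const int *data, int *bins, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Data race: two threads (possibly in different blocks) can read the
        // same bin, increment it, and write back, losing one increment.
        bins[data[i]] = bins[data[i]] + 1;
    }
}

__global__ void histogram_atomic(const int *data, int *bins, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // atomicAdd serializes the conflicting read-modify-write updates,
        // making cross-block communication through global memory safe.
        atomicAdd(&bins[data[i]], 1);
    }
}
```

Coordinating anything richer than a single counter requires composing such atomics by hand, which is the programming burden that transactional memory aims to lift.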


Source: Aamodt, Tor - Department of Electrical and Computer Engineering, University of British Columbia


Collections: Engineering; Computer Technologies and Information Sciences