Hardware Transactional Memory for GPU Architectures

Wilson W. L. Fung, Inderpreet Singh, Andrew Brownsword, Tor M. Aamodt
Department of Electrical and Computer Engineering
University of British Columbia
wwlfung@ece.ubc.ca, isingh@ece.ubc.ca, andrew@brownsword.ca, aamodt@ece.ubc.ca

ABSTRACT
Graphics processor units (GPUs) are designed to efficiently exploit thread-level parallelism (TLP), multiplexing the execution of thousands of concurrent threads on a relatively small set of single-instruction, multiple-thread (SIMT) cores to hide various long-latency operations. While threads within a CUDA block/OpenCL workgroup can communicate efficiently through an intra-core scratchpad memory, threads in different blocks can only communicate via global memory accesses. Programmers wishing to exploit such communication have to consider data races that may occur when multiple threads modify the same memory location. Recent GPUs provide a form
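The abstract's point about inter-block communication can be made concrete with a minimal CUDA sketch: threads from different blocks updating one global-memory counter race unless the read-modify-write is made atomic. The kernel and variable names here are illustrative, not from the paper.

```cuda
// Racy: many threads (possibly in different blocks) increment a counter
// in global memory. The load/add/store sequence is not atomic, so
// concurrent updates can be lost.
__global__ void racy_count(int *counter) {
    *counter = *counter + 1;   // data race: interleaved read-modify-writes
}

// Race-free: atomicAdd performs the read-modify-write as a single
// hardware atomic operation on the global-memory location.
__global__ void atomic_count(int *counter) {
    atomicAdd(counter, 1);     // the form of atomic update GPUs expose
}
```

Transactional memory, as the paper goes on to argue, generalizes such single-word atomics to multi-location critical sections without per-location lock management.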
Source: Aamodt, Tor - Department of Electrical and Computer Engineering, University of British Columbia
Collections: Engineering; Computer Technologies and Information Sciences