Implementing Arbitrary/Common Concurrent Writes of CRCW PRAM
- ORNL
The Parallel Random Access Machine (PRAM) abstraction is the simplest and most elegant algorithmic model for the design and analysis of parallel algorithms. It comprises several models, categorized by the memory access mode they allow, the most powerful of which is the Concurrent Read Concurrent Write (CRCW) model. A PRAM algorithm describes a series of rounds, each consisting of a collection of operations that can be executed concurrently within the same time step. However, the lack of support for concurrent memory accesses and the prevalence of asynchronous programming models have led to the belief that implementing CRCW PRAM algorithms is unattainable, and have prompted many to avoid this model except for theoretical studies of optimal performance.
In this work, we study the arbitrary and common concurrent writes of the CRCW PRAM model and explore the challenges of implementing them on general-purpose systems. We examine current practices for implementing common and arbitrary concurrent writes and propose a new efficient, lightweight, and thread-safe method that implements concurrent writes by leveraging atomic instructions. To demonstrate the efficacy of our method, we developed OpenMP kernels for classical CRCW PRAM algorithms and provide experimental results and comparisons based on run-time performance measured on the x86 multicore architecture. Our results show speedups of up to 4.5x over current practices across all our benchmarks.
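To make the two write modes concrete, the following is a minimal sketch of the classical constant-round CRCW maximum-finding algorithm, written in C with OpenMP pragmas. It is an illustration only and is not the method proposed in this work. The names `crcw_max_index`, `arbitrary_write`, and `CRCW_EMPTY` are hypothetical, and combining C11 `<stdatomic.h>` operations with OpenMP is an assumption that holds for common compilers such as GCC and Clang. Round 2 uses a common write (every writer stores the same value), and round 3 uses an arbitrary write (any one writer may win), realized here with a compare-and-swap against a sentinel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define CRCW_EMPTY (-1) /* hypothetical sentinel: "cell not yet written" */

/* Arbitrary concurrent write: the first writer to reach the cell wins;
   later writers find the cell non-empty and leave it unchanged. */
static void arbitrary_write(atomic_int *cell, int value)
{
    int expected = CRCW_EMPTY;
    atomic_compare_exchange_strong(cell, &expected, value);
}

/* Returns the index of a maximum element of a[0..n-1],
   or CRCW_EMPTY if the scratch array cannot be allocated. */
int crcw_max_index(const int *a, int n)
{
    assert(n > 0);
    atomic_int *is_max = malloc((size_t)n * sizeof *is_max);
    if (is_max == NULL)
        return CRCW_EMPTY;

    atomic_int winner;
    atomic_init(&winner, CRCW_EMPTY);

    /* Round 1: assume every element is a maximum. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        atomic_init(&is_max[i], 1);

    /* Round 2: one logical processor per pair (i, j).
       Common write: every writer to is_max[i] stores the same value 0. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (a[i] < a[j])
                atomic_store_explicit(&is_max[i], 0, memory_order_relaxed);

    /* Round 3: every surviving index tries to record itself.
       Arbitrary write: with ties, any one of the maxima may win. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        if (atomic_load(&is_max[i]))
            arbitrary_write(&winner, i);

    free(is_max);
    return atomic_load(&winner);
}
```

The implicit barrier at the end of each `parallel for` region plays the role of the synchronous PRAM round boundary. For the input `{3, 9, 2, 9, 5}` the function may return either 1 or 3, since both hold the maximum value and the arbitrary write lets either writer win. Without OpenMP enabled, the pragmas are ignored and the same code runs serially.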
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-00OR22725
- OSTI ID: 1856720
- Country of Publication: United States
- Language: English
Similar Records
A lower bound for the QRQW PRAM
Conference · Tue May 02 00:00:00 EDT 1995 · OSTI ID:93994
Recursive star-tree parallel data structure. Technical report
Technical Report · Wed Feb 28 23:00:00 EST 1990 · OSTI ID:6088764
Parallel algorithms with processor failures and delays. Technical report
Technical Report · Thu Aug 01 00:00:00 EDT 1991 · OSTI ID:5796465