OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Queue-Based and Adaptive Lock Algorithms for Scalable Resource Allocation on Shared-Memory Multiprocessors.

Abstract

Abstract not provided.

Authors:
Dechev, Damian; Zhang, Deli [1]; Lynch, Brendan [1]
  1. (UCF)
Publication Date:
2014-08-01
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1183092
Report Number(s):
SAND2014-16868J
536720
DOE Contract Number:
AC04-94AL85000
Resource Type:
Journal Article
Resource Relation:
Journal Name: International Journal of Parallel Programming
Country of Publication:
United States
Language:
English

Citation Formats

Dechev, Damian, Deli Zhang, and Brendan Lynch. Queue-Based and Adaptive Lock Algorithms for Scalable Resource Allocation on Shared-Memory Multiprocessors. United States: N. p., 2014. Web.
Dechev, Damian, Deli Zhang, & Brendan Lynch. Queue-Based and Adaptive Lock Algorithms for Scalable Resource Allocation on Shared-Memory Multiprocessors. United States.
Dechev, Damian, Deli Zhang, and Brendan Lynch. 2014. "Queue-Based and Adaptive Lock Algorithms for Scalable Resource Allocation on Shared-Memory Multiprocessors." United States.
@article{osti_1183092,
title = {Queue-Based and Adaptive Lock Algorithms for Scalable Resource Allocation on Shared-Memory Multiprocessors.},
author = {Dechev, Damian and Zhang, Deli and Lynch, Brendan},
abstractNote = {Abstract not provided.},
journal = {International Journal of Parallel Programming},
place = {United States},
year = {2014},
month = aug
}
  • Knowing the right type of locking algorithm to use when multiple processes contend for a single lock can prevent performance degradation in shared-memory multiprocessor systems.
  • In recent years, parallel processing has been applied for time domain simulations of power system transient behavior in order to implement real-time Dynamic Security Assessment. In this paper, two different algorithms have been implemented and compared: the Shifted-Picard (SP) and the Very DisHonest Newton (VDHN). The former has been proved to be effective when parallelism-in-time is adopted whereas the latter is an effective solver when parallelism-in-space is exploited. Furthermore, two different parallel computing architectures have been considered: namely, the Sequent Symmetry computer with 26 processors which is a data shared memory machine and the nCUBE characterized by 128 CPUs which is a typical message passing parallel machine. A realistic network with 662 buses has been used to assess the performance of the different implementations. The comparison of the results allows the reader to understand both the limitations of the algorithmic approaches and the constraints imposed by the two parallel architectures. An optimal grain of the parallelism associated to the problem can be identified through the reported experience.
  • This paper presents a parallel sparse Cholesky factorization algorithm for shared-memory MIMD multiprocessors. The algorithm is particularly well suited for vector supercomputers with multiple processors, such as the Cray Y-MP. The new algorithm is a straightforward parallelization of the left-looking supernodal sparse Cholesky factorization algorithm. Like its sequential predecessor, it improves performance by reducing indirect addressing and memory traffic. Experimental results on a Cray Y-MP demonstrate the effectiveness of the new algorithm. On eight processors of a Cray Y-MP, the new routine performs the factorization at rates exceeding one Gflop for several test problems from the Harwell-Boeing sparse matrix collection.
  • Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. However, shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton's third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
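
The two steps described in the last abstract above (sorting particle labels by cell label to build contact candidate pairs, then summing forces through a per-particle table of pair indexes) can be illustrated with a minimal serial Python sketch. It is not the authors' implementation: the 2D grid, the linear-spring contact force, and every name (`contact_candidates`, `pair_forces`, `sum_forces`, `cell_size`, `stiffness`) are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product


def contact_candidates(positions, cell_size):
    """Return particle labels sorted by cell label, and candidate pairs (i < j)."""
    # Cell label of each particle: integer grid coordinates (assumed 2D grid).
    cell_of = [(int(x // cell_size), int(y // cell_size)) for x, y in positions]
    # Pre-conditioning step: sort particle labels by their cell label.
    order = sorted(range(len(positions)), key=lambda p: cell_of[p])
    # Group the sorted labels by cell.
    members = defaultdict(list)
    for p in order:
        members[cell_of[p]].append(p)
    # Pair each particle with particles in its own and the 8 neighboring cells.
    pairs = []
    for (cx, cy), own in members.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            for i in own:
                for j in members.get((cx + dx, cy + dy), ()):
                    if i < j:  # record each pair exactly once
                        pairs.append((i, j))
    return order, pairs


def pair_forces(positions, pairs, radius, stiffness):
    """Force on the first particle of each pair (assumed linear-spring repulsion)."""
    forces = []
    for i, j in pairs:
        dx = positions[i][0] - positions[j][0]
        dy = positions[i][1] - positions[j][1]
        dist = (dx * dx + dy * dy) ** 0.5
        overlap = 2 * radius - dist
        if overlap > 0 and dist > 0:
            scale = stiffness * overlap / dist
            forces.append((scale * dx, scale * dy))
        else:
            forces.append((0.0, 0.0))
    return forces


def sum_forces(n, pairs, forces):
    """Sum contact forces per particle via a table of (pair index, sign) entries."""
    table = [[] for _ in range(n)]
    for k, (i, j) in enumerate(pairs):
        table[i].append((k, 1.0))   # force on i as computed
        table[j].append((k, -1.0))  # equal and opposite force on j
    # Each particle only reads pair forces and writes its own total, so in a
    # parallel version no two workers would write to the same location.
    return [
        (sum(s * forces[k][0] for k, s in entries),
         sum(s * forces[k][1] for k, s in entries))
        for entries in table
    ]
```

The gather through `table` is the design choice of interest here: instead of each pair adding its force into two particle slots (which would require synchronization when pairs are processed concurrently), each particle collects its own contributions.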