
Title: Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes

Patent · OSTI ID: 1016148

A parallel-aware co-scheduling method and system are provided for improving the performance and scalability of a dedicated parallel job having synchronizing collective operations, in a parallel computing environment comprising a network of SMP nodes each having at least one processor. The method and system use a global co-scheduler and an operating system kernel dispatcher adapted to coordinate interfering system and daemon activities on a node and across nodes. This coordination promotes intra-node and inter-node overlap of the interfering system and daemon activities, as well as intra-node and inter-node overlap of the synchronizing collective operations. In this manner, the impact of random short-lived interruptions, such as timer-decrement processing and periodic daemon activity, on synchronizing collective operations is minimized for large processor-count jobs written in the SPMD bulk-synchronous programming style.
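
The benefit of overlapping interruptions can be illustrated with a toy model. The sketch below is not the patented mechanism; it only assumes that each process is interrupted with a fixed probability per step and that a synchronizing collective makes every step as slow as its slowest process. The function names, the per-step probability, and the process counts are all hypothetical.

```python
import random


def slowed_steps(num_procs, num_steps, prob, coordinated, seed=0):
    """Count barrier steps delayed by at least one interruption.

    Toy model (an assumption, not the patented mechanism): in every step each
    process is interrupted with probability `prob`, and a barrier makes the
    whole step as slow as its slowest process.
    """
    rng = random.Random(seed)
    slowed = 0
    for _ in range(num_steps):
        if coordinated:
            # Co-scheduled: one shared draw, so interruptions overlap in time.
            hit = rng.random() < prob
        else:
            # Uncoordinated: independent draws; any single hit delays everyone.
            hit = any(rng.random() < prob for _ in range(num_procs))
        slowed += hit
    return slowed


def expected_slowed_fraction(num_procs, prob):
    """Probability that at least one of `num_procs` independent processes is hit."""
    return 1.0 - (1.0 - prob) ** num_procs


if __name__ == "__main__":
    procs, steps, prob = 512, 1000, 0.01
    print("uncoordinated:", slowed_steps(procs, steps, prob, coordinated=False))
    print("coordinated:  ", slowed_steps(procs, steps, prob, coordinated=True))
    print("expected uncoordinated fraction:", expected_slowed_fraction(procs, prob))
```

Under these assumptions the fraction of delayed steps grows with processor count when interruptions are uncoordinated, but stays near the per-process probability when they are aligned, which is the intuition behind promoting overlap of system and daemon activity.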

Research Organization:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
Assignee:
Lawrence Livermore National Security, LLC (Livermore, CA)
Patent Number(s):
7,810,093
Application Number:
10/989,704
OSTI ID:
1016148
Country of Publication:
United States
Language:
English

References (4)

Effective distributed scheduling of parallel workloads
  • Dusseau, Andrea C.; Arpaci, Remzi H.; Culler, David E.
  • Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems - SIGMETRICS '96 https://doi.org/10.1145/233013.233020
  • Conference · January 1996
Fast collective operations using shared and remote memory access protocols on clusters
  • Tipparaju, V.; Nieplocha, J.; Panda, D.
  • International Parallel and Distributed Processing Symposium (IPDPS 2003), Proceedings International Parallel and Distributed Processing Symposium https://doi.org/10.1109/IPDPS.2003.1213188
  • Conference · January 2003
Operating system support for parallel programming on RP3
  • Journal · September 1991
Dynamic coscheduling on workstation clusters
  • Book · January 1998

Similar Records

ATCOM: Automatically Tuned Collective Communication System for SMP Clusters
Thesis/Dissertation · January 1, 2005 · OSTI ID:1016148

Reducing communication in algebraic multigrid with multi-step node aware communication
Journal Article · June 11, 2020 · International Journal of High Performance Computing Applications · OSTI ID:1016148