Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes
Abstract
A parallel-aware co-scheduling method and system for a parallel computing environment comprising a network of SMP nodes, each having at least one processor, improves the performance and scalability of a dedicated parallel job that has synchronizing collective operations. The method and system use a global co-scheduler and an operating system kernel dispatcher adapted to coordinate interfering system and daemon activities on a node and across nodes, promoting intra-node and inter-node overlap of those interfering activities as well as intra-node and inter-node overlap of the synchronizing collective operations. In this manner, the impact of random short-lived interruptions, such as timer-decrement processing and periodic daemon activity, on synchronizing collective operations is minimized for large processor-count jobs written in an SPMD bulk-synchronous programming style.
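The effect the abstract describes can be illustrated with a toy simulation. The sketch below is not the patented mechanism: it assumes a simple integer-tick model in which every process is interrupted for `duration` ticks out of every `period` ticks, and a bulk-synchronous job in which each step ends at a barrier. All names (`finish_time`, `run_job`, `compare`) and parameter values are hypothetical. Uncoordinated interruptions get random phase offsets per process, while co-scheduled interruptions share one phase so that they overlap.

```python
import random


def finish_time(start, work, offset, period, duration):
    """Tick at which a process starting at `start` has completed `work`
    ticks of useful computation under periodic interruptions.

    Assumed model: the process is interrupted whenever
    (t + offset) % period < duration.
    """
    t = start
    remaining = work
    while remaining > 0:
        if (t + offset) % period >= duration:
            remaining -= 1  # a useful tick of computation
        t += 1
    return t


def run_job(offsets, steps, work, period, duration):
    """Total ticks for a bulk-synchronous job: each step ends at a barrier,
    so the step lasts as long as the slowest process."""
    t = 0
    for _ in range(steps):
        t = max(finish_time(t, work, off, period, duration) for off in offsets)
    return t


def compare(num_procs=64, steps=200, work=10, period=100, duration=5, seed=0):
    """Return (uncoordinated_ticks, coscheduled_ticks) for the same job."""
    rng = random.Random(seed)
    # Uncoordinated: every process is interrupted at its own random phase.
    uncoordinated = [rng.randrange(period) for _ in range(num_procs)]
    # Co-scheduled: all interruptions are aligned, so they overlap in time.
    coscheduled = [0] * num_procs
    return (
        run_job(uncoordinated, steps, work, period, duration),
        run_job(coscheduled, steps, work, period, duration),
    )


if __name__ == "__main__":
    unco, co = compare()
    print("uncoordinated ticks:", unco)
    print("co-scheduled ticks: ", co)
```

In this model the co-scheduled job pays for each interruption once, because every process is interrupted at the same time. In the uncoordinated case nearly every barrier waits on some process that happened to be interrupted during that step, so the slowdown grows with the process count.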
- Inventors:
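- Jones, Terry R; Watson, Pythagoras C; Tuel, William; Brenner, Larry; Caffrey, Patrick; Fier, Jeffrey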
- Livermore, CA
- Kingston, NY
- Austin, TX
- Saugerties, NY
- Issue Date:
- 2010-10-05
- Research Org.:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1016148
- Patent Number(s):
- 7810093
- Application Number:
- 10/989,704
- Assignee:
- Lawrence Livermore National Security, LLC (Livermore, CA)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- W-7405-ENG-48
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Jones, Terry R, Watson, Pythagoras C, Tuel, William, Brenner, Larry, Caffrey, Patrick, and Fier, Jeffrey. Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes. United States: N. p., 2010. Web.
Jones, Terry R, Watson, Pythagoras C, Tuel, William, Brenner, Larry, Caffrey, Patrick, & Fier, Jeffrey. (2010). Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes. United States.
Jones, Terry R, Watson, Pythagoras C, Tuel, William, Brenner, Larry, Caffrey, Patrick, and Fier, Jeffrey. 2010. "Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes". United States. https://www.osti.gov/servlets/purl/1016148.
@article{osti_1016148,
title = {Parallel-aware, dedicated job co-scheduling within/across symmetric multiprocessing nodes},
author = {Jones, Terry R and Watson, Pythagoras C and Tuel, William and Brenner, Larry and Caffrey, Patrick and Fier, Jeffrey},
abstractNote = {In a parallel computing environment comprising a network of SMP nodes each having at least one processor, a parallel-aware co-scheduling method and system for improving the performance and scalability of a dedicated parallel job having synchronizing collective operations. The method and system uses a global co-scheduler and an operating system kernel dispatcher adapted to coordinate interfering system and daemon activities on a node and across nodes to promote intra-node and inter-node overlap of said interfering system and daemon activities as well as intra-node and inter-node overlap of said synchronizing collective operations. In this manner, the impact of random short-lived interruptions, such as timer-decrement processing and periodic daemon activity, on synchronizing collective operations is minimized on large processor-count SPMD bulk-synchronous programming styles.},
place = {United States},
year = {2010},
month = {10}
}
Works referenced in this record:
Effective distributed scheduling of parallel workloads
conference, January 1996
- Dusseau, Andrea C.; Arpaci, Remzi H.; Culler, David E.
- Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems - SIGMETRICS '96
Fast collective operations using shared and remote memory access protocols on clusters
conference, January 2003
- Tipparaju, V.; Nieplocha, J.; Panda, D.
- International Parallel and Distributed Processing Symposium (IPDPS 2003), Proceedings International Parallel and Distributed Processing Symposium
Operating system support for parallel programming on RP3
journal, September 1991
- Bryant, R. M.; Chang, H.-Y.; Rosenburg, B. S.
- IBM Journal of Research and Development, Vol. 35, Issue 5.6
Dynamic coscheduling on workstation clusters
book, January 1998
- Sobalvarro, Patrick G.; Pakin, Scott; Weihl, William E.
- Job Scheduling Strategies for Parallel Processing