Buffered coscheduling for parallel programming and enhanced fault tolerance
- Los Alamos, NM
A computer-implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval. The buffered coscheduling method of this invention also enhances the fault tolerance of a network of parallel machine processors or distributed system processors.
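The cycle described in the abstract (buffer during a time interval, then exchange control information during a strobe) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `Processor`, `post_send`, `strobe`, and `deliver` are hypothetical, the "control information" is reduced to per-destination message counts, and the processors are plain Python objects rather than networked nodes.

```python
class Processor:
    """Hypothetical model of one node; not taken from the patent text."""

    def __init__(self, rank, size):
        self.rank = rank
        self.size = size
        self.outgoing = {}                    # dest rank -> buffered payloads
        self.expected_incoming = [0] * size   # filled in by strobe()

    def post_send(self, dest, payload):
        # During a time interval, sends are buffered, not transmitted.
        self.outgoing.setdefault(dest, []).append(payload)

    def send_counts(self):
        # Control information: how many items are buffered for each peer.
        return [len(self.outgoing.get(d, [])) for d in range(self.size)]


def strobe(processors):
    """Global exchange: each processor learns how many items every
    other processor has buffered for it."""
    counts = [p.send_counts() for p in processors]
    for p in processors:
        p.expected_incoming = [counts[src][p.rank]
                               for src in range(len(processors))]


def deliver(processors):
    """Next interval: flush the buffers, now that receivers know
    what to expect. Returns dest rank -> list of (source, payload)."""
    inbox = {p.rank: [] for p in processors}
    for p in processors:
        for dest, payloads in p.outgoing.items():
            inbox[dest].extend((p.rank, m) for m in payloads)
        p.outgoing.clear()
    return inbox


if __name__ == "__main__":
    procs = [Processor(r, 3) for r in range(3)]
    procs[0].post_send(1, "a")
    procs[2].post_send(1, "b")
    strobe(procs)
    print(procs[1].expected_incoming)
    print(deliver(procs)[1])
```

Because every receiver knows its incoming count before any data moves, the delivery step in this sketch could be scheduled globally rather than handled message by message. The sketch does not model the fault-tolerance aspect mentioned in the abstract.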
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- W-7405-ENG-36
- Assignee:
- The Regents of the University of California (Los Alamos, NM)
- Patent Number(s):
- 6,993,764
- Application Number:
- 09/895,570
- OSTI ID:
- 908604
- Country of Publication:
- United States
- Language:
- English
Scheduling with implicit information in distributed systems | journal | June 1998
All-to-all personalized communication in a wormhole-routed torus | journal | May 1996
Simultaneous multithreading: a platform for next-generation processors | journal | September 1997
Concurrent event handling through multithreading | journal | September 1999
Similar Records
System-level fault-tolerance in large-scale parallel machines with buffered coscheduling
Adaptive Parallel Job Scheduling with Flexible CoScheduling