Co-Ordinated Coscheduling in Clusters through a Generic Framework
Communication-driven scheduling is known to be an effective technique to improve the performance of parallel workloads in time-sharing clusters. Although several such coscheduling algorithms have been proposed, to our knowledge, none of these techniques have been adopted in commercial systems. We believe this is primarily because many of these algorithms have not been exhaustively tested on real systems in the presence of mixed workloads and hence have not been demonstrated to be a favorable alternative to traditional batch scheduling. Moreover, practical issues such as the lack of a methodological approach to efficiently implement, port, or reuse the necessary software have dissuaded designers from including coscheduling as a feature in the mainstream system software layer. In this paper, we attempt to fill these crucial voids by addressing several key issues. First, we propose a generic framework for deploying coscheduling techniques by providing a reusable and dynamically loadable kernel module. Second, we implement three prior dynamic coscheduling algorithms (Dynamic coscheduling (DCS), Spin Block (SB) and Periodic Boost (PB)) and a new coscheduling technique, called Co-ordinated coscheduling (CC), using the above framework. Then, we demonstrate the effectiveness of these strategies by implementing a prototype on a Myrinet-connected 16-node Linux cluster that uses the industry-standard Virtual Interface Architecture (VIA) as the user-level communication abstraction. Our in-depth performance analysis using a variety of workloads reveals several interesting observations and better insights into the relative merits of the four coscheduling schemes. First, in contrast to some previous results, where PB was shown to be the best performer, we observe that SB and the proposed CC scheme outperform all other techniques on a Linux platform, which dominates the current marketplace.
This leads to the second conclusion: the choice of the native scheduler plays a significant role in deciding a competitive coscheduling strategy. Third, we find that the SB and CC schemes are effective alternatives to batch processing even at a reasonable multiprogramming level of 4 to 6. Finally, we show that, despite being an early prototype, the proposed coscheduling scheme (CC) provides equal or slightly better performance than SB with the added advantages of flexibility, generality, and potential for incorporating specialized services such as QoS.
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: US Department of Energy (US)
- DOE Contract Number: W-7405-ENG-48
- OSTI ID: 15002084
- Report Number(s): UCRL-JC-150927; TRN: US200408%%67
- Resource Relation: Conference: Association for Computing Machinery SIGMETRICS Conference, Marina Del Rey, CA (US), 06/15/2002--06/19/2002; Other Information: PBD: 4 Nov 2002
- Country of Publication: United States
- Language: English
Similar Records
Adaptive Parallel Job Scheduling with Flexible CoScheduling
Coscheduling Technique for Symmetric Multiprocessor Clusters