Quantifying Scheduling Challenges for Exascale System Software
- University of New Mexico, Albuquerque
- ORNL
The move towards high-performance computing (HPC) applications comprised of coupled codes and the need to dramatically reduce data movement is leading to a reexamination of time-sharing vs. space-sharing in HPC systems. In this paper, we discuss and begin to quantify the performance impact of a move away from strict space-sharing of nodes for HPC applications. Specifically, we examine the potential performance cost of time-sharing nodes between application components, we determine whether a simple coordinated scheduling mechanism can address these problems, and we research how suitable simple constraint-based optimization techniques are for solving scheduling challenges in this regime. Our results demonstrate that current general-purpose HPC system software scheduling and resource allocation systems are subject to significant performance deficiencies which we quantify for six representative applications. Based on these results, we discuss areas in which additional research is needed to meet the scheduling challenges of next-generation HPC systems.
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1265620
- Resource Relation:
- Conference: ROSS '15 Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers, Portland, OR, USA, 20150616, 20150616
- Country of Publication:
- United States
- Language:
- English