A quantitative model of application slowdown in multiresource shared systems
Scheduling multiple jobs onto a platform enhances system utilization by sharing resources. The benefits of higher resource utilization include reduced cost to construct, operate, and maintain a system, which often includes energy consumption. Maximizing these benefits comes at a price: resource contention among jobs increases job completion time. In this study, we analyze the slowdown of jobs due to contention for multiple resources in a system, referred to as the dilation factor. We observe that multiple-resource contention creates nonlinear dilation factors of jobs. From this observation, we establish a general quantitative model for dilation factors of jobs in multiresource systems. A job is characterized by a vector-valued loading statistic, and the dilation factors of a job set are given by a quadratic function of their loading vectors. We demonstrate how to systematically characterize a job, maintain the data structure used to calculate the dilation factor (the loading matrix), and calculate the dilation factor of each job. We validate the accuracy of the model with multiple processes running on a native Linux server, on virtualized servers, and with multiple MapReduce workloads co-scheduled in a cluster. Evaluation with measured data shows that the D-factor model has an error margin of less than 16%. We extended the D-factor model to capture the slowdown of applications when multiple identical resources exist, such as in multicore and multidisk environments. Finally, validation results of the extended D-factor model with HPC checkpoint applications on parallel file systems show that D-factor accurately captures the slowdown of concurrent applications in such environments.
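The abstract describes jobs as loading vectors, a loading matrix that collects them, and dilation factors that are quadratic in those vectors. The sketch below illustrates that shape only. The record does not give the paper's formula, so the specific form used here, D_i = 1 + sum over other jobs j of the inner product of l_i and l_j, is an assumption, as are the class name `LoadingMatrix`, its methods, and the reading of each loading value as the fraction of standalone runtime a job spends on a resource.

```python
from typing import Dict, List


class LoadingMatrix:
    """Hypothetical container: one loading vector per job, one column per resource."""

    def __init__(self, resources: List[str]) -> None:
        self.resources = list(resources)
        self.rows: Dict[str, List[float]] = {}

    def add_job(self, name: str, loading: List[float]) -> None:
        # Assumed meaning: loading[k] is the fraction of the job's standalone
        # runtime spent on resource k, so every entry must lie in [0, 1].
        if len(loading) != len(self.resources):
            raise ValueError("loading vector length must match the resource list")
        if any(not 0.0 <= x <= 1.0 for x in loading):
            raise ValueError("each loading value must lie in [0, 1]")
        self.rows[name] = list(loading)

    def remove_job(self, name: str) -> None:
        del self.rows[name]

    def dilation_factors(self) -> Dict[str, float]:
        # ASSUMED quadratic form, not taken from the paper:
        #   D_i = 1 + sum_{j != i} sum_k l_i[k] * l_j[k]
        factors: Dict[str, float] = {}
        for i, li in self.rows.items():
            overlap = sum(
                sum(a * b for a, b in zip(li, lj))
                for j, lj in self.rows.items()
                if j != i
            )
            factors[i] = 1.0 + overlap
        return factors


if __name__ == "__main__":
    matrix = LoadingMatrix(["cpu", "disk"])
    matrix.add_job("a", [0.5, 0.5])  # job a splits its time between CPU and disk
    matrix.add_job("b", [1.0, 0.0])  # job b is purely CPU-bound
    print(matrix.dilation_factors())
```

Under this assumed form, a job running alone has a dilation factor of 1, and every co-scheduled job adds a term that grows with how much the two jobs load the same resources. The published model should be consulted for the actual coefficients and for the extension to multiple identical resources.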
 Authors:
 Lim, SeungHwan ^{[1]}; Kim, Youngjae ^{[2]}
 [1] Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computational Data Analytics Group
 [2] Sogang Univ., Seoul (Korea, Republic of). Dept. of Computer Science and Engineering
 Publication Date:
 Grant/Contract Number:
 AC05-00OR22725; R0190152012; 2015R1C1A1A0152105
 Type:
 Accepted Manuscript
 Journal Name:
 Performance Evaluation
 Additional Journal Information:
 Journal Volume: 108; Journal ID: ISSN 0166-5316
 Research Org:
 Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sogang Univ., Seoul (Korea, Republic of)
 Sponsoring Org:
 USDOE; Ministry of Science, ICT and Future Planning (MSIP) of Korea; National Research Foundation of Korea (NRF)
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING; 99 GENERAL AND MISCELLANEOUS; Modeling technique; Performance of systems; Measurement
 OSTI Identifier:
 1339395
 Alternate Identifier(s):
 OSTI ID: 1413801
Lim, SeungHwan, and Kim, Youngjae. A quantitative model of application slowdown in multiresource shared systems. United States: N. p., 2016. Web. doi:10.1016/j.peva.2016.10.004.
Lim, SeungHwan, & Kim, Youngjae. A quantitative model of application slowdown in multiresource shared systems. United States. doi:10.1016/j.peva.2016.10.004.
Lim, SeungHwan, and Kim, Youngjae. 2016. "A quantitative model of application slowdown in multiresource shared systems". United States. doi:10.1016/j.peva.2016.10.004. https://www.osti.gov/servlets/purl/1339395.
@article{osti_1339395,
title = {A quantitative model of application slowdown in multiresource shared systems},
author = {Lim, SeungHwan and Kim, Youngjae},
abstractNote = {Scheduling multiple jobs onto a platform enhances system utilization by sharing resources. The benefits of higher resource utilization include reduced cost to construct, operate, and maintain a system, which often includes energy consumption. Maximizing these benefits comes at a price: resource contention among jobs increases job completion time. In this study, we analyze the slowdown of jobs due to contention for multiple resources in a system, referred to as the dilation factor. We observe that multiple-resource contention creates nonlinear dilation factors of jobs. From this observation, we establish a general quantitative model for dilation factors of jobs in multiresource systems. A job is characterized by a vector-valued loading statistic, and the dilation factors of a job set are given by a quadratic function of their loading vectors. We demonstrate how to systematically characterize a job, maintain the data structure used to calculate the dilation factor (the loading matrix), and calculate the dilation factor of each job. We validate the accuracy of the model with multiple processes running on a native Linux server, on virtualized servers, and with multiple MapReduce workloads co-scheduled in a cluster. Evaluation with measured data shows that the D-factor model has an error margin of less than 16%. We extended the D-factor model to capture the slowdown of applications when multiple identical resources exist, such as in multicore and multidisk environments. Finally, validation results of the extended D-factor model with HPC checkpoint applications on parallel file systems show that D-factor accurately captures the slowdown of concurrent applications in such environments.},
doi = {10.1016/j.peva.2016.10.004},
journal = {Performance Evaluation},
volume = 108,
place = {United States},
year = {2016},
month = {12}
}