OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Accelerating Parallel Applications in Cloud Platforms via Adaptive Time-Slice Control

Abstract

Cloud platforms can provide flexible and cost-effective environments for parallel applications. However, resource over-commitment, in which cloud providers offer far more virtual CPUs than there are physical CPUs available, still impedes the synchronization operations of parallel applications and causes severe performance degradation. Existing methods optimize parallel applications by raising the scheduling priorities of the virtual machines (VMs) involved. They cannot fully exploit the performance potential of parallel applications because they ignore the differing time-slice requirements of the applications' phases. Furthermore, non-parallel applications suffer unsatisfactory performance because they are left with low scheduling priorities. From an empirical analysis of VM time-slices, we find that shortening time-slices mitigates the synchronization overhead incurred during communication phases, while overly short time-slices cause frequent cache misses in computation phases. Accordingly, we propose an Adaptive Time-slice Control (ATC) mechanism. ATC first detects the phases of parallel applications based on lock latency or cache misses. Then, ATC shortens time-slices during communication phases and prolongs them during computation phases for parallel applications, and sets a uniform time-slice for non-parallel applications. Finally, we evaluate ATC using seven well-known benchmarks comprising 25+ applications. Experiments show that ATC achieves a 1.5-75x performance gain over state-of-the-art solutions for parallel applications, with almost no impact on non-parallel applications.
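
The control loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `VmSample`, `detect_phase`, and `choose_time_slice`, the lock-latency threshold, and all time-slice values are hypothetical and chosen only for illustration. The abstract also names cache misses as a detection signal; for brevity, this sketch uses lock latency alone.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the paper.
SHORT_SLICE_MS = 1.0               # assumed slice for communication phases
LONG_SLICE_MS = 30.0               # assumed slice for computation phases
UNIFORM_SLICE_MS = 10.0            # assumed slice for non-parallel VMs
LOCK_LATENCY_THRESHOLD_US = 100.0  # assumed phase-detection threshold


@dataclass
class VmSample:
    """One monitoring sample for a VM (hypothetical structure)."""
    is_parallel: bool
    lock_latency_us: float  # mean lock wait over the last interval


def detect_phase(sample: VmSample) -> str:
    """Classify the current phase from lock latency (assumed rule)."""
    if sample.lock_latency_us >= LOCK_LATENCY_THRESHOLD_US:
        return "communication"
    return "computation"


def choose_time_slice(sample: VmSample) -> float:
    """Pick a time-slice in milliseconds for the VM's next interval."""
    if not sample.is_parallel:
        # Non-parallel applications get one uniform slice.
        return UNIFORM_SLICE_MS
    if detect_phase(sample) == "communication":
        # Short slices reduce the wait for descheduled lock holders.
        return SHORT_SLICE_MS
    # Long slices preserve cache contents during computation.
    return LONG_SLICE_MS
```

For example, under these assumed values, `choose_time_slice(VmSample(True, 250.0))` selects the short slice, while a parallel VM with low lock latency receives the long one.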

Authors:
Fan, Hao [1]; Wu, Song [1]; Zhao, Xinyu [2]; Xie, Zhenjiang [3]; Di, Sheng [4]; Xiao, Jiang [1]; Yu, Chen [1]; Jin, Hai [1]
  1. Huazhong Univ. of Science and Technology, Wuhan (China)
  2. Tencent Group, Guangdong (China)
  3. Alibaba Group, Zhejiang (China)
  4. Argonne National Lab. (ANL), Lemont, IL (United States)
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC); National Key Research and Development Program of China; National Science Foundation of China
OSTI Identifier:
1863258
Grant/Contract Number:  
AC02-06CH11357; 2018YFB1004805; 61872155; 61732010; 2019aea171
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
IEEE Transactions on Computers
Additional Journal Information:
Journal Volume: 70; Journal Issue: 7; Journal ID: ISSN 0018-9340
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; LHP problem; Parallel application; cache miss rate; cloud platform; lock latency; synchronization overhead

Citation Formats

Fan, Hao, Wu, Song, Zhao, Xinyu, Xie, Zhenjiang, Di, Sheng, Xiao, Jiang, Yu, Chen, and Jin, Hai. Accelerating Parallel Applications in Cloud Platforms via Adaptive Time-Slice Control. United States: N. p., 2020. Web. doi:10.1109/tc.2020.2999619.
Fan, Hao, Wu, Song, Zhao, Xinyu, Xie, Zhenjiang, Di, Sheng, Xiao, Jiang, Yu, Chen, & Jin, Hai. Accelerating Parallel Applications in Cloud Platforms via Adaptive Time-Slice Control. United States. https://doi.org/10.1109/tc.2020.2999619
Fan, Hao, Wu, Song, Zhao, Xinyu, Xie, Zhenjiang, Di, Sheng, Xiao, Jiang, Yu, Chen, and Jin, Hai. 2020. "Accelerating Parallel Applications in Cloud Platforms via Adaptive Time-Slice Control". United States. https://doi.org/10.1109/tc.2020.2999619. https://www.osti.gov/servlets/purl/1863258.
@article{osti_1863258,
title = {Accelerating Parallel Applications in Cloud Platforms via Adaptive Time-Slice Control},
author = {Fan, Hao and Wu, Song and Zhao, Xinyu and Xie, Zhenjiang and Di, Sheng and Xiao, Jiang and Yu, Chen and Jin, Hai},
doi = {10.1109/tc.2020.2999619},
url = {https://www.osti.gov/biblio/1863258},
journal = {IEEE Transactions on Computers},
issn = {0018-9340},
number = 7,
volume = 70,
place = {United States},
year = {2020},
month = {6}
}
