DOE PAGES

Title: Autonomic Management of Application Workflows on Hybrid Computing Infrastructure

In this paper, we present a programming and runtime framework that enables the autonomic management of complex application workflows on hybrid computing infrastructures. The framework is designed to address system and application heterogeneity and dynamics to ensure that application objectives and constraints are satisfied. The need for such autonomic system and application management is becoming critical as computing infrastructures become increasingly heterogeneous, integrating different classes of resources from high-end HPC systems to commodity clusters and clouds. For example, the framework presented in this paper can be used to provision the appropriate mix of resources based on application requirements and constraints. The framework also monitors the system/application state and adapts the application and/or resources to respond to changing requirements or environment. To demonstrate the operation of the framework and to evaluate its ability, we employ a workflow used to characterize an oil reservoir executing on a hybrid infrastructure composed of TeraGrid nodes and Amazon EC2 instances of various types. Specifically, we show how different application objectives such as acceleration, conservation and resilience can be effectively achieved while satisfying deadline and budget constraints, using an appropriate mix of dynamically provisioned resources. Our evaluations also demonstrate that public clouds can be used to complement and reinforce the scheduling and usage of traditional high-performance computing infrastructure.
Authors:
 [1] ;  [2] ;  [1] ;  [3] ;  [1]
  1. NSF Center for Autonomic Computing, Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, USA
  2. Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX, USA, Craft and Hawkins Department of Petroleum Engineering, Louisiana State University, Baton Rouge, LA, USA
  3. NSF Center for Autonomic Computing, Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, USA, Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA, Department of Computer Science, Louisiana State University, Baton Rouge, LA, USA
Publication Date:
OSTI Identifier:
1243135
Grant/Contract Number:
FG02-06ER54857; FG02-04ER46136
Type:
Published Article
Journal Name:
Scientific Programming
Additional Journal Information:
Journal Volume: 19; Journal Issue: 2-3; Related Information: CHORUS Timestamp: 2016-08-23 12:05:22; Journal ID: ISSN 1058-9244
Publisher:
Hindawi Publishing Corporation
Sponsoring Org:
USDOE
Country of Publication:
Egypt
Language:
English