
Title: Load Balancing for Multiphysics. In: AIAA 2013-2856

Abstract

Coupled multi-physics simulations, such as hybrid CFD-MD simulations, represent an increasingly important class of scientific applications. The physical problems of interest often demand high-end computers, such as TeraGrid resources, which are frequently accessible only via batch queues. Batch-queue systems are not designed to natively support the coordinated scheduling of jobs, which is needed for the concurrent execution that coupled multi-physics simulations require. In this paper we develop and demonstrate a novel approach to overcome the lack of native support for the coordinated job submission that coupled runs require. We establish the performance advantages arising from our solution, which is a generalization of the Pilot-Job concept; the concept in and of itself is not new, but it is being applied to coupled simulations for the first time. Our solution not only overcomes the initial co-scheduling problem but also provides a dynamic resource allocation mechanism. Support for such dynamic resources is critical for a load-balancing mechanism, which we develop and demonstrate to be effective at reducing the total time-to-solution of the problem. We establish that the performance advantage of using Big Jobs is invariant with the size of the machine as well as the size of the physical model under investigation. The Pilot-Job abstraction is developed using SAGA, which provides an infrastructure-agnostic implementation and can seamlessly execute on and utilize distributed resources.

Authors:
Lohner, Rainald [1]; Baum, Joseph D. [1]
  1. George Mason Univ., Fairfax, VA (United States). Dept. of Computational and Data Science; SAIC, McLean, VA (United States). Advanced Technology Group
Publication Date:
Research Org.:
Oak Ridge National Laboratory, Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1567671
Resource Type:
Conference
Resource Relation:
Conference: 21st AIAA Computational Fluid Dynamics Conference, June 24-27, 2013, San Diego, CA
Country of Publication:
United States
Language:
English

Citation Formats

Lohner, Rainald, and Baum, Joseph D. Load Balancing for Multiphysics. In: AIAA 2013-2856. United States: N. p., 2013. Web. doi:10.2514/6.2013-2856.
Lohner, Rainald, & Baum, Joseph D. Load Balancing for Multiphysics. In: AIAA 2013-2856. United States. doi:10.2514/6.2013-2856.
Lohner, Rainald, and Baum, Joseph D. "Load Balancing for Multiphysics. In: AIAA 2013-2856". United States. doi:10.2514/6.2013-2856.
@article{osti_1567671,
title = {Load Balancing for Multiphysics. In: AIAA 2013-2856},
author = {Lohner, Rainald and Baum, Joseph D.},
doi = {10.2514/6.2013-2856},
place = {United States},
year = {2013},
month = {6}
}

