Summary: Pipeline and Batch Sharing in Grid Workloads
Douglas Thain, John Bent, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Miron Livny
Computer Sciences Department, University of Wisconsin, Madison
Abstract
We present a study of six batch-pipelined scientific workloads that are
candidates for execution on computational grids. Whereas other studies
focus on the behavior of single applications, this study characterizes
workloads composed of pipelines of sequential processes that use file
storage for communication and also share significant data across a batch.
This study includes measurements of the memory, CPU, and I/O requirements
of individual components as well as analyses of I/O sharing within
complete batches. We conclude with a discussion of the ramifications of
these workloads for end-to-end scalability and overall system design.
1. Introduction
For many years, researchers have understood the importance of studying
workload characteristics in order to evaluate their impact on current
and future systems architec-