Summary: Pipeline and Batch Sharing in Grid Workloads
Douglas Thain, John Bent, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Miron Livny
Computer Sciences Department, University of Wisconsin, Madison
Abstract
We present a study of six batch-pipelined scientific
workloads that are candidates for execution on computational
grids. Whereas other studies focus on the behavior
of single applications, this study characterizes workloads
composed of pipelines of sequential processes that use file
storage for communication and also share significant data
across a batch. This study includes measurements of the
memory, CPU, and I/O requirements of individual components
as well as analyses of I/O sharing within complete
batches. We conclude with a discussion of the ramifications
of these workloads for end-to-end scalability and overall
system design.
1. Introduction
For many years, researchers have understood the importance
of studying workload characteristics in order to evaluate
their impact on current and future systems architec