Summary: Explicit Control in a Batch-Aware Distributed File System
John Bent, Douglas Thain,
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Miron Livny
Computer Sciences Department, University of Wisconsin, Madison
Abstract
We present the design, implementation, and evaluation
of the Batch-Aware Distributed File System (BAD-FS),
a system designed to orchestrate large, I/O-intensive
batch workloads on remote computing clusters distributed
across the wide area. BAD-FS consists of two novel
components: a storage layer that exposes control of
traditionally fixed policies such as caching, consistency, and
replication; and a scheduler that exploits this control as
necessary for different workloads. By extracting control
from the storage layer and placing it within an external
scheduler, BAD-FS manages both storage and computation
in a coordinated way while gracefully dealing with
cache consistency, fault-tolerance, and space management
issues in a workload-specific manner. Using both
microbenchmarks and real workloads, we demonstrate