Summary: Explicit Control in a Batch-Aware Distributed File System
John Bent, Douglas Thain,
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Miron Livny
Computer Sciences Department, University of Wisconsin, Madison
Abstract
We present the design, implementation, and evaluation
of the Batch-Aware Distributed File System (BAD-FS),
a system designed to orchestrate large, I/O-intensive
batch workloads on remote computing clusters distributed
across the wide area. BAD-FS consists of two novel components:
a storage layer that exposes control of traditionally fixed
policies such as caching, consistency, and
replication; and a scheduler that exploits this control as
necessary for different workloads. By extracting control
from the storage layer and placing it within an external
scheduler, BAD-FS manages both storage and computation
in a coordinated way while gracefully dealing with
cache consistency, fault-tolerance, and space management
issues in a workload-specific manner. Using both
microbenchmarks and real workloads, we demonstrate