Data-Driven Batch Scheduling
by
John Bent
A dissertation submitted
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
(Computer Sciences)
at the
University of Wisconsin - Madison
2005
Abstract
In this thesis, we present a data-driven batch scheduling system. Current CPU-centric batch schedulers ignore
the data needs of workloads and execute jobs by linking them transparently and directly to the data they need.
When jobs are scheduled on remote computational resources, this elegant solution of direct data access can incur an
order-of-magnitude performance penalty for data-intensive workloads.
To concretely motivate this problem, we provide here a detailed analysis of six current data-intensive, scientific,
batch workloads. From this analysis, we derive quantitative bounds on expected scalability and demonstrate the
infeasibility of scheduling these workloads using current CPU-centric systems that lack data-awareness.

  

Source: Arpaci-Dusseau, Andrea - Department of Computer Sciences, University of Wisconsin at Madison

 

Collections: Computer Technologies and Information Sciences