Summary: ParaTimer: A Progress Indicator for MapReduce DAGs
Kristi Morton, Magdalena Balazinska, and Dan Grossman
Computer Science and Engineering Department, University of Washington
Seattle, Washington, USA
{kmorton, magda, djg}@cs.washington.edu
ABSTRACT
Time-oriented progress estimation for parallel queries is a
challenging problem that has received only limited attention.
In this paper, we present ParaTimer, a new type of time-remaining
indicator for parallel queries. Several parallel data processing
systems exist. ParaTimer targets environments where declarative
queries are translated into ensembles of MapReduce jobs. ParaTimer
builds on previous techniques and makes two key contributions.
First, it estimates
the progress of queries that translate into directed acyclic
graphs of MapReduce jobs, where jobs on different paths
can execute concurrently (unlike prior work that looked at
sequences only). For such queries, we use a new type of
critical-path-based progress-estimation approach. Second,
ParaTimer handles a variety of real systems challenges such