Summary: Towards Highly Reliable Enterprise Network Services
Via Inference of Multi-level Dependencies
Paramvir Bahl, Ranveer Chandra, Albert Greenberg, Srikanth Kandula,
David A. Maltz, Ming Zhang
Microsoft Research
Abstract
Localizing the sources of performance problems in large enterprise
networks is extremely challenging. Dependencies are numerous,
complex and inherently multi-level, spanning hardware and software
components across the network and the computing infrastructure.
To exploit these dependencies for fast, accurate problem localization,
we introduce an Inference Graph model, which is well-adapted
to user-perceptible problems rooted in conditions giving
rise to both partial service degradation and hard faults. Further, we
introduce the Sherlock system to discover Inference Graphs in the
operational enterprise, infer critical attributes, and then leverage the
result to automatically detect and localize problems. To illuminate
strengths and limitations of the approach, we provide results from a
prototype deployment in a large enterprise network, as well as from
testbed emulations and simulations. In particular, we find that tak-