Evaluation of Techniques to Detect Significant Network Performance Problems using End-to-End Active Network Measurements
End-to-end detection of faults and performance problems in wide-area production networks is becoming increasingly hard as the complexity of the paths, the diversity of the performance, and the dependency on the network increase. Several monitoring infrastructures have been built to monitor different network metrics and collect monitoring information from thousands of hosts around the globe. Typically, hundreds to thousands of time-series plots of network metrics must be examined to identify network performance problems or anomalous variations in the traffic. Furthermore, most commercial products rely on comparison with user-configured static thresholds and often require access to SNMP MIB information, to which a typical end user does not usually have access. In this paper we propose new techniques that detect network performance problems proactively, in close to real time, without relying on static thresholds or SNMP MIB information. We describe and compare several algorithms that we have implemented to detect persistent network problems by analyzing anomalous variations in real end-to-end Internet performance measurements. We also provide methods and/or guidance for setting the user-settable parameters. The measurements are based on active probes running on 40 production network paths with bottlenecks varying from 0.5 Mbit/s to 1000 Mbit/s. For well-behaved data (no missed measurements and no very large outliers) with small seasonal changes, most algorithms identify similar events. We compare the algorithms' robustness with respect to false positives and missed events, especially when there are large seasonal effects in the data. Our proposed techniques cover a wide variety of network paths and traffic patterns. We also discuss the algorithms in terms of their intuitiveness, their speed of execution as implemented, and their areas of applicability.
We compare and evaluate the accuracy of our detection techniques when applied to step changes (down and up), diurnal changes, and congestion effects; the results are encouraging.
- Research Organization: SLAC National Accelerator Lab., Menlo Park, CA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC02-76SF00515
- OSTI ID: 875813
- Report Number(s): SLAC-PUB-11653; TRN: US200603%%268
- Resource Relation: Conference: Contributed to 10th IEEE / IFIP Network Operations and Management Symposium (NOMS 2006), Vancouver, Canada, 3-7 Apr 2006
- Country of Publication: United States
- Language: English