DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Detecting Anomalies from End-to-End Internet Performance Measurements (PingER) Using Cluster Based Local Outlier Factor

Abstract

PingER (Ping End-to-End Reporting) is a worldwide end-to-end Internet performance measurement framework. It was developed by the SLAC National Accelerator Laboratory, Stanford, USA and has been running for the last 20 years. It has more than 700 monitoring agents and remote sites which monitor the performance of Internet links in around 170 countries of the world. At present, the size of the compressed PingER data set is about 60 GB, comprising 100,000 flat files. The data is publicly available for valuable Internet performance analyses. However, the data sets suffer from missing values and anomalies due to congestion, bottleneck links, queuing overflow, network software misconfiguration, hardware failure, cable cuts, and social upheavals. Therefore, the objective of this paper is to detect such performance drops or spikes labeled as anomalies or outliers for the PingER data set. In the proposed approach, the raw text files of the data set are transformed into a PingER dimensional model. The missing values are imputed using the k-NN algorithm. The data is partitioned into similar instances using the k-means clustering algorithm. Afterward, clustering is integrated with the Local Outlier Factor (LOF) using the Cluster Based Local Outlier Factor (CBLOF) algorithm to detect the anomalies or outliers from the PingER data. Lastly, anomalies are further analyzed to identify the time frame and location of the hosts generating the major percentage of the anomalies in the PingER data set ranging from 1998 to 2016.
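
The clustering and scoring steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the toy two-feature points, the farthest-first centroid seeding, the distance-based form of the CBLOF score, and the parameter names alpha and beta (thresholds separating "large" from "small" clusters) are all assumptions made for illustration. The k-NN imputation step is omitted.

```python
import math

def init_centroids(points, k):
    # Assumption: deterministic farthest-first seeding, chosen so the demo is reproducible.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def cblof_scores(points, labels, centroids, alpha=0.9, beta=5.0):
    """Distance-based CBLOF: cluster size times distance to a large-cluster centroid."""
    n = len(points)
    sizes = {c: labels.count(c) for c in set(labels)}
    order = sorted(sizes, key=sizes.get, reverse=True)  # largest cluster first
    # Large clusters: the shortest prefix that covers alpha of the data,
    # or that ends where the size ratio to the next cluster reaches beta.
    large, covered = [], 0
    for i, c in enumerate(order):
        large.append(c)
        covered += sizes[c]
        nxt = order[i + 1] if i + 1 < len(order) else None
        if covered >= alpha * n or (nxt is not None and sizes[c] / sizes[nxt] >= beta):
            break
    scores = []
    for p, l in zip(points, labels):
        if l in large:
            d = math.dist(p, centroids[l])  # distance to own centroid
        else:
            d = min(math.dist(p, centroids[c]) for c in large)  # nearest large cluster
        scores.append(sizes[l] * d)
    return scores

# Hypothetical toy data: two dense groups plus one isolated point (last entry).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.4, 0.6),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0), (10.5, 10.5),
          (50.0, 50.0)]
labels, centroids = kmeans(points, k=3)
scores = cblof_scores(points, labels, centroids)
worst = max(range(len(points)), key=scores.__getitem__)
print("highest CBLOF score at index", worst, "point", points[worst])
```

In this sketch the isolated point ends up in a singleton "small" cluster, so its score is its distance to the nearest large-cluster centroid, which is far larger than the scores of points inside the dense groups. In practice the features would be scaled first and the thresholds tuned to the data.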

Authors:
Ali, Saqib [1]; Wang, Guojun [1]; Cottrell, Roger Leslie [2]; Anwar, Tayyba [3]
  1. Guangzhou Univ., Guangzhou (People's Republic of China)
  2. Stanford Linear Accelerator Center, Palo Alto, CA (United States)
  3. Univ. of Agriculture, Faisalabad (Pakistan)
Publication Date:
Research Org.:
SLAC National Accelerator Lab., Menlo Park, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1440521
Grant/Contract Number:  
[AC02-76SF00515]
Resource Type:
Accepted Manuscript
Journal Name:
IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum
Additional Journal Information:
[ Journal Volume: 2017; Conference: 15. IEEE International Symposium on Parallel and Distributed Processing with Applications and 16. IEEE International Conference on Ubiquitous Computing and Communications, Guangzhou (China), 12-15 Dec 2017]; Journal ID: ISSN 2164-7062
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Internet performance measurements; clustering; local outlier factor; anomaly detection

Citation Formats

Ali, Saqib, Wang, Guojun, Cottrell, Roger Leslie, and Anwar, Tayyba. Detecting Anomalies from End-to-End Internet Performance Measurements (PingER) Using Cluster Based Local Outlier Factor. United States: N. p., 2018. Web. doi:10.1109/ISPA/IUCC.2017.00150.
Ali, Saqib, Wang, Guojun, Cottrell, Roger Leslie, & Anwar, Tayyba. Detecting Anomalies from End-to-End Internet Performance Measurements (PingER) Using Cluster Based Local Outlier Factor. United States. doi:10.1109/ISPA/IUCC.2017.00150.
Ali, Saqib, Wang, Guojun, Cottrell, Roger Leslie, and Anwar, Tayyba. 2018. "Detecting Anomalies from End-to-End Internet Performance Measurements (PingER) Using Cluster Based Local Outlier Factor". United States. doi:10.1109/ISPA/IUCC.2017.00150. https://www.osti.gov/servlets/purl/1440521.
@article{osti_1440521,
title = {Detecting Anomalies from End-to-End Internet Performance Measurements (PingER) Using Cluster Based Local Outlier Factor},
author = {Ali, Saqib and Wang, Guojun and Cottrell, Roger Leslie and Anwar, Tayyba},
abstractNote = {PingER (Ping End-to-End Reporting) is a worldwide end-to-end Internet performance measurement framework. It was developed by the SLAC National Accelerator Laboratory, Stanford, USA and has been running for the last 20 years. It has more than 700 monitoring agents and remote sites which monitor the performance of Internet links in around 170 countries of the world. At present, the size of the compressed PingER data set is about 60 GB, comprising 100,000 flat files. The data is publicly available for valuable Internet performance analyses. However, the data sets suffer from missing values and anomalies due to congestion, bottleneck links, queuing overflow, network software misconfiguration, hardware failure, cable cuts, and social upheavals. Therefore, the objective of this paper is to detect such performance drops or spikes labeled as anomalies or outliers for the PingER data set. In the proposed approach, the raw text files of the data set are transformed into a PingER dimensional model. The missing values are imputed using the k-NN algorithm. The data is partitioned into similar instances using the k-means clustering algorithm. Afterward, clustering is integrated with the Local Outlier Factor (LOF) using the Cluster Based Local Outlier Factor (CBLOF) algorithm to detect the anomalies or outliers from the PingER data. Lastly, anomalies are further analyzed to identify the time frame and location of the hosts generating the major percentage of the anomalies in the PingER data set ranging from 1998 to 2016.},
doi = {10.1109/ISPA/IUCC.2017.00150},
journal = {IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum},
number = {},
volume = {2017},
place = {United States},
year = {2018},
month = {5}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
