Designing Size Consistent Statistics for Accurate Anomaly Detection in Dynamic Networks
Abstract
An important task in network analysis is the detection of anomalous events in a network time series. These events could merely be times of interest in the network timeline or they could be examples of malicious activity or network malfunction. Hypothesis testing using network statistics to summarize the behavior of the network provides a robust framework for the anomaly detection decision process. Unfortunately, choosing network statistics that are dependent on confounding factors like the total number of nodes or edges can lead to incorrect conclusions (e.g., false positives and false negatives). In this article, we describe the challenges that face anomaly detection in dynamic network streams regarding confounding factors. We also provide two solutions to avoiding error due to confounding factors: the first is a randomization testing method that controls for confounding factors, and the second is a set of size-consistent network statistics that avoid confounding due to the most common factors, edge count and node count.
- Authors:
- La Fond, Timothy; Neville, Jennifer; Gallagher, Brian
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Purdue Univ., West Lafayette, IN (United States)
- Publication Date:
- Research Org.:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1497323
- Report Number(s):
- LLNL-JRNL-738004
- Journal ID: ISSN 1556-4681; 889202
- Grant/Contract Number:
- AC52-07NA27344
- Resource Type:
- Accepted Manuscript
- Journal Name:
- ACM Transactions on Knowledge Discovery from Data
- Additional Journal Information:
- Journal Volume: 20; Journal Issue: 2; Journal ID: ISSN 1556-4681
- Publisher:
- Association for Computing Machinery
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
La Fond, Timothy, Neville, Jennifer, and Gallagher, Brian. Designing Size Consistent Statistics for Accurate Anomaly Detection in Dynamic Networks. United States: N. p., 2018. Web. doi:10.1145/3185059.
La Fond, Timothy, Neville, Jennifer, & Gallagher, Brian. Designing Size Consistent Statistics for Accurate Anomaly Detection in Dynamic Networks. United States. https://doi.org/10.1145/3185059
La Fond, Timothy, Neville, Jennifer, and Gallagher, Brian. Mon . "Designing Size Consistent Statistics for Accurate Anomaly Detection in Dynamic Networks". United States. https://doi.org/10.1145/3185059. https://www.osti.gov/servlets/purl/1497323.
@article{osti_1497323,
title = {Designing Size Consistent Statistics for Accurate Anomaly Detection in Dynamic Networks},
author = {La Fond, Timothy and Neville, Jennifer and Gallagher, Brian},
doi = {10.1145/3185059},
journal = {ACM Transactions on Knowledge Discovery from Data},
number = 2,
volume = 20,
place = {United States},
year = {2018},
month = {4}
}