OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Anomaly detection and diagnosis in Grid environments.

Abstract

Identifying and diagnosing anomalies in application behavior is critical to delivering reliable application-level performance. In this paper we introduce a strategy to detect anomalies and diagnose the possible reasons behind them. Our approach extends the traditional window-based strategy by using signal-processing techniques to filter out recurring, background fluctuations in resource behavior. In addition, we have developed a diagnosis technique that uses standard monitoring data to determine which related changes in behavior may cause anomalies. We evaluate our anomaly detection and diagnosis technique by applying it in three contexts when we insert anomalies into the system at random intervals. The experimental results show that our strategy detects up to 96% of anomalies while reducing the false positive rate by up to 90% compared to the traditional window average strategy. In addition, our strategy can diagnose the reason for the anomaly approximately 75% of the time.
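
The idea of filtering recurring background fluctuations before applying a window-based detector can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the signal-processing technique, so the sketch assumes a known period and subtracts a per-phase median profile. The names `window_detect` and `remove_periodic`, the window size, the threshold `k`, and the synthetic trace are all hypothetical.

```python
from statistics import mean, median, pstdev

def window_detect(series, window=20, k=3.0):
    """Traditional window strategy: flag index i when series[i] deviates from
    the mean of the previous `window` samples by more than k std deviations."""
    flagged = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mu, sigma = mean(past), pstdev(past)
        if abs(series[i] - mu) > k * max(sigma, 1e-9):
            flagged.append(i)
    return flagged

def remove_periodic(series, period):
    """Assumed filter: subtract the median value observed at each phase of a
    known period, leaving only the non-recurring part of the signal."""
    profile = [median(series[p::period]) for p in range(period)]
    return [x - profile[i % period] for i, x in enumerate(series)]

# Synthetic resource trace (hypothetical): a recurring burst every 24 samples,
# small deterministic jitter, and one injected anomaly at index 130.
PERIOD = 24
series = []
for i in range(240):
    burst = 5.0 if i % PERIOD == 0 else 0.0
    jitter = 0.1 * ((i * 7) % 5 - 2)
    series.append(10.0 + burst + jitter)
series[130] += 8.0

raw = window_detect(series)                               # bursts count as anomalies
filtered = window_detect(remove_periodic(series, PERIOD)) # bursts filtered out first
print("window average only:", raw)
print("after periodic filter:", filtered)
```

On this synthetic trace the plain window detector flags the injected anomaly together with several recurring bursts (false positives), while the filtered version flags only index 130. Real monitoring data would need the period to be estimated rather than assumed.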

Authors:
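Yang, L.; Liu, C.; Schopf, J. M.; Foster, I.; Mathematics and Computer Science; Univ. of Chicago; Microsoft Corp.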
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
971154
Report Number(s):
ANL/MCS/CP-59796
TRN: US201003%%601
DOE Contract Number:
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: International Conference for High Performance Computing, Networking, Storage, and Analysis (SC07); Nov. 10, 2007 - Nov. 16, 2007; Reno, NV
Country of Publication:
United States
Language:
ENGLISH
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; DETECTION; DIAGNOSIS; FLUCTUATIONS; MONITORING; PERFORMANCE; STORAGE; WINDOWS

Citation Formats

Yang, L., Liu, C., Schopf, J. M., Foster, I., Mathematics and Computer Science, Univ. of Chicago, and Microsoft Corp. Anomaly detection and diagnosis in Grid environments. United States: N. p., 2007. Web. doi:10.1145/1362622.1362667.
Yang, L., Liu, C., Schopf, J. M., Foster, I., Mathematics and Computer Science, Univ. of Chicago, & Microsoft Corp. Anomaly detection and diagnosis in Grid environments. United States. doi:10.1145/1362622.1362667.
Yang, L., Liu, C., Schopf, J. M., Foster, I., Mathematics and Computer Science, Univ. of Chicago, and Microsoft Corp. "Anomaly detection and diagnosis in Grid environments." United States. doi:10.1145/1362622.1362667.
@article{osti_971154,
title = {Anomaly detection and diagnosis in Grid environments.},
author = {Yang, L. and Liu, C. and Schopf, J. M. and Foster, I. and Mathematics and Computer Science and Univ. of Chicago and Microsoft Corp.},
abstractNote = {Identifying and diagnosing anomalies in application behavior is critical to delivering reliable application-level performance. In this paper we introduce a strategy to detect anomalies and diagnose the possible reasons behind them. Our approach extends the traditional window-based strategy by using signal-processing techniques to filter out recurring, background fluctuations in resource behavior. In addition, we have developed a diagnosis technique that uses standard monitoring data to determine which related changes in behavior may cause anomalies. We evaluate our anomaly detection and diagnosis technique by applying it in three contexts when we insert anomalies into the system at random intervals. The experimental results show that our strategy detects up to 96% of anomalies while reducing the false positive rate by up to 90% compared to the traditional window average strategy. In addition, our strategy can diagnose the reason for the anomaly approximately 75% of the time.},
doi = {10.1145/1362622.1362667},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2007},
month = {1}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

Similar records:
  • The planned large-scale deployment of smart grid network devices will generate a large amount of information exchanged over various types of communication networks. The implementation of these critical systems will require appropriate cyber-security measures. A network anomaly detection solution is considered in this work. In common network architectures multiple communication streams are simultaneously present, making it difficult to build an anomaly detection solution for the entire system. In addition, common anomaly detection algorithms require specification of a sensitivity threshold, which inevitably leads to a tradeoff between false-positive and false-negative rates. In order to alleviate these issues, this paper proposes a novel anomaly detection architecture. The designed system applies the previously developed network security cyber-sensor method to individual selected communication streams, allowing accurate models of normal network behavior to be learned. Furthermore, the developed system dynamically adjusts the sensitivity threshold of each anomaly detection algorithm based on domain knowledge about the specific network system. It is proposed to model this domain knowledge using Interval Type-2 Fuzzy Logic rules, which linguistically describe the relationship between various features of the network communication and the possibility of a cyber attack. The proposed method was tested on an experimental smart grid system, demonstrating enhanced cyber-security.
  • The consolidation of cyber communications networks and physical control systems within the energy smart grid introduces a number of new risks. Unfortunately, these risks are largely unknown and poorly understood, yet include very high impact losses from attack and component failures. One important aspect of risk management is the detection of anomalies and changes. However, anomaly detection within cyber security remains a difficult, open problem, with special challenges in dealing with false alert rates and heterogeneous data. Furthermore, the integration of cyber and physical dynamics is often intractable. And, because of their broad scope, energy grid cyber-physical systems must be analyzed at multiple scales, from individual components up to network-level dynamics. We describe an improved approach to anomaly detection that combines three important aspects. First, system dynamics are modeled using a reduced-order model for greater computational tractability. Second, a probabilistic and principled approach to anomaly detection is adopted that allows for regulation of false alerts and comparison of anomalies across heterogeneous data sources. Third, a hierarchy of aggregations is constructed to support interactive and automated analyses of anomalies at multiple scales.
  • This report describes work with the goal of enhancing capabilities in computer intrusion detection. The work builds upon a study of classification performance that compared various methods of classifying information derived from computer network packets into attack versus normal categories, based on a labeled training dataset. This previous work validates our classification methods and clears the ground for studying whether and how anomaly detection can be used to enhance this performance. The DARPA project that initiated the dataset used here concluded that anomaly detection should be examined to boost the performance of machine learning in the computer intrusion detection task. This report investigates the data set for aspects that will be valuable for anomaly detection application, and supports these results with models constructed from the data. In this report, the term anomaly detection means learning a model from unlabeled data, and using this to make some inference about future data. Our data is a feature vector derived from network packets: an 'example' or 'sample'. On the other hand, classification means building a model from labeled data, and using that model to classify unlabeled (future) examples. There is some precedent in the literature for combining these methods. One approach is to stage the two techniques, using anomaly detection to segment data into two sets for classification. An interpretation of this is a method to combat nonstationarity in the data. In our previous work, we demonstrated that the data has substantial temporal nonstationarity. With classification methods that can be thought of as learning a decision surface between two statistical distributions, performance is expected to degrade significantly when classifying examples that are from regions not well represented in the training set. Anomaly detection can be seen as a problem of learning the density (landscape) or the support (boundary) of a statistical distribution so that this characterization can be compared to data points. Nonstationarity can then be thought of as data that departs from the support of the distribution. Since we can judge that these 'anomalous' examples will be classified poorly, we can treat them differently (or not at all). A second approach uses anomaly detection with an assumption that any examples that are different are suspicious, which is an assumption that may or may not be true in an application. We will call this the Outlier Assumption. With this assumption there are simply the performance gains to be had from combining models that have uncorrelated errors into an ensemble with better performance than any of the individual models. This family of techniques has many names, including model averaging, multiple regression, and the very popular boosting approaches. In this approach the two methods are 'peer' results, which are then combined to generate a final result. Staged anomaly detection with the outlier assumption can also be used to create data sub-categories into which the classification method is specifically tuned, or vice versa. This is an avenue for further work in this application area, and will not be demonstrated in this study. As in our previous work, this report does not attempt to address issues in dataset generation or feature selection. The details of the network and data collection process, as well as the way in which this 'raw data' is transformed into well-defined feature vectors, are a very important problem. However, that exploration is beyond the scope of this effort.
  • The Network Anomaly Detection and Intrusion Reporter (NADIR) is an expert system which is intended to provide real-time security auditing for intrusion and misuse detection at Los Alamos National Laboratory's Integrated Computing Network (ICN). It is based on three basic assumptions: that statistical analysis of computer system and user activities may be used to characterize normal system and user behavior, and that, given the resulting statistical profiles, behavior which deviates beyond certain bounds can be detected; that expert system techniques can be applied to security auditing and intrusion detection; and that successful intrusion detection may take place while monitoring a limited set of network activities such as user authentication and access control, file movement and storage, and job scheduling. NADIR has been developed to employ these basic concepts while monitoring the audited activities of more than 8000 ICN users.
  • In this paper we discuss two advanced techniques, process fault detection and nonlinear time series analysis, and apply them to the analysis of vector-valued and single-valued time-series data. We investigate model-based process fault detection methods for analyzing simulated, multivariate, time-series data from a three-tank system. The model predictions are compared with simulated measurements of the same variables to form residual vectors that are tested for the presence of faults (possible diversions in safeguards terminology). We evaluate two methods, testing all individual residuals with a univariate z-score and testing all variables simultaneously with the Mahalanobis distance, for their ability to detect loss of material in two different leak scenarios for the three-tank system: a leak without and with replacement of the lost volume. Nonlinear time-series analysis tools were compared with the linear methods popularized by Box and Jenkins. We compare prediction results using three nonlinear and two linear modeling methods on each of six simulated time series: two nonlinear and four linear. The nonlinear methods performed better at predicting the nonlinear time series and did as well as the linear methods at predicting the linear values.
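
The two residual tests named in the last record above (a univariate z-score on each residual and a Mahalanobis distance on the whole residual vector) can be contrasted with a small sketch. It is an illustration under stated assumptions, not that paper's code: it handles only two-variable residuals, the reference residuals are made up, and the threshold of 3 is a hypothetical choice.

```python
import math

def mean_and_cov(residuals):
    """Mean and population covariance (sxx, sxy, syy) of 2-D residual vectors."""
    n = len(residuals)
    mx = sum(r[0] for r in residuals) / n
    my = sum(r[1] for r in residuals) / n
    sxx = sum((r[0] - mx) ** 2 for r in residuals) / n
    syy = sum((r[1] - my) ** 2 for r in residuals) / n
    sxy = sum((r[0] - mx) * (r[1] - my) for r in residuals) / n
    return (mx, my), (sxx, sxy, syy)

def z_scores(r, mu, cov):
    """Univariate test: each residual scaled by its own standard deviation."""
    return ((r[0] - mu[0]) / math.sqrt(cov[0]),
            (r[1] - mu[1]) / math.sqrt(cov[2]))

def mahalanobis(r, mu, cov):
    """Multivariate test: distance that accounts for correlation between the
    two residuals, using the closed-form inverse of the 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = r[0] - mu[0], r[1] - mu[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Made-up fault-free residuals: the two variables normally move together.
reference = [(-2, -2), (-1, -1), (1, 1), (2, 2), (0.5, -0.5), (-0.5, 0.5)]
mu, cov = mean_and_cov(reference)

THRESHOLD = 3.0            # hypothetical alarm level for both tests
suspect = (1.5, -1.5)      # each value is ordinary, but the pair breaks the correlation
consistent = (1.5, 1.5)    # same magnitudes, in line with the usual correlation

for name, r in (("suspect", suspect), ("consistent", consistent)):
    zs = z_scores(r, mu, cov)
    d = mahalanobis(r, mu, cov)
    print(name, "z-scores:", zs, "Mahalanobis:", d,
          "z alarm:", any(abs(z) > THRESHOLD for z in zs),
          "Mahalanobis alarm:", d > THRESHOLD)
```

In this toy data the suspect vector passes both per-variable z-score checks yet exceeds the Mahalanobis threshold, because only the joint test sees that the two residuals moved in opposite directions.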