Data Outlier Detection using the Chebyshev Theorem
During data collection and analysis, it is often necessary to identify outliers and possibly remove them, and the choice of which points to remove should rest on an objective method. Many automated outlier detection methods exist, but many of them either assume a particular distribution or require pre-defined upper and lower boundaries within which the data should fall. If the distribution of the data is known, it can be used to help find outliers. Often, however, the distribution is not known, or the experimenter does not want to assume one. There may also be too little information about a data set to determine reliable upper and lower boundaries. For these cases, an outlier detection method was developed that uses the empirical data and is based on Chebyshev's inequality. The method can detect multiple outliers at once, not just one at a time.
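The idea of a distribution-free bound can be illustrated with a short sketch. Chebyshev's inequality states that, for any distribution with finite variance, at most 1/k^2 of the probability mass lies k or more standard deviations from the mean. Choosing a tolerated tail probability p therefore gives k = 1/sqrt(p). The code below is only a minimal, single-pass illustration of that bound, not the procedure from the paper: the function name `chebyshev_outliers`, the parameter `p`, and the use of the sample mean and sample standard deviation in place of the true values are all assumptions made for this example.

```python
import math
import statistics


def chebyshev_outliers(values, p=0.05):
    """Flag values outside mean +/- k*std, with k = 1/sqrt(p).

    Hypothetical helper for illustration only. It substitutes sample
    estimates for the true mean and standard deviation, so the
    Chebyshev bound holds only approximately.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if len(values) < 2:
        raise ValueError("need at least two values to estimate spread")

    # Chebyshev: P(|X - mu| >= k*sigma) <= 1/k**2, so set 1/k**2 = p.
    k = 1.0 / math.sqrt(p)

    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation

    lower = mean - k * std
    upper = mean + k * std

    # Every value outside the interval is flagged in the same pass,
    # so several outliers can be reported at once.
    flagged = [v for v in values if v < lower or v > upper]
    return flagged, (lower, upper)


if __name__ == "__main__":
    sample = [9.9] * 10 + [10.1] * 10 + [50.0]
    flagged, bounds = chebyshev_outliers(sample, p=0.1)
    print("bounds:", bounds)
    print("flagged:", flagged)
```

Because extreme values inflate the sample standard deviation, a single pass like this one can miss outliers in small samples or when several extreme values are present. Trimming the data and recomputing the limits is one way to reduce that effect.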
- Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 877016
- Report Number(s): PNWD-SA-6701; TRN: US200608%%248
- Resource Relation: Conference: 2005 IEEE Aerospace Conference, 1-6
- Country of Publication: United States
- Language: English