# System and method for anomaly detection

## Abstract

A system and method for detecting one or more anomalies in a plurality of observations is provided. In one illustrative embodiment, the observations are real-time network observations collected from a stream of network traffic. The method includes performing a discrete decomposition of the observations, and introducing derived variables to increase storage and query efficiencies. A mathematical model, such as a conditional independence model, is then generated from the formatted data. The formatted data is also used to construct frequency tables which maintain an accurate count of specific variable occurrence as indicated by the model generation process. The formatted data is then applied to the mathematical model to generate scored data. The scored data is then analyzed to detect anomalies.
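The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation. It assumes a very simple conditional independence model in which every variable depends only on one parent variable, and it scores an observation by its negative log-probability. The field names (`proto`, `port`, `bytes`), the bin edges, the smoothing constant `alpha`, and the threshold are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def discretize(value, edges):
    """Return the index of the bin that a numeric value falls into."""
    return sum(value >= e for e in edges)


def format_observation(obs):
    """Convert a raw observation into discrete variables (hypothetical fields)."""
    return {
        "proto": obs["proto"],
        # Derived variable: collapse thousands of ports into two classes.
        "port_class": "well_known" if obs["port"] < 1024 else "high",
        # Discrete decomposition of a numeric field into bins.
        "size_bin": discretize(obs["bytes"], edges=[100, 1000, 10000]),
    }


class ConditionalIndependenceModel:
    """Assumes each child variable is independent of the others given the parent."""

    def __init__(self, parent, children, alpha=1.0):
        self.parent = parent
        self.children = children
        self.alpha = alpha  # additive smoothing so unseen values keep nonzero probability
        self.total = 0
        # Frequency tables: only the counts the model structure needs.
        self.parent_counts = Counter()
        self.pair_counts = {c: Counter() for c in children}
        self.values = {v: set() for v in [parent, *children]}

    def update(self, row):
        """Add one formatted observation to the frequency tables."""
        p = row[self.parent]
        self.total += 1
        self.parent_counts[p] += 1
        self.values[self.parent].add(p)
        for c in self.children:
            self.pair_counts[c][(p, row[c])] += 1
            self.values[c].add(row[c])

    def _prob(self, count, denom, var):
        k = len(self.values[var]) + 1  # one extra slot for values never seen
        return (count + self.alpha) / (denom + self.alpha * k)

    def score(self, row):
        """Negative log-probability under the model; higher means more unusual."""
        p = row[self.parent]
        logp = math.log(self._prob(self.parent_counts[p], self.total, self.parent))
        for c in self.children:
            count = self.pair_counts[c][(p, row[c])]
            logp += math.log(self._prob(count, self.parent_counts[p], c))
        return -logp


# Build the frequency tables from some made-up "normal" traffic.
traffic = [{"proto": "tcp", "port": 80, "bytes": 500}] * 50
traffic += [{"proto": "udp", "port": 53, "bytes": 80}] * 50
model = ConditionalIndependenceModel("proto", ["port_class", "size_bin"])
for obs in traffic:
    model.update(format_observation(obs))


def is_anomalous(obs, threshold):
    """Flag an observation whose score exceeds a caller-chosen threshold."""
    return model.score(format_observation(obs)) > threshold


common = {"proto": "tcp", "port": 80, "bytes": 500}
rare = {"proto": "tcp", "port": 40000, "bytes": 50000}
common_score = model.score(format_observation(common))
rare_score = model.score(format_observation(rare))
```

In this sketch the frequency tables hold only the counts that the chosen model structure requires, which is one plausible reading of "as indicated by the model generation process". A real deployment would pick the threshold from the observed score distribution rather than hard-coding it.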

- Inventors:
- Scherrer, Chad

- Publication Date:
- June 2010

- Research Org.:
- Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

- Sponsoring Org.:
- USDOE

- OSTI Identifier:
- 1176372

- Patent Number(s):
- 7,739,082

- Application Number:
- 11/423,046

- Assignee:
- Battelle Memorial Institute (Richland, WA) PNNL

- DOE Contract Number:
- AC05-76RL01830

- Resource Type:
- Patent

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 97 MATHEMATICS AND COMPUTING

### Citation Formats

Scherrer, Chad. *System and method for anomaly detection*. United States: N. p., 2010. Web.

Scherrer, Chad. *System and method for anomaly detection*. United States.

```
Scherrer, Chad. Tue .
"System and method for anomaly detection". United States. https://www.osti.gov/servlets/purl/1176372.
```

```
@article{osti_1176372,
  title = {System and method for anomaly detection},
  author = {Scherrer, Chad},
  abstractNote = {A system and method for detecting one or more anomalies in a plurality of observations is provided. In one illustrative embodiment, the observations are real-time network observations collected from a stream of network traffic. The method includes performing a discrete decomposition of the observations, and introducing derived variables to increase storage and query efficiencies. A mathematical model, such as a conditional independence model, is then generated from the formatted data. The formatted data is also used to construct frequency tables which maintain an accurate count of specific variable occurrence as indicated by the model generation process. The formatted data is then applied to the mathematical model to generate scored data. The scored data is then analyzed to detect anomalies.},
  doi = {},
  journal = {},
  number = {},
  volume = {},
  place = {United States},
  year = {2010},
  month = {6}
}
```