## GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns

## Abstract

The problems of recurrent and anomalous pattern discovery in time series, e.g., motifs and discords, respectively, have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure but of different lengths may co-exist in a time series. In order to address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference, two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0, a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the discovery of recurrent and anomalous time series patterns.
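The abstract names two building blocks, symbolic discretization and grammar inference, without showing them. The following is a minimal Python sketch of how the two could fit together, not the GrammarViz implementation. It assumes SAX-style discretization (z-normalization, piecewise aggregate approximation, equiprobable Gaussian breakpoints) over a sliding window, and a naive Re-Pair-style step that repeatedly replaces the most frequent adjacent pair of words with a new rule. The function names `sax_word`, `discretize`, and `repair_grammar` and their parameters are hypothetical, and this naive version does not achieve the linear running time the abstract refers to.

```python
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sax_word(window, paa_size, alphabet_size):
    """Turn one window of values into a SAX-style word (assumed scheme)."""
    mu, sigma = mean(window), pstdev(window)
    # Flat windows carry no shape information; map them to all zeros.
    z = [(v - mu) / sigma if sigma > 1e-8 else 0.0 for v in window]
    # Piecewise aggregate approximation: mean of each of paa_size segments.
    n = len(z)
    paa = [mean(z[i * n // paa_size:(i + 1) * n // paa_size])
           for i in range(paa_size)]
    # Breakpoints that cut the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]
    return "".join(chr(ord("a") + sum(p > c for c in cuts)) for p in paa)


def discretize(series, window, paa_size, alphabet_size):
    """Slide a window over the series; keep (start, word) when the word changes."""
    words = []
    for start in range(len(series) - window + 1):
        w = sax_word(series[start:start + window], paa_size, alphabet_size)
        if not words or words[-1][1] != w:  # skip consecutive repeats
            words.append((start, w))
    return words


def repair_grammar(tokens):
    """Naive Re-Pair-style inference: replace the most frequent pair with a rule."""
    seq, rules = list(tokens), {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # no pair repeats, nothing left to compress
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


if __name__ == "__main__":
    series = [0, 0, 0, 0, 10, 10, 10, 10] * 4
    words = discretize(series, window=8, paa_size=2, alphabet_size=2)
    compressed, rules = repair_grammar([w for _, w in words])
    print(words)
    print(compressed, rules)
```

In this sketch, each rule stands for a sequence of words that occurs more than once, so the stretches of the series it covers are candidate recurrent patterns. Because a rule can expand to a different number of words at different nesting levels, the candidates are not tied to a single fixed length.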

- Authors:

- Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Univ. of Hawai‘i at Mānoa, Honolulu, HI (United States). Collaborative Software Development Lab.
- George Mason Univ., Fairfax, VA (United States)
- Univ. of Maryland Baltimore County (UMBC), Baltimore, MD (United States)
- US Army Corps of Engineers (USACE) Engineer Research and Development Center (ERDC), Washington DC (United States)

- Publication Date:

- Research Org.:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

- Sponsoring Org.:
- USDOE; National Science Foundation (NSF)

- OSTI Identifier:
- 1432619

- Report Number(s):
- LA-UR-17-21945

Journal ID: ISSN 1556-4681

- Grant/Contract Number:
- AC52-06NA25396; 1218325; 1218318

- Resource Type:
- Accepted Manuscript

- Journal Name:
- ACM Transactions on Knowledge Discovery from Data

- Additional Journal Information:
- Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 1556-4681

- Publisher:
- Association for Computing Machinery

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 97 MATHEMATICS AND COMPUTING; information systems; data mining; time series; algorithms; software; interactive data mining

### Citation Formats

```
Senin, Pavel, Lin, Jessica, Wang, Xing, Oates, Tim, Gandhi, Sunil, Boedihardjo, Arnold P., Chen, Crystal, and Frankenstein, Susan. GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns. United States: N. p., 2018.
Web. doi:10.1145/3051126.
```

```
Senin, Pavel, Lin, Jessica, Wang, Xing, Oates, Tim, Gandhi, Sunil, Boedihardjo, Arnold P., Chen, Crystal, & Frankenstein, Susan. GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns. United States. doi:10.1145/3051126.
```

```
Senin, Pavel, Lin, Jessica, Wang, Xing, Oates, Tim, Gandhi, Sunil, Boedihardjo, Arnold P., Chen, Crystal, and Frankenstein, Susan. Fri .
"GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns". United States. doi:10.1145/3051126. https://www.osti.gov/servlets/purl/1432619.
```

```
@article{osti_1432619,
  title = {GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns},
  author = {Senin, Pavel and Lin, Jessica and Wang, Xing and Oates, Tim and Gandhi, Sunil and Boedihardjo, Arnold P. and Chen, Crystal and Frankenstein, Susan},
  abstractNote = {The problems of recurrent and anomalous pattern discovery in time series, e.g., motifs and discords, respectively, have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure but of different lengths may co-exist in a time series. In order to address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference, two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0, a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the discovery of recurrent and anomalous time series patterns.},
  doi = {10.1145/3051126},
  journal = {ACM Transactions on Knowledge Discovery from Data},
  number = 1,
  volume = 12,
  place = {United States},
  year = {2018},
  month = {2}
}
```