OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns

Abstract

The problems of recurrent and anomalous pattern discovery in time series (motifs and discords, respectively) have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure but of different lengths may co-exist in a time series. To address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference, two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0, a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the discovery of recurrent and anomalous time series patterns.
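
The combination of symbolic discretization and grammar inference described in the abstract can be illustrated with a small sketch. The abstract does not name the specific methods, so everything here is an assumption made for illustration: the discretization is a SAX-style scheme (z-normalization, piecewise means, equiprobable Gaussian cut points), the grammar step is a naive greedy pair replacement in the spirit of Re-Pair, and the function names (sax_word, discretize, infer_rules, expand) are hypothetical and are not the GrammarViz API. Unlike the algorithms the abstract refers to, this naive version does not run in linear time.

```python
import math
from collections import Counter
from statistics import NormalDist


def znorm(window):
    """Z-normalize a window; a near-constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std < 1e-8:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def sax_word(window, paa_size, alphabet_size):
    """Turn one window into a word (assumes len(window) >= paa_size)."""
    z = znorm(window)
    n = len(z)
    # Cut points that split the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]
    letters = []
    for i in range(paa_size):
        chunk = z[i * n // paa_size:(i + 1) * n // paa_size]
        mean = sum(chunk) / len(chunk)
        letters.append(chr(ord("a") + sum(mean > c for c in cuts)))
    return "".join(letters)


def discretize(series, window, paa_size, alphabet_size):
    """Slide a window over the series, skipping consecutive duplicate words."""
    words = []
    for start in range(len(series) - window + 1):
        word = sax_word(series[start:start + window], paa_size, alphabet_size)
        if not words or words[-1][1] != word:
            words.append((start, word))
    return words


def infer_rules(tokens):
    """Greedy pair replacement: returns the compressed sequence and the rules."""
    seq = list(tokens)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand a rule name back into the list of terminal words it covers."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

In this sketch, a rule that expands to many words corresponds to a longer recurring subsequence and a rule that expands to few words to a shorter one, which is how a grammar can surface repeated patterns of different lengths without the length being supplied as input. Stretches of the series that no rule covers would be natural candidates for anomalies.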

Authors:
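Senin, Pavel [1]; Lin, Jessica [2]; Wang, Xing [2]; Oates, Tim [3]; Gandhi, Sunil [3]; Boedihardjo, Arnold P. [4]; Chen, Crystal [4]; Frankenstein, Susan [4]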
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Univ. of Hawai‘i at Mānoa, Honolulu, HI (United States). Collaborative Software Development Lab.
  2. George Mason Univ., Fairfax, VA (United States)
  3. Univ. of Maryland Baltimore County (UMBC), Baltimore, MD (United States)
  4. US Army Corps of Engineers (USACE) Engineer Research and Development Center (ERDC), Washington DC (United States)
Publication Date:
2018-02-23
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE; National Science Foundation (NSF)
OSTI Identifier:
1432619
Report Number(s):
LA-UR-17-21945
Journal ID: ISSN 1556-4681
Grant/Contract Number:
AC52-06NA25396; 1218325; 1218318
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
ACM Transactions on Knowledge Discovery from Data
Additional Journal Information:
Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 1556-4681
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; information systems; data mining; time series; algorithms; software; interactive data mining

Citation Formats

Senin, Pavel, Lin, Jessica, Wang, Xing, Oates, Tim, Gandhi, Sunil, Boedihardjo, Arnold P., Chen, Crystal, and Frankenstein, Susan. GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns. United States: N. p., 2018. Web. doi:10.1145/3051126.
Senin, Pavel, Lin, Jessica, Wang, Xing, Oates, Tim, Gandhi, Sunil, Boedihardjo, Arnold P., Chen, Crystal, & Frankenstein, Susan. (2018). GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns. United States. doi:10.1145/3051126.
Senin, Pavel, Lin, Jessica, Wang, Xing, Oates, Tim, Gandhi, Sunil, Boedihardjo, Arnold P., Chen, Crystal, and Frankenstein, Susan. 2018. "GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns". United States. doi:10.1145/3051126.
@article{osti_1432619,
title = {GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns},
author = {Senin, Pavel and Lin, Jessica and Wang, Xing and Oates, Tim and Gandhi, Sunil and Boedihardjo, Arnold P. and Chen, Crystal and Frankenstein, Susan},
abstractNote = {The problems of recurrent and anomalous pattern discovery in time series (motifs and discords, respectively) have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure but of different lengths may co-exist in a time series. To address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference, two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0, a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the discovery of recurrent and anomalous time series patterns.},
doi = {10.1145/3051126},
journal = {ACM Transactions on Knowledge Discovery from Data},
number = 1,
volume = 12,
place = {United States},
year = {2018},
month = {2}
}

Journal Article:
Free Publicly Available Full Text
This content will become publicly available on February 23, 2019
Publisher's Version of Record
