Title: Automated discovery of active motifs in multiple RNA secondary structures

Conference
OSTI ID:421258
;  [1];  [2]
  1. New Jersey Inst. of Technology, Newark, NJ (United States)
  2. National Inst. of Health, Frederick, MD (United States); and others

In this paper we present a method for discovering approximately common motifs (also known as active motifs) in multiple RNA secondary structures. The secondary structures can be represented as ordered trees (i.e., trees in which the order among siblings matters). Motifs in these trees are connected subgraphs whose occurrences can differ by substitutions as well as by deletions and insertions. The proposed method consists of two steps: (1) find candidate motifs in a small sample of the secondary structures; (2) search all of the secondary structures to determine how frequently these motifs occur, within the allowed approximation. To reduce the running time, we develop two optimization heuristics based on sampling and pattern matching techniques. Experimental results on both generated data and RNA secondary structures show that the algorithms perform well. To demonstrate the utility of our algorithms, we discuss their application to phylogenetic studies of RNA sequences obtained from GenBank.
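
The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It makes several simplifying assumptions: trees are nested tuples `(label, children)`, candidate motifs are restricted to complete subtrees rather than arbitrary connected subgraphs, and approximate tree matching is replaced by a much cruder stand-in, namely string edit distance between preorder label sequences. All function names, parameters, and node labels here are hypothetical.

```python
import random

# Assumed representation: a tree is (label, (child, child, ...)).
# Child order is significant, so these are ordered trees.

def subtrees(tree):
    """Yield every complete subtree (a node plus all of its descendants)."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

def preorder(tree):
    """Flatten a tree into its preorder sequence of labels."""
    label, children = tree
    seq = [label]
    for child in children:
        seq.extend(preorder(child))
    return seq

def edit_distance(a, b):
    """Levenshtein distance (substitute, insert, delete) on label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def occurs_in(motif, tree, max_dist):
    """True if some complete subtree of `tree` is within max_dist of the motif.

    Stand-in only: a real implementation would use an ordered-tree
    edit distance instead of comparing preorder sequences.
    """
    m = preorder(motif)
    return any(edit_distance(m, preorder(s)) <= max_dist for s in subtrees(tree))

def discover_motifs(trees, sample_size, min_size, max_dist, min_occur, seed=0):
    """Step 1: collect candidates from a sample. Step 2: count them in all trees."""
    rng = random.Random(seed)
    sample = rng.sample(trees, min(sample_size, len(trees)))
    candidates = {s for t in sample for s in subtrees(t)
                  if len(preorder(s)) >= min_size}
    results = {}
    for motif in candidates:
        count = sum(occurs_in(motif, t, max_dist) for t in trees)
        if count >= min_occur:
            results[motif] = count
    return results

# Toy structures with made-up single-letter labels.
t1 = ("M", (("R", (("H", ()),)), ("B", ())))
t2 = ("N", (("R", (("H", ()),)),))
t3 = ("M", (("R", (("I", ()),)),))
rh = ("R", (("H", ()),))

exact = discover_motifs([t1, t2, t3], sample_size=3, min_size=2,
                        max_dist=0, min_occur=2)
approx = discover_motifs([t1, t2, t3], sample_size=3, min_size=2,
                         max_dist=1, min_occur=2)
```

With exact matching (`max_dist=0`) only the `R`-over-`H` subtree recurs in at least two structures; allowing one edit lets it also match the `R`-over-`I` subtree in the third structure. Drawing candidates from a small sample keeps step 1 cheap, at the cost of missing motifs that never appear in the sample.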

Report Number(s): CONF-960830-; TRN: 96:005928-0013
Resource Relation: Conference: 2. international conference on knowledge discovery and data mining, Portland, OR (United States), 2-4 Aug 1996; Other Information: PBD: 1996; Related Information: Is Part Of Proceedings of the second international conference on knowledge discovery & data mining; Simoudis, E.; Han, J.; Fayyad, U. [eds.]; PB: 405 p.
Country of Publication: United States
Language: English