U.S. Department of Energy
Office of Scientific and Technical Information

Stochastic motif extraction using hidden Markov model

Technical Report · OSTI ID: 377135
; ;  [1]
  1. Massively Parallel Systems NEC Lab., Kawasaki, Kanagawa (Japan)

In this paper, we study the application of a hidden Markov model (HMM) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, uses conditional probabilities to capture the stochastic nature of the motif. The HMM directly reflects the characteristics of the motif, such as periodic structure or grouping in a protein. To obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully connected network and iterates network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Whereas a symbolic pattern representation achieved a prediction accuracy of 14.8 percent, the HMM achieved 79.3 percent. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of protein databases: a constructed HMM indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination to be plausible.
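
The abstract compares sequences "in terms of likelihood" under a trained HMM. As a rough illustration of what such a score might look like, the sketch below computes log P(sequence | HMM) with the standard scaled forward algorithm. It is a generic textbook procedure, not the report's iterative duplication method, and the function name, the two-state toy model, and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) using the scaled forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i]     : dict mapping a symbol to its emission probability in state i
    """
    if not seq:
        return 0.0  # the empty sequence has probability 1
    n = len(start)
    # alpha[i] is proportional to P(symbols so far, current state = i)
    alpha = [start[i] * emit[i].get(seq[0], 0.0) for i in range(n)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * emit[j].get(seq[t], 0.0)
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot generate this sequence
        # Rescale to avoid underflow and accumulate the log of the scale factors
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical toy model over a two-letter alphabet (not taken from the report):
# state 0 favours leucine ("L"), state 1 favours alanine ("A").
TOY_START = [0.7, 0.3]
TOY_TRANS = [[0.8, 0.2], [0.4, 0.6]]
TOY_EMIT = [{"L": 0.9, "A": 0.1}, {"L": 0.2, "A": 0.8}]

if __name__ == "__main__":
    for candidate in ("LLAL", "AAAA"):
        score = forward_log_likelihood(candidate, TOY_START, TOY_TRANS, TOY_EMIT)
        print(candidate, score)
```

Under this kind of scoring, a sequence whose log-likelihood is far below that of other members of its annotated family would be a candidate for closer inspection, which is the spirit of the database validation described above.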

Research Organization:
Stanford Univ., CA (United States)
OSTI ID:
377135
Report Number(s):
CONF-9408117--
Country of Publication:
United States
Language:
English

Similar Records

Supervised learning of hidden Markov models for sequence discrimination
Conference · Sun Nov 30 23:00:00 EST 1997 · OSTI ID:549015

The N-terminal leucine-zipper motif in PTRF/cavin-1 is essential and sufficient for its caveolae-association
Journal Article · Thu Jan 15 23:00:00 EST 2015 · Biochemical and Biophysical Research Communications · OSTI ID:22416898

Behavior Detection using Confidence Intervals of Hidden Markov Models
Journal Article · Wed Dec 31 23:00:00 EST 2008 · IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics · OSTI ID:964702