Title: Identification of protein motifs using conserved amino acid properties and partitioning techniques

Abstract

Analyzing a set of protein sequences involves a fundamental relationship between the coherency of the set and the specificity of the motif that describes it. Motifs may be obscured by training sets that contain incoherent sequences, in part due to protein subclasses, contamination, or errors. We develop an algorithm for motif identification that systematically explores possible patterns of coherency within a set of protein sequences. Our algorithm constructs alternative partitions of the training set data, where one subset of each partition is presumed to contain coherent data and is used for forming a motif. The motif is represented by multiple overlapping amino acid groups based on evolutionary, biochemical, or physical properties. We demonstrate our method on a training set of reverse transcriptases that contains subclasses, sequence errors, misalignments, and contaminating sequences. Despite these complications, our program identifies a novel motif for the subclass of retroviral and retrovirus-related reverse transcriptases. This motif has a much higher specificity than previously reported motifs and suggests the importance of conserved hydrophilic and hydrophobic residues in the structure of reverse transcriptases.
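
The abstract gives no implementation details, so the following is only a toy sketch of the general idea it describes: try alternative subsets of an aligned training set, describe each alignment column by the smallest covering amino acid group, and keep the subset whose motif is most specific. Everything here is an assumption made for illustration and is not taken from the paper: the group table `GROUPS`, the log-based `specificity` score, the exhaustive subset enumeration, the `min_size` parameter, and the four-sequence example.

```python
import math
from itertools import combinations

# Hypothetical overlapping amino acid groups (illustrative only, not the paper's).
GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "aliphatic": set("VLI"),
    "aromatic": set("FWYH"),
    "polar": set("STNQYC"),
    "charged": set("DEKRH"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "small": set("GASCTPDNV"),
}


def column_group(residues):
    """Return (label, members) for the smallest group covering a column, else None."""
    observed = set(residues)
    if len(observed) == 1:
        # A fully conserved column is described by the residue itself.
        return next(iter(observed)), observed
    covering = [(len(m), name) for name, m in GROUPS.items() if observed <= m]
    if not covering:
        return None  # no group explains this column: leave it unconstrained
    _, name = min(covering)
    return name, GROUPS[name]


def build_motif(aligned):
    """Build one motif entry per alignment column from equal-length sequences."""
    return [column_group(col) for col in zip(*aligned)]


def specificity(motif):
    """Assumed score: smaller groups contribute more; unconstrained columns add 0."""
    return sum(math.log2(20 / len(e[1])) for e in motif if e is not None)


def best_partition(aligned, min_size):
    """Exhaustively try subsets of at least min_size sequences (toy sizes only)."""
    best = None
    for size in range(min_size, len(aligned) + 1):
        for subset in combinations(range(len(aligned)), size):
            motif = build_motif([aligned[i] for i in subset])
            key = (specificity(motif), size)  # ties go to the larger subset
            if best is None or key > best[0]:
                best = (key, subset, motif)
    _, subset, motif = best
    return subset, [e[0] if e is not None else "x" for e in motif]


if __name__ == "__main__":
    # Three related toy sequences plus one contaminant (the last).
    training = ["LKDA", "IKDA", "VRDG", "GPWS"]
    print(best_partition(training, min_size=3))
```

On this toy input the contaminating sequence is left out of the chosen subset, and the remaining three sequences yield a motif constrained at every column. Exhaustive enumeration grows exponentially with the number of sequences, so a realistic training set would need a guided search; the abstract does not say how the authors construct their partitions.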

Authors:
Wu, T D; Brutlag, D L [1]
  1. Stanford Univ., CA (United States)
Publication Date:
Research Org.:
Stanford Univ., CA (United States)
OSTI Identifier:
401870
Report Number(s):
CONF-9507246-
TRN: 96:005602-0048
Resource Type:
Technical Report
Resource Relation:
Conference: Intelligent Systems for Molecular Biology (ISMB) conference, Cambridge (United Kingdom), 16-19 Jul 1995; Other Information: PBD: 1995; Related Information: Is Part Of ISMB-95 -- Third international conference on intelligent systems for molecular biology: Proceedings; Rawlings, C.; Clark, D.; Altman, R.; Hunter, L.; Lengauer, T.; Wodak, S. [eds.]; PB: 427 p.
Country of Publication:
United States
Language:
English
Subject:
55 BIOLOGY AND MEDICINE, BASIC STUDIES; 99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; PROTEINS; S CODES; ERRORS; RESIDUES; AMINO ACIDS; ALGORITHMS; GENETIC MAPPING

Citation Formats

Wu, T D, and Brutlag, D L. Identification of protein motifs using conserved amino acid properties and partitioning techniques. United States: N. p., 1995. Web.
Wu, T D, & Brutlag, D L. Identification of protein motifs using conserved amino acid properties and partitioning techniques. United States.
Wu, T D, and Brutlag, D L. 1995. "Identification of protein motifs using conserved amino acid properties and partitioning techniques". United States.
@article{osti_401870,
title = {Identification of protein motifs using conserved amino acid properties and partitioning techniques},
author = {Wu, T D and Brutlag, D L},
abstractNote = {Analyzing a set of protein sequences involves a fundamental relationship between the coherency of the set and the specificity of the motif that describes it. Motifs may be obscured by training sets that contain incoherent sequences, in part due to protein subclasses, contamination, or errors. We develop an algorithm for motif identification that systematically explores possible patterns of coherency within a set of protein sequences. Our algorithm constructs alternative partitions of the training set data, where one subset of each partition is presumed to contain coherent data and is used for forming a motif. The motif is represented by multiple overlapping amino acid groups based on evolutionary, biochemical, or physical properties. We demonstrate our method on a training set of reverse transcriptases that contains subclasses, sequence errors, misalignments, and contaminating sequences. Despite these complications, our program identifies a novel motif for the subclass of retroviral and retrovirus-related reverse transcriptases. This motif has a much higher specificity than previously reported motifs and suggests the importance of conserved hydrophilic and hydrophobic residues in the structure of reverse transcriptases.},
doi = {},
url = {https://www.osti.gov/biblio/401870},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1995},
month = {12}
}

