Complete fold annotation of the human proteome using a novel structural feature space
- Univ. of Pennsylvania, Philadelphia, PA (United States). Genomics and Computational Biology Program
- Univ. of Pennsylvania, Philadelphia, PA (United States). Dept. of Computer Science
- Univ. of Pennsylvania, Philadelphia, PA (United States). Genomics and Computational Biology Program; Univ. of Pennsylvania, Philadelphia, PA (United States). Dept. of Biology
Recognition of a protein's structural fold is the starting point for many structure prediction tools and for protein function inference. Fold prediction is computationally demanding, and recognizing novel folds is difficult, so the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Finally, our method can be applied to de novo fold prediction of entire proteomes and can identify candidate novel fold families.
- Research Organization:
- Krell Institute, Ames, IA (United States)
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- FG02-97ER25308
- OSTI ID:
- 1366516
- Journal Information:
- Scientific Reports, Vol. 7; ISSN 2045-2322
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United States
- Language:
- English
Web of Science
- Comprehensive catalog of dendritically localized mRNA isoforms from sub-cellular sequencing of single mouse neurons (journal, January 2019)
- Comprehensive catalog of dendritically localized mRNA isoforms from sub-cellular sequencing of single mouse neurons (posted content, March 2018)
Similar Records
Superfamily Assignments for the Yeast Proteome through Integration of Structure Prediction with the Gene Ontology
Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium