OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Complete fold annotation of the human proteome using a novel structural feature space

Journal Article · Scientific Reports
DOI: https://doi.org/10.1038/srep46321 · OSTI ID: 1366516
 [1];  [2];  [3]
  1. Univ. of Pennsylvania, Philadelphia, PA (United States). Genomics and Computational Biology Program
  2. Univ. of Pennsylvania, Philadelphia, PA (United States). Dept. of Computer Science
  3. Univ. of Pennsylvania, Philadelphia, PA (United States). Genomics and Computational Biology Program; Univ. of Pennsylvania, Philadelphia, PA (United States). Dept. of Biology

Recognition of a protein's structural fold is the starting point for many structure prediction tools and for protein function inference. Fold prediction is computationally demanding, and recognizing novel folds is difficult, so the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Finally, our method can be applied to de novo fold prediction of entire proteomes and can identify candidate novel fold families.

Research Organization: Krell Institute, Ames, IA (United States)
Sponsoring Organization: USDOE
Grant/Contract Number: FG02-97ER25308
OSTI ID: 1366516
Journal Information: Scientific Reports, Vol. 7; ISSN 2045-2322
Publisher: Nature Publishing Group
Country of Publication: United States
Language: English
Citation Metrics: Cited by 3 works (citation information provided by Web of Science)
