Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric
- Univ. of Rennes 1 (France); National Inst. of Research in Computer Science and Automation (INRIA), Rennes (France)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- National Research Inst. for Mathematics and Computer Science (CWI), Amsterdam (Netherlands). Life Sciences
- Univ. of Rennes 1 (France); National Inst. of Research in Computer Science and Automation (INRIA), Rennes (France)
- Univ. of Duisburg-Essen, Essen (Germany). Genome Informatics; Univ. of Lubeck (Germany). Inst. of Neurogenetics and for Integrative and Experimental Genomics. Platform for Genome Analytics
In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to non-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and, on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.
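The abstract's point that a metric lets a search skip some pairwise comparisons can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic pivot-based pruning scheme built on the triangle inequality, and the names `pruned_nearest_neighbor`, `toy_dist`, `pivot`, and `pivot_dists` are hypothetical. The toy distance (absolute difference between numbers) merely stands in for an expensive structure comparison such as max-CMO.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pruned_nearest_neighbor(
    query: T,
    database: Sequence[T],
    pivot: T,
    pivot_dists: Sequence[float],
    dist: Callable[[T, T], float],
) -> Tuple[int, float, int]:
    """Exact 1-NN search; returns (index, distance, number of dist calls).

    pivot_dists[i] must equal dist(pivot, database[i]); it can be
    precomputed once for the whole database.
    """
    d_query_pivot = dist(query, pivot)
    calls = 1
    # The triangle inequality gives a lower bound without evaluating
    # dist(query, x):  dist(query, x) >= |dist(query, pivot) - dist(pivot, x)|
    bounds = sorted(
        (abs(d_query_pivot - pd), i) for i, pd in enumerate(pivot_dists)
    )
    best_i, best_d = -1, float("inf")
    for lower, i in bounds:
        if lower >= best_d:
            break  # bounds are sorted, so no remaining item can be closer
        d = dist(query, database[i])
        calls += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, calls


def toy_dist(a: float, b: float) -> float:
    """Toy stand-in metric (absolute difference); NOT the max-CMO metric."""
    return abs(a - b)


database = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
pivot = 0.0
pivot_dists = [toy_dist(pivot, x) for x in database]
index, distance, calls = pruned_nearest_neighbor(
    10.4, database, pivot, pivot_dists, toy_dist
)
print(index, calls)
```

In this toy run the search evaluates the distance far fewer times than a brute-force scan of all six entries would. The pruning is only valid when `dist` satisfies the triangle inequality, which is why the metric property matters for this kind of search.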
- Research Organization:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE
- Contributing Organization:
- Univ. of Rennes 1 (France); National Inst. of Research in Computer Science and Automation (INRIA); National Research Inst. for Mathematics and Computer Science (CWI); Univ. of Duisburg-Essen, Essen (Germany); Univ. of Lubeck (Germany)
- Grant/Contract Number:
- AC52-06NA25396
- OSTI ID:
- 1329875
- Report Number(s):
- LA-UR--15-24867
- Journal Information:
- Journal Name: Algorithms; Journal Volume: 8; Journal Issue: 4; ISSN 1999-4893
- Publisher:
- MDPI
- Country of Publication:
- United States
- Language:
- English
