New approach to protein fold recognition based on Delaunay tessellation of protein structure
Abstract
We propose new algorithms for sequence-structure compatibility (fold recognition) searches in multidimensional sequence-structure space. Individual amino acid residues in protein structures are represented by their Cα atoms; thus each protein is described as a collection of points in three-dimensional space. Delaunay tessellation of a protein generates an aggregate of space-filling, irregular tetrahedra, or Delaunay simplices. Statistical analysis of quadruplet residue compositions of all Delaunay simplices in a representative dataset of protein structures leads to a novel four body contact residue potential expressed as log likelihood factor q. The q factors are calculated for native 20 letter amino acid alphabet and several reduced alphabets. Two sequence structure compatibility functions are computed as (i) the sum of q factors for all Delaunay simplices in a given protein, or (ii) 3D-1D Delaunay tessellation profiles where the individual residue profile value is calculated as the sum of q factors for all simplices that share this vertex residue. Both threading functions have been implemented in structure-recognizes-sequence and sequence-recognizes-structure protocols for protein fold recognition. We find that both profile and total score based threading functions can distinguish both the native fold from incorrect folds for a sequence, and the native sequence from non-native sequences for a fold.
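The two compatibility functions described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the formula for q, so the sketch assumes q is the base-10 logarithm of the observed frequency of a quadruplet composition divided by the frequency expected if residues were distributed at random (a multinomial term times the product of the individual residue frequencies). All function names are hypothetical, the Delaunay simplices are taken as already computed (each a tuple of four residue indices), and the score of 0.0 for compositions absent from the training data is likewise an assumption.

```python
import math
from collections import Counter


def composition(simplex, sequence):
    """Order-independent residue composition of one simplex (4 residue indices)."""
    return tuple(sorted(sequence[v] for v in simplex))


def multinomial_coefficient(comp):
    """Number of distinct orderings of a composition: 4! / prod(t_n!)."""
    denom = 1
    for count in Counter(comp).values():
        denom *= math.factorial(count)
    return math.factorial(len(comp)) // denom


def q_factors(training_set):
    """Estimate q for every composition seen in the training set.

    training_set is a list of (sequence, simplices) pairs.
    ASSUMED form: q = log10(observed frequency / expected random frequency).
    """
    quad_counts = Counter()
    residue_counts = Counter()
    for sequence, simplices in training_set:
        residue_counts.update(sequence)
        for simplex in simplices:
            quad_counts[composition(simplex, sequence)] += 1
    n_quads = sum(quad_counts.values())
    n_residues = sum(residue_counts.values())
    q = {}
    for comp, n in quad_counts.items():
        observed = n / n_quads
        expected = multinomial_coefficient(comp)
        for residue in comp:
            expected *= residue_counts[residue] / n_residues
        q[comp] = math.log10(observed / expected)
    return q


def total_score(sequence, simplices, q, default=0.0):
    """Function (i): sum of q over all simplices of the structure."""
    return sum(q.get(composition(s, sequence), default) for s in simplices)


def profile(sequence, simplices, q, default=0.0):
    """Function (ii): per-residue sum of q over the simplices sharing that vertex."""
    values = [0.0] * len(sequence)
    for simplex in simplices:
        value = q.get(composition(simplex, sequence), default)
        for vertex in simplex:
            values[vertex] += value
    return values
```

In a threading run, a candidate sequence would be mounted on the simplices of a fixed structure and scored with `total_score` or `profile`. The simplices themselves could be obtained from Cα coordinates with a tessellation routine such as `scipy.spatial.Delaunay`, whose `simplices` attribute lists the vertex indices of each tetrahedron for three-dimensional input.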
- Authors:
- Zheng, W; Cho, S J; Vaisman, I I; Tropsha, A
- Univ. of North Carolina, Chapel Hill, NC (United States)
- Publication Date:
- 1996
- OSTI Identifier:
- 549278
- Report Number(s):
- CONF-970132-
TRN: 97:005592-0048
- Resource Type:
- Conference
- Resource Relation:
- Conference: Pacific symposium on biocomputing '97, Kapalua, HI (United States), 6-9 Jan 1997; Other Information: PBD: 1996; Related Information: Is Part Of Pacific symposium on biocomputing '97: Proceedings; Altman, R.B. [ed.] [Stanford Univ., CA (United States). Section on Medical Informatics]; Dunker, A.K. [ed.] [Washington State Univ., Pullman, WA (United States). Dept. of Biochemistry and Biophysics]; Hunter, L. [ed.] [National Insts. of Health, Bethesda, MD (United States). National Library of Medicine]; Klein, T.E. [ed.] [California Univ., San Francisco, CA (United States). Dept. of Pharmaceutical Chemistry]; PB: 508 p.
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 55 BIOLOGY AND MEDICINE, BASIC STUDIES; 99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; DNA SEQUENCING; COMPATIBILITY; PROTEIN STRUCTURE; ALGORITHMS; THREE-DIMENSIONAL CALCULATIONS; NUMERICAL ANALYSIS; MATHEMATICAL MODELS; STRUCTURAL MODELS; PROTEINS; STRUCTURE-ACTIVITY RELATIONSHIPS; ELECTRONIC STRUCTURE; AMINO ACIDS; STATISTICS; ELECTRIC POTENTIAL
Citation Formats
Zheng, W, Cho, S J, Vaisman, I I, and Tropsha, A. New approach to protein fold recognition based on Delaunay tessellation of protein structure. United States: N. p., 1996. Web.
Zheng, W, Cho, S J, Vaisman, I I, & Tropsha, A. (1996). New approach to protein fold recognition based on Delaunay tessellation of protein structure. United States.
Zheng, W, Cho, S J, Vaisman, I I, and Tropsha, A. 1996. "New approach to protein fold recognition based on Delaunay tessellation of protein structure". United States.
@article{osti_549278,
title = {New approach to protein fold recognition based on Delaunay tessellation of protein structure},
author = {Zheng, W and Cho, S J and Vaisman, I I and Tropsha, A},
abstractNote = {We propose new algorithms for sequence-structure compatibility (fold recognition) searches in multidimensional sequence-structure space. Individual amino acid residues in protein structures are represented by their C$^{\alpha}$ atoms; thus each protein is described as a collection of points in three-dimensional space. Delaunay tessellation of a protein generates an aggregate of space-filling, irregular tetrahedra, or Delaunay simplices. Statistical analysis of quadruplet residue compositions of all Delaunay simplices in a representative dataset of protein structures leads to a novel four body contact residue potential expressed as log likelihood factor q. The q factors are calculated for native 20 letter amino acid alphabet and several reduced alphabets. Two sequence structure compatibility functions are computed as (i) the sum of q factors for all Delaunay simplices in a given protein, or (ii) 3D-1D Delaunay tessellation profiles where the individual residue profile value is calculated as the sum of q factors for all simplices that share this vertex residue. Both threading functions have been implemented in structure-recognizes-sequence and sequence-recognizes-structure protocols for protein fold recognition. We find that both profile and total score based threading functions can distinguish both the native fold from incorrect folds for a sequence, and the native sequence from non-native sequences for a fold. 25 refs., 4 figs., 1 tab.},
doi = {},
url = {https://www.osti.gov/biblio/549278},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1996},
month = {12}
}