Exploring the nonlinear geometry of protein homology
MICHAEL A. FARNUM, HUAFENG XU, AND DIMITRIS K. AGRAFIOTIS
 

3-Dimensional Pharmaceuticals, Inc., Exton, Pennsylvania 19341, USA
(RECEIVED March 12, 2003; FINAL REVISION May 10, 2003; ACCEPTED May 16, 2003)
Abstract
The explosion of biological data resulting from genomic and proteomic research has created a pressing need
for data analysis techniques that work effectively on a large scale. An area of particular interest is the
organization and visualization of large families of protein sequences. An increasingly popular approach is
to embed the sequences into a low-dimensional Euclidean space in a way that preserves some predefined
measure of sequence similarity. This method has been shown to produce maps that exhibit global order and
continuity and reveal important evolutionary, structural, and functional relationships between the embedded
proteins. However, protein sequences are related by evolutionary pathways that exhibit highly nonlinear
geometry, which is invisible to classical embedding procedures such as multidimensional scaling (MDS)
and nonlinear mapping (NLM). Here, we describe the use of stochastic proximity embedding (SPE) for
producing Euclidean maps that preserve the intrinsic dimensionality and metric structure of the data. SPE
extends previous approaches in two important ways: (1) it preserves only local relationships between closely
related sequences, thus allowing the map to unfold and reveal its intrinsic dimension, and (2) it scales
linearly with the number of sequences and therefore can be applied to very large protein families. The merits
of the algorithm are illustrated using examples from the protein kinase and nuclear hormone receptor
superfamilies.
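
The abstract does not give the update rule, so the following is only a minimal sketch of a pairwise-refinement scheme of the kind usually described as SPE, not the authors' implementation. It assumes a precomputed dissimilarity matrix `dist`, a neighborhood cutoff `r_cut`, and a learning rate that decays linearly over cycles; the function names, parameter names, and default values are all hypothetical. A random pair of points is drawn at each step, and the pair is adjusted only if it is a local pair (target distance within the cutoff) or if the two points currently sit closer on the map than their target distance.

```python
import numpy as np


def spe_embed(dist, dim=2, r_cut=np.inf, n_cycles=50, n_steps=1000,
              lam_start=1.0, lam_end=0.01, eps=1e-9, seed=0):
    """Sketch of stochastic pairwise refinement (assumed form of SPE).

    dist  : (n, n) symmetric matrix of target dissimilarities
    r_cut : neighborhood cutoff; only pairs with dist <= r_cut are
            pulled toward their target distance
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    rng = np.random.default_rng(seed)
    x = rng.random((n, dim))  # random starting coordinates

    # The learning rate decays linearly from lam_start to lam_end.
    for lam in np.linspace(lam_start, lam_end, n_cycles):
        # Draw n_steps random pairs (i, j) with i != j.
        i_idx = rng.integers(0, n, size=n_steps)
        j_idx = rng.integers(0, n - 1, size=n_steps)
        j_idx = j_idx + (j_idx >= i_idx)  # skip over i so that j != i
        for i, j in zip(i_idx, j_idx):
            r = dist[i, j]
            diff = x[i] - x[j]
            d = np.linalg.norm(diff)
            # Adjust local pairs always; adjust distant pairs only when
            # the map places them closer than their target distance.
            if r <= r_cut or d < r:
                step = lam * 0.5 * (r - d) / (d + eps) * diff
                x[i] += step
                x[j] -= step
    return x


def local_stress(x, dist, r_cut):
    """Normalized squared error over pairs with dist <= r_cut."""
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    mask = np.triu(dist <= r_cut, k=1)  # each local pair counted once
    return float(np.sum((d[mask] - dist[mask]) ** 2) / np.sum(dist[mask] ** 2))
```

Each update touches a single pair, so the cost of one cycle is proportional to `n_steps` rather than to the number of pairs. The sketch takes a full distance matrix only for brevity; building that matrix is itself quadratic, and computing dissimilarities on demand for the sampled pairs would avoid it. For protein families, `dist` would presumably come from a sequence dissimilarity measure, which is outside the scope of this sketch.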

  

Source: Agrafiotis, Dimitris K. - Molecular Design and Informatics Group, Johnson & Johnson Pharmaceutical Research and Development

 

Collections: Chemistry; Computer Technologies and Information Sciences