Stochastic Proximity Embedding
DIMITRIS K. AGRAFIOTIS

Summary: Stochastic Proximity Embedding
3-Dimensional Pharmaceuticals, Inc., 665 Stockton Drive, Exton, Pennsylvania 19341
Received 13 September 2002; Accepted 5 November 2002
Abstract: We introduce stochastic proximity embedding (SPE), a novel self-organizing algorithm for producing
meaningful underlying dimensions from proximity data. SPE attempts to generate low-dimensional Euclidean embeddings
that best preserve the similarities between a set of related observations. The method starts with an initial
configuration, and iteratively refines it by repeatedly selecting pairs of objects at random, and adjusting their coordinates
so that their distances on the map match more closely their respective proximities. The magnitude of these adjustments
is controlled by a learning rate parameter, which decreases during the course of the simulation to avoid oscillatory
behavior. Unlike classical multidimensional scaling (MDS) and nonlinear mapping (NLM), SPE scales linearly with
respect to sample size, and can be applied to very large data sets that are intractable by conventional embedding
procedures. The method is programmatically simple, robust, and convergent, and can be applied to a wide range of
scientific problems involving exploratory data analysis and visualization.
© 2003 Wiley Periodicals, Inc. J Comput Chem 24: 1215–1221, 2003
Key words: stochastic proximity embedding; multidimensional scaling; nonlinear mapping; Sammon mapping;
stochastic descent; self-organizing; dimensionality reduction; feature extraction; combinatorial chemistry; data
mining; data analysis; pattern recognition; molecular descriptor; molecular similarity; molecular diversity
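
The refinement loop described in the abstract (random initial configuration, random pair selection, coordinate adjustment, decreasing learning rate) can be illustrated with a minimal Python sketch. The abstract does not give the exact update formula or the decay schedule, so the pairwise correction term, the linear decay from lr_start to lr_end, and every name below (spe_embed, n_cycles, steps_per_cycle, eps) are assumptions made for illustration, not the author's reference implementation.

```python
import numpy as np


def spe_embed(proximity, dim=2, n_cycles=100, steps_per_cycle=None,
              lr_start=1.0, lr_end=0.01, eps=1e-8, seed=0):
    """Sketch of a stochastic pairwise embedding of a proximity matrix.

    proximity: (n, n) symmetric array of target distances.
    Returns an (n, dim) array of coordinates.
    """
    proximity = np.asarray(proximity, dtype=float)
    rng = np.random.default_rng(seed)
    n = proximity.shape[0]
    if steps_per_cycle is None:
        # Assumption: a number of pair updates proportional to n per cycle,
        # so total work grows linearly with sample size.
        steps_per_cycle = 10 * n

    # Initial configuration: random points in the unit hypercube.
    coords = rng.random((n, dim))

    # Assumption: the learning rate decays linearly over the cycles.
    for lr in np.linspace(lr_start, lr_end, n_cycles):
        for _ in range(steps_per_cycle):
            i, j = rng.integers(0, n, size=2)
            if i == j:
                continue
            diff = coords[i] - coords[j]
            dist = np.linalg.norm(diff)
            # Move both points along their connecting line so that the
            # map distance shifts toward the target proximity.
            step = lr * 0.5 * (proximity[i, j] - dist) / (dist + eps)
            coords[i] += step * diff
            coords[j] -= step * diff
    return coords
```

Each pair update touches only two points and never needs the full distance matrix at once, which is consistent with the linear scaling claimed in the abstract. For a small check, passing the pairwise distances of a 3-4-5 triangle with dim=2 should return three points whose mutual distances are close to 3, 4, and 5.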
Converting distances to coordinates is a pervasive theme in many


Source: Agrafiotis, Dimitris K. - Molecular Design and Informatics Group, Johnson & Johnson Pharmaceutical Research and Development


Collections: Chemistry; Computer Technologies and Information Sciences