Summary: Journal of Molecular Graphics and Modelling 22 (2004) 275-284
A method for quantifying and visualizing the
diversity of QSAR models
Sergei Izrailev, Dimitris K. Agrafiotis
3-Dimensional Pharmaceuticals, Inc., 8 Clarke Drive, Cranbury, NJ 08512, USA
Received 21 June 2003; received in revised form 10 October 2003; accepted 13 October 2003
Abstract
Feature selection is one of the most commonly used and reliable methods for deriving predictive quantitative structure-activity
relationships (QSAR). Many feature selection algorithms are stochastic in nature and often produce different solutions depending on the
initialization conditions. Because some features may be highly correlated, models that are based on different sets of descriptors may capture
essentially the same information; however, such models are difficult to recognize. Here, we introduce a measure of similarity between
QSAR models that captures the correlation between the underlying features. This measure can be used in conjunction with stochastic
proximity embedding (SPE) or multi-dimensional scaling (MDS) to create a meaningful visual representation of structure-activity model
space and aid in the post-processing and analysis of results of feature selection calculations.
© 2003 Elsevier Inc. All rights reserved.
Keywords: Stochastic proximity embedding; Multi-dimensional scaling; Nonlinear mapping; Feature selection; Point set similarity; Quantitative
structure-activity relationships; Data mining
1. Introduction
Quantitative structure-activity relationships (QSAR) are
mathematical models that relate the biological activity of a