Summary: On the Use of Information Theory for Assessing Molecular Diversity
Dimitris K. Agrafiotis
3-Dimensional Pharmaceuticals, Inc., 665 Stockton Drive, Suite 104, Exton, Pennsylvania 19341
Received November 13, 1996
In a recent article published in Molecules, Lin presented a novel approach for assessing molecular diversity
based on Shannon's information theory. In this method, a set of compounds is viewed as a static collection
of microstates which can register information about their environment at some predetermined capacity.
Diversity is directly related to the information conveyed by the population, as quantified by Shannon's
classical entropy equation. Despite its intellectual appeal, this method is characterized by a strong tendency
to oversample remote areas of the feature space and produce unbalanced designs. This paper demonstrates
this limitation with some simple examples and provides a rationale for the failure of the method to produce
results that are consistent with other traditional methodologies.
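As a minimal illustration of the entropy measure mentioned above, the sketch below computes Shannon's entropy, H = -sum(p_i log2 p_i), for a set of discrete labels. It is not Lin's procedure: the function name `shannon_entropy`, the treatment of each compound as a label naming a feature-space bin, and the two example populations are assumptions made here purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return the Shannon entropy, in bits, of a sequence of discrete labels.

    Each distinct label is treated as one state; p_i is the fraction of
    items carrying that label (an illustrative assumption, not a definition
    taken from the article).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty population conveys no information
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical populations: four compounds assigned to feature-space bins.
uniform = ["a", "b", "c", "d"]   # every compound in its own bin
skewed = ["a", "a", "a", "b"]    # three compounds share one bin

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)
```

Under these assumptions the evenly spread population attains the maximum entropy for four states (2 bits), while the clustered one scores lower, which is the sense in which entropy is read as a diversity measure.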
INTRODUCTION
In a recent article published in Molecules,1 Lin proposed
a new method for assessing molecular diversity based on
the principles of information theory, as it was first formalized
by Shannon.2 In Lin's method, a collection of compounds
is viewed as a static molecular assemblage or collection of
microstates which can register information about their