A Constant Time Algorithm for Estimating the Diversity of Large Chemical Libraries
Dimitris K. Agrafiotis
3-Dimensional Pharmaceuticals, Inc., 665 Stockton Drive, Exton, Pennsylvania 19341
Received July 11, 2000
We describe a novel diversity metric for use in the design of combinatorial chemistry and high-throughput
screening experiments. The method estimates the cumulative probability distribution of intermolecular
dissimilarities in the collection of interest and then measures the deviation of that distribution from the
respective distribution of a uniform sample using the Kolmogorov-Smirnov statistic. The distinct advantage
of this approach is that the cumulative distribution can be easily estimated using probability sampling and
does not require exhaustive enumeration of all pairwise distances in the data set. The function is intuitive,
very fast to compute, does not depend on the size of the collection, and can be used to perform diversity
estimates on both global and local scales. More importantly, it allows meaningful comparison of data sets of
different cardinality and is not affected by the curse of dimensionality, which plagues many other diversity
indices. The advantages of this approach are demonstrated using examples from the combinatorial chemistry
literature.
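
The idea summarized above can be illustrated with a minimal Python sketch. It is not the author's implementation: this excerpt does not give the sampling scheme, the dissimilarity measure, or how the uniform reference is built, so the sketch assumes Euclidean distance, descriptors scaled to the unit hypercube, a reference drawn from uniformly random points in that hypercube, and a two-sample Kolmogorov-Smirnov statistic. The names `sample_pair_distances`, `ks_statistic`, `diversity_deviation`, and `n_pairs` are hypothetical.

```python
import math
import random
from bisect import bisect_right


def sample_pair_distances(points, n_pairs, rng):
    """Sorted Euclidean distances of n_pairs randomly drawn distinct pairs."""
    dists = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(points)), 2)  # two distinct indices
        dists.append(math.dist(points[i], points[j]))
    return sorted(dists)


def ks_statistic(sorted_a, sorted_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all sample values."""
    d = 0.0
    for x in sorted_a + sorted_b:
        f_a = bisect_right(sorted_a, x) / len(sorted_a)  # empirical CDF of a
        f_b = bisect_right(sorted_b, x) / len(sorted_b)  # empirical CDF of b
        d = max(d, abs(f_a - f_b))
    return d


def diversity_deviation(points, n_pairs=2000, seed=0):
    """Deviation of the sampled distance distribution from a uniform reference.

    Assumption: descriptors lie in the unit hypercube. Smaller values mean
    the collection looks more like a uniform sample.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    observed = sample_pair_distances(points, n_pairs, rng)
    # Assumed reference: distances between uniformly random point pairs.
    reference = sorted(
        math.dist([rng.random() for _ in range(dim)],
                  [rng.random() for _ in range(dim)])
        for _ in range(n_pairs)
    )
    return ks_statistic(observed, reference)
```

Because only `n_pairs` distances are sampled, the cost of this sketch depends on the sample size rather than on the number of compounds, which is the property the abstract highlights. A tightly clustered collection should score close to 1, and a collection spread uniformly over the hypercube should score close to 0.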
INTRODUCTION
The measurement of molecular diversity has become an
issue of heated debate in recent years.1-4 Although the
concept is widely employed in the design of combinatorial
libraries, it has been surprisingly difficult to define, both

  

Source: Agrafiotis, Dimitris K. - Molecular Design and Informatics Group, Johnson & Johnson Pharmaceutical Research and Development

 

Collections: Chemistry; Computer Technologies and Information Sciences