Summary: A Practical Clustering Algorithm
for Static and Dynamic Information Organization
Javed Aslam Katya Pelekhov Daniela Rus
Dartmouth College
Abstract
We present and analyze the offline star algorithm for clustering
static information systems and the online star algorithm for
clustering dynamic information systems. These algorithms organize
a document collection into a number of clusters that is naturally
induced by the collection via a computationally efficient cover by
dense subgraphs. We further show a lower bound on the accuracy of
the clusters produced by these algorithms as well as demonstrate
that these algorithms are efficient (running times roughly linear in
the size of the problem). Finally, we provide data from a number of
experiments.
1 Introduction
We wish to create more versatile information capture
and access systems for digital libraries by using
information organization: thousands of electronic documents