UNIVERSITY OF REGINA
Department of Mathematics and Statistics
Graduate Student Seminar
Speaker: Yuxin (Sheena) Zhang
Date: 06 April 2006
Time: 2:30
Location: College West 307.18 (Math & Stats Lounge)
Title: Estimating the number of clusters via the Local Gap statistic
Abstract: Robert Tibshirani proposed an unsupervised gap method in "Estimating the
Number of Clusters in a Dataset via the Gap Statistic" in 2001. He chose the uniformity
hypothesis to create a reference null distribution and considered two approaches for
constructing the region of support of that distribution.
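For readers unfamiliar with the original gap statistic, the following is a minimal sketch
of it, not of the LGS method presented in the talk. It assumes the simpler of the two
support regions (a uniform draw over the bounding box of the observed features) and uses
a plain k-means as a stand-in clustering algorithm. The function names kmeans, log_wk and
gap_statistic, and the parameters k_max, n_refs and seed, are illustrative choices, not
taken from the abstract.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's algorithm (stand-in clusterer); returns one label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # fancy indexing copies
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    return labels

def log_wk(X, labels):
    """Log of the pooled within-cluster sum of squared distances to cluster means."""
    w = 0.0
    for j in np.unique(labels):
        members = X[labels == j]
        w += ((members - members.mean(axis=0)) ** 2).sum()
    return np.log(w)

def gap_statistic(X, k_max=5, n_refs=10, seed=0):
    """Return (estimated number of clusters, list of Gap(k) for k = 1..k_max)."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)  # bounding box used as the support region
    gaps, s = [], []
    for k in range(1, k_max + 1):
        observed = log_wk(X, kmeans(X, k, rng))
        ref = np.empty(n_refs)
        for b in range(n_refs):
            # Reference data set drawn under the uniformity hypothesis.
            R = rng.uniform(lo, hi, size=X.shape)
            ref[b] = log_wk(R, kmeans(R, k, rng))
        gaps.append(ref.mean() - observed)
        s.append(ref.std() * np.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}; fall back to k_max.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - s[k]:
            return k, gaps
    return k_max, gaps
```

Calling gap_statistic(X) on an (n, d) float array returns the estimated number of
clusters together with the Gap values. Because each value of k needs its own set of
reference clusterings, the cost grows with both k_max and n_refs.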
We propose the "Local Gap Statistic" (LGS) for estimating the number of clusters in a data set.
LGS gives a local criterion for deciding whether a data set can be separated, using the
local gap statistic measurement, which is compared with the rate of change of the relative
average distance in the clustered data set, standardized by an appropriately generated
reference null distribution. This technique can be used with any clustering algorithm.
Compared with the gap statistic method, LGS can be used on parallel computing
systems and can detect clusters of different densities.
Supervisor: D. Deng