| | |
Summary: Localized Diffusion, Part II: Coarse-Grained Process
Guy Wolf^a, Aviv Rotbart^a, Gil David^b, Amir Averbuch^a
^a School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
^b Department of Mathematics, Program in Applied Mathematics, Yale University, New Haven, CT 06510, USA
Abstract
Data-analysis methods are now expected to handle increasingly large amounts of
data. Such massive datasets often contain many redundancies. One effect of these
redundancies is the high dimensionality of the datasets, which is handled by
dimensionality-reduction techniques. Another effect is the duplication of very
similar observations (or data points), which can be analyzed together as a
cluster. We propose an approach for dealing with both effects by coarse-graining
the popular Diffusion Maps (DM) dimensionality-reduction framework from the
data-point level to the cluster level. This way, the size of the analyzed dataset
is decreased by referring only to clusters instead of individual data points.
Then, the dimensionality of the dataset can be decreased by the DM embedding. We
show that the essential properties (e.g., ergodicity) of the
|