Practical challenges that arise when clustering the web using spectral methods

Argimiro Arratia¹ and Carlos Marijuán²
¹ Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Spain,
² Matemática Aplicada, Universidad de Valladolid, Spain, marijuan@mat.uva.es
Abstract. This is a report on an implementation of a spectral clustering algorithm
for classifying very large internet sites, with special emphasis on the practical
problems encountered in developing such a data mining system. Remarkably, some of
these technical difficulties are due to fundamental issues pertaining to the
mathematics involved, and are not treated properly in the literature. Others are
inherent to the functions and numerical methods proper to the high-level technical
computing programming environment that we use. We will point out what these
practical challenges are and how to solve them.
Key words: spectral, clustering, internet
1 Introduction
Spectral clustering is a technique for partitioning data based on the spectrum
of a similarity matrix: a matrix which registers some pairwise similarity
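
To make the idea of partitioning by the spectrum of a similarity matrix concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation, which this excerpt does not show. It assumes a symmetric similarity matrix and uses one common variant: a two-way split by the sign of the eigenvector belonging to the second-smallest eigenvalue of the unnormalized graph Laplacian. The function name `spectral_bipartition` and the toy matrix are hypothetical.

```python
import numpy as np


def spectral_bipartition(similarity):
    """Split items into two groups from the spectrum of a similarity matrix.

    Assumption: `similarity` is symmetric with non-negative entries and
    describes a connected graph.
    """
    S = np.asarray(similarity, dtype=float)
    degrees = S.sum(axis=1)
    # Unnormalized graph Laplacian L = D - S (one of several common choices).
    laplacian = np.diag(degrees) - S
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    # Eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    fiedler = eigenvectors[:, 1]
    # The sign of each entry assigns the item to one of the two groups.
    return (fiedler > 0).astype(int)


# Toy data (hypothetical): items 0-2 are mutually similar, items 3-5 are
# mutually similar, and the two groups are joined by a single weak link.
S = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])
labels = spectral_bipartition(S)
print(labels)
```

Which group receives label 0 and which receives label 1 is arbitrary, because an eigenvector is determined only up to sign. A dense eigendecomposition like this is practical only for small matrices; data on the scale of a large web site would likely need sparse or iterative eigensolvers.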


Source: Arratia, Argimiro A. - Departamento de Matemática Aplicada, Universidad de Valladolid


Collections: Mathematics