Summary: Theseus: Categorization by Context
Giuseppe Attardi
Dipartimento di Informatica
Università di Pisa, Italy
attardi@di.unipi.it
Antonio Gullì
Ideare srl
Pisa, Italy
gulli@ideare.com
Fabrizio Sebastiani
Istituto di Elaborazione
dell'Informazione, Pisa, Italy
fabrizio@iei.pi.cnr.it
1. Introduction
The traditional approach to document categorization
is categorization by content: the information used to
categorize a document is extracted from the
document itself.
In a hypertext environment like the Web, the
structure of documents and the link topology can be