Web document clustering using hyperlink structures

He, Xiaofeng; Zha, Hongyuan; Ding, Chris H Q; Simon, Horst D

doi:10.2172/815474


Technical Report · Mon May 07 04:00:00 EDT 2001


Author affiliation: He, Xiaofeng (LBNL Library)

With the exponential growth of information on the World Wide Web, there is great demand for efficient and effective methods for organizing and retrieving the information available. Document clustering plays an important role in information retrieval and taxonomy management for the World Wide Web and remains an interesting and challenging problem in the field of web computing. In this paper we consider document clustering methods that exploit textual information, hyperlink structure, and co-citation relations. In particular, we apply the normalized-cut clustering method developed in computer vision to the task of hyperdocument clustering. We also explore some theoretical connections between the normalized-cut method and the K-means method. We then experiment with the normalized-cut method in the context of clustering query result sets for web search engines.
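As an illustration of the normalized-cut idea mentioned in the abstract, the following is a minimal sketch of a two-way spectral partition, not the report's own implementation. It assumes a symmetric, non-negative document similarity matrix W is already available (how the report combines text, hyperlink, and co-citation information into such a matrix is not stated here). The function name, the toy matrix, and the choice to threshold the eigenvector at zero are all assumptions made for the example.

```python
import numpy as np


def normalized_cut_bipartition(W):
    """Two-way split via the spectral relaxation of normalized cut.

    W: symmetric, non-negative similarity matrix with positive row sums
    (an assumed input; each row/column stands for one document).
    Returns a boolean array assigning each document to one of two groups.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetrically normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(L_sym)
    # Map the second-smallest eigenvector back to the generalized problem
    # (D - W) x = lambda D x via x = D^{-1/2} v
    x = d_inv_sqrt * eigvecs[:, 1]
    # Assumed splitting rule: threshold at zero
    return x >= 0


# Hypothetical toy data: two tightly linked groups of three documents,
# joined by one weak similarity between documents 2 and 3.
W = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.01, 0.0, 0.0],
    [0.0, 0.0, 0.01, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])
labels = normalized_cut_bipartition(W)
print(labels)
```

Which group is labeled True depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. More than two clusters could be obtained by applying the split recursively or by running K-means on several eigenvectors, which is one common way the two methods are related in practice.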

Research Organization: Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)

Sponsoring Organization: USDOE Director, Office of Science. Office of Advanced Scientific Computing Research. Mathematical, Information, and Computational Sciences Division (US)

DOE Contract Number: AC03-76SF00098

OSTI ID: 815474

Report Number(s): LBNL--47971

Country of Publication: United States

Language: English

Similar Records

Querying the World Wide Web

Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID:535544

Thematic World Wide Web Visualization System

Software · Thu Oct 10 00:00:00 EDT 1996 · OSTI ID:1230362

Web document engineering

Conference · Wed May 01 00:00:00 EDT 1996 · OSTI ID:489679

Related Subjects

99 GENERAL AND MISCELLANEOUS
COMPUTERS
ENGINES
INFORMATION RETRIEVAL
MANAGEMENT
ORGANIZING
TAXONOMY
VISION
