Web document clustering using hyperlink structures
- LBNL Library
With the exponential growth of information on the World Wide Web, there is great demand for efficient and effective methods for organizing and retrieving the information available. Document clustering plays an important role in information retrieval and taxonomy management for the World Wide Web and remains an interesting and challenging problem in the field of web computing. In this paper, we consider document clustering methods that explore textual information, hyperlink structure, and co-citation relations. In particular, we apply the normalized-cut clustering method, developed in computer vision, to the task of hyperdocument clustering. We also explore some theoretical connections between the normalized-cut method and the K-means method. We then experiment with the normalized-cut method in the context of clustering query result sets for web search engines.
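To make the normalized-cut criterion mentioned in the abstract concrete, the following is a minimal sketch on a toy link graph. It assumes a symmetric, unweighted adjacency matrix and scores a bipartition (A, B) as cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V). The names `ncut`, `best_bipartition`, and `links`, as well as the six-page graph, are hypothetical illustrations and do not come from the paper. The exhaustive search used here stands in for the spectral relaxation normally used with normalized cuts, and the paper's actual similarity weights (text, hyperlinks, co-citation) are not reproduced.

```python
from itertools import combinations


def ncut(weights, part_a):
    """Normalized-cut value of the bipartition (part_a, remaining nodes)."""
    nodes = range(len(weights))
    a = set(part_a)
    b = set(nodes) - a
    # Total weight of links crossing between the two sides.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total weight of links from each side to every node in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in nodes)
    assoc_b = sum(weights[i][j] for i in b for j in nodes)
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")  # a side with no links at all is not a useful cluster
    return cut / assoc_a + cut / assoc_b


def best_bipartition(weights):
    """Brute-force the bipartition with the smallest normalized cut (toy sizes only)."""
    n = len(weights)
    best_score, best_part = float("inf"), None
    for size in range(1, n // 2 + 1):
        for part_a in combinations(range(n), size):
            score = ncut(weights, part_a)
            if score < best_score:
                best_score, best_part = score, set(part_a)
    return best_score, best_part


# Hypothetical graph: pages 0-2 link to each other, pages 3-5 link to each
# other, and a single link joins page 2 and page 3.
links = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

score, cluster = best_bipartition(links)
print(cluster, round(score, 4))
```

On this toy graph the search separates the two tightly linked groups at the single bridging link. Exhaustive enumeration grows exponentially with the number of pages, so it is only meant to illustrate the objective, not to suggest a practical method.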
- Research Organization:
- Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
- Sponsoring Organization:
- USDOE Director, Office of Science. Office of Advanced Scientific Computing Research. Mathematical, Information, and Computational Sciences Division (US)
- DOE Contract Number:
- AC03-76SF00098
- OSTI ID:
- 815474
- Report Number(s):
- LBNL--47971
- Country of Publication:
- United States
- Language:
- English
Similar Records
Querying the World Wide Web
Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID:535544
Thematic World Wide Web Visualization System
Software · Thu Oct 10 00:00:00 EDT 1996 · OSTI ID:1230362
Web document engineering
Conference · Wed May 01 00:00:00 EDT 1996 · OSTI ID:489679