Summary: Automated classification of A/E/C web content
R. Amor & K. Xu
Department of Computer Science, University of Auckland, Auckland, New Zealand
1 INTRODUCTION
In this paper, the adaptation of a standard information retrieval technique, namely latent semantic indexing, is examined for a domain-specific search engine. The premise behind this approach is that it is possible to accurately identify the classification codes related to the content of a web page or web site. If content can be accurately classified, then a user searching for content in a particular area (e.g. by specifying a classification code) will be presented only with highly relevant web information.
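The idea of assigning a classification code through latent semantic indexing can be illustrated with a minimal sketch. It is not the system examined in this paper: the training pages, the code labels (`CODE-CONCRETE`, `CODE-ROOFING`), the helper names, and the choice of `k = 2` latent dimensions are all hypothetical and chosen only for illustration. A new page is projected into the reduced space and given the code of its most similar training page.

```python
import numpy as np

# Hypothetical training pages, each labelled with a made-up classification code.
TRAINING_PAGES = [
    ("CODE-CONCRETE", "concrete cement slab"),
    ("CODE-CONCRETE", "concrete cement formwork"),
    ("CODE-ROOFING", "roof tiles gutter"),
    ("CODE-ROOFING", "roof tiles membrane"),
]


def term_vector(text, vocab):
    """Count occurrences of known vocabulary terms in the text."""
    vec = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


def classify(page_text, training=TRAINING_PAGES, k=2):
    """Return (code, similarity) of the closest training page in latent space."""
    words = sorted({w for _, text in training for w in text.lower().split()})
    vocab = {w: i for i, w in enumerate(words)}

    # Term-document matrix: one column per training page.
    A = np.column_stack([term_vector(text, vocab) for _, text in training])

    # Truncated SVD: keep only the k strongest latent dimensions.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k]

    # Project training pages and the new page onto the same k-dimensional space.
    docs = Uk.T @ A                               # shape (k, n_pages)
    query = Uk.T @ term_vector(page_text, vocab)  # shape (k,)

    # Cosine similarity between the new page and every training page.
    norms = np.linalg.norm(docs, axis=0) * np.linalg.norm(query)
    if not np.all(norms > 0):
        return None, 0.0  # the page shares no terms with the vocabulary
    sims = (docs.T @ query) / norms
    best = int(np.argmax(sims))
    return training[best][0], float(sims[best])
```

With this toy data, `classify("cement slab pour")` returns the `CODE-CONCRETE` label, since the unknown word "pour" is ignored and the remaining terms fall in the concrete-related latent dimension. A realistic setup would need term weighting, far more training text per code, and a value of `k` tuned on held-out pages.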
We classify against a standard classification code because such codes are used and understood by the vast majority of professionals within the A/E/C industries. Because a classification code has a well-described scope, it is likely to be understood similarly by professionals from many dis-