U.S. Department of Energy
Office of Scientific and Technical Information

Toward a multi-sensor-based approach to automatic text classification

Technical Report · DOI: https://doi.org/10.2172/130610 · OSTI ID: 130610
 [1];  [2]
  1. Sacred Heart Univ., Fairfield, CT (United States)
  2. Oak Ridge National Lab., TN (United States)

Many automatic text indexing and retrieval methods use a term-document matrix that is automatically derived from the text in question. Latent Semantic Indexing (LSI) is a method, recently proposed in the Information Retrieval (IR) literature, for approximating a large and sparse term-document matrix with a relatively small number of factors, and is based on a solid mathematical foundation. LSI appears to be quite useful for text information retrieval, rather than for text classification. In this report, we outline a method that attempts to combine the strengths of the LSI method with those of neural networks to address the problem of text classification. In doing so, we also indicate ways to improve performance by adding "logical sensors" to the neural network, something that is hard to do with the LSI method when employed by itself. The various programs that can be used to test the system with the TIPSTER data set are described. Preliminary results are summarized, but much work remains to be done.
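
As an illustration of the idea in the abstract, the sketch below approximates a small term-document matrix with a few factors and appends one extra feature per document before the result would be passed to a classifier. It is not the report's implementation. It assumes that the factorization is a truncated singular value decomposition (the usual formulation of LSI), that NumPy is available, and that the toy counts, the function names `lsi_factors` and `document_vectors`, and the `extra` feature column (standing in for a "logical sensor") are all hypothetical.

```python
import numpy as np

def lsi_factors(term_doc, k):
    """Return the k largest SVD factors (U_k, s_k, Vt_k) of a term-document matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def document_vectors(s_k, Vt_k):
    """Represent each document as a k-dimensional vector, one row per document."""
    return (s_k[:, None] * Vt_k).T

# Toy term-document counts (made-up data): rows are terms, columns are documents.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
], dtype=float)

U_k, s_k, Vt_k = lsi_factors(A, k=2)
docs = document_vectors(s_k, Vt_k)      # shape: (4 documents, 2 factors)

# Hypothetical extra "logical sensor": one additional feature per document.
extra = np.array([[0.0], [1.0], [0.0], [1.0]])

# Candidate input for a neural network classifier: LSI factors plus the extra feature.
net_input = np.hstack([docs, extra])    # shape: (4 documents, 3 features)
```

Concatenating columns is only one simple way to combine the two kinds of input; the abstract does not say how the report's network combines them.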

Research Organization:
Oak Ridge National Lab., TN (United States)
Sponsoring Organization:
USDOE, Washington, DC (United States); Oak Ridge Inst. for Science and Education, TN (United States)
DOE Contract Number:
AC05-84OR21400
OSTI ID:
130610
Report Number(s):
ORNL/TM--13094; ON: DE96002202
Country of Publication:
United States
Language:
English
