Method and system of filtering and recommending documents
Patent · OSTI ID: 1237854
Disclosed is a method and system for discovering documents using a computer and bringing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The results of the comparison are then filtered to reduce the number of documents; the documents that remain define an initial subset of the source documents. Initial subset vectors are generated and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity value, the method then recommends one or more of the source documents.
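The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices are assumptions made for illustration only: the weight formula log(1 + tf) * log((N + 1) / (n + 1)) is one published formulation of TF-ICF, cosine similarity stands in for the unspecified "similarity value", and the tokenizer, the function names, and the parameters `top_terms`, `min_hits`, and `top_n` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def tf_icf_vector(text, corpus_doc_freq, corpus_size):
    # Assumed TF-ICF weight: log(1 + tf) * log((N + 1) / (n + 1)), where N is
    # the size of a fixed reference corpus and n is the number of corpus
    # documents that contain the term.
    counts = Counter(tokenize(text))
    return {
        term: math.log(1 + tf)
        * math.log((corpus_size + 1) / (corpus_doc_freq.get(term, 0) + 1))
        for term, tf in counts.items()
    }


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_keyword_index(sources):
    # Inverted index: term -> set of source document ids containing it.
    index = defaultdict(set)
    for doc_id, text in sources.items():
        for term in set(tokenize(text)):
            index[term].add(doc_id)
    return index


def recommend(seed_text, sources, corpus_doc_freq, corpus_size,
              top_terms=5, min_hits=1, top_n=3):
    # 1. Seed document vector with TF-ICF weights.
    seed_vec = tf_icf_vector(seed_text, corpus_doc_freq, corpus_size)
    # 2. Compare the highest-weighted seed terms with the keyword index.
    keywords = sorted(seed_vec, key=seed_vec.get, reverse=True)[:top_terms]
    index = build_keyword_index(sources)
    hits = Counter()
    for term in keywords:
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    # 3. Filter: keep only documents with enough keyword hits.
    subset = [doc_id for doc_id, n in hits.items() if n >= min_hits]
    # 4. Vectorize the initial subset and score each against the seed.
    scored = [
        (cosine(seed_vec,
                tf_icf_vector(sources[doc_id], corpus_doc_freq, corpus_size)),
         doc_id)
        for doc_id in subset
    ]
    # 5. Recommend the most similar documents.
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_n]]


# Toy data, invented for this example.
corpus_doc_freq = {"the": 90, "of": 80, "and": 85}
sources = {
    "a": "neutron scattering of crystals",
    "b": "cooking pasta and sauce",
    "c": "neutron detectors and scattering experiments",
}
result = recommend("neutron scattering experiments", sources,
                   corpus_doc_freq, corpus_size=100)
```

With this toy data, document "b" shares no keywords with the seed and is dropped at the filtering step, while "c" ranks ahead of "a" because it overlaps with all three seed terms. Because the corpus frequencies come from a fixed reference corpus rather than from the source documents themselves, each document can be vectorized independently of the others.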
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-00OR22725
- Assignee:
- UT-Battelle LLC (Oak Ridge, TN)
- Patent Number(s):
- 9,256,649
- Application Number:
- 13/920,803
- OSTI ID:
- 1237854
- Country of Publication:
- United States
- Language:
- English
Similar Records
Automatic generation of stop word lists for information retrieval and analysis
Method and system to discover and recommend interesting documents
Patent · 2013 · OSTI ID: 1082869
TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams
Conference · 2005 · OSTI ID: 930754
Method and system to discover and recommend interesting documents
Patent · 2017 · OSTI ID: 1341872