DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Method and system of filtering and recommending documents

Abstract

Disclosed is a method and system for discovering documents using a computer and providing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The comparison is then filtered to reduce the number of documents, which define an initial subset of the source documents. Initial subset vectors are generated and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity value, the method then recommends one or more of the source documents.
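
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `top_terms`, `min_hits`, and `top_k` parameters, the regex tokenizer, and the use of cosine similarity are all assumptions made for illustration. The weighting assumes one commonly cited form of TF-ICF, log(1 + tf) * log((N + 1) / (n + 0.5)), where N is the size of a fixed reference corpus and n is the number of reference documents containing the term; the record itself does not state the formula.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_icf_vector(text, corpus_doc_freq, corpus_size):
    """Weight each term by log(1 + tf) * log((N + 1) / (n + 0.5)).

    corpus_doc_freq and corpus_size describe a fixed reference corpus,
    so no statistics from the source documents are needed (assumed form).
    """
    counts = Counter(tokenize(text))
    return {
        term: math.log(1 + tf)
        * math.log((corpus_size + 1) / (corpus_doc_freq.get(term, 0) + 0.5))
        for term, tf in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(seed_text, sources, corpus_doc_freq, corpus_size,
              top_terms=3, min_hits=1, top_k=2):
    """Return up to top_k (doc_id, similarity) pairs for the seed document."""
    seed_vec = tf_icf_vector(seed_text, corpus_doc_freq, corpus_size)

    # Keyword index over the source documents: term -> set of document ids.
    index = {}
    for doc_id, text in sources.items():
        for term in set(tokenize(text)):
            index.setdefault(term, set()).add(doc_id)

    # Compare the index with the highest-weighted seed terms.
    keywords = sorted(seed_vec, key=seed_vec.get, reverse=True)[:top_terms]
    hits = Counter()
    for term in keywords:
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1

    # Filter the matches to obtain the initial subset of source documents.
    subset = [doc_id for doc_id, n in hits.items() if n >= min_hits]

    # Vectorize only the subset and score each document against the seed.
    scored = [
        (doc_id, cosine(seed_vec, tf_icf_vector(sources[doc_id],
                                                corpus_doc_freq, corpus_size)))
        for doc_id in subset
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In this sketch the keyword lookup acts as a cheap pre-filter, so the more expensive vector comparison runs only on documents that share at least `min_hits` of the seed's top-weighted terms.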

Inventors:
Patton, Robert M.; Potok, Thomas E.
Issue Date:
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1237854
Patent Number(s):
9256649
Application Number:
13/920,803
Assignee:
UT-Battelle LLC (Oak Ridge, TN)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Patent
Resource Relation:
Patent File Date: 2013 Jun 18
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 99 GENERAL AND MISCELLANEOUS

Citation Formats

Patton, Robert M., and Potok, Thomas E. Method and system of filtering and recommending documents. United States: N. p., 2016. Web.
Patton, Robert M., & Potok, Thomas E. Method and system of filtering and recommending documents. United States.
Patton, Robert M., and Potok, Thomas E. 2016. "Method and system of filtering and recommending documents". United States. https://www.osti.gov/servlets/purl/1237854.
@article{osti_1237854,
title = {Method and system of filtering and recommending documents},
author = {Patton, Robert M. and Potok, Thomas E.},
abstractNote = {Disclosed is a method and system for discovering documents using a computer and providing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The comparison is then filtered to reduce the number of documents, which define an initial subset of the source documents. Initial subset vectors are generated and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity value, the method then recommends one or more of the source documents.},
place = {United States},
year = {2016},
month = {2}
}

Works referenced in this record:

TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams
conference, December 2006


An implementation of a knowledge recommendation system based on similarity among users' profiles
conference, January 2002


A vector space model for automatic indexing
journal, November 1975


Engene: A genetic algorithm classifier for content-based recommender systems that does not require continuous user feedback
conference, September 2010


A statistical interpretation of term specificity and its application in retrieval
journal, October 2004


Classification of clustered documents based on similarity scores
patent, September 2013