OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Method and system of filtering and recommending documents

Patent · OSTI ID: 1237854

Disclosed is a method and system for discovering documents using a computer and bringing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The results of the comparison are then filtered to reduce the number of documents, and the documents that remain define an initial subset of the source documents. A vector is generated for each document in the initial subset and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity values, the method then recommends one or more of the source documents.
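
The pipeline in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. The abstract does not give the weighting formula, the filter rule, or the similarity measure, so the sketch assumes the commonly cited TF-ICF form log(1 + tf) × log((N + 1) / (n + 1)) over a fixed reference corpus, a filter that keeps documents matching at least `min_hits` of the top-weighted seed terms, and cosine similarity. All function names and parameters (`tf_icf_vector`, `recommend`, `top_terms`, `min_hits`, `top_k`) are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphabetic runs only (a simplifying assumption).
    return re.findall(r"[a-z]+", text.lower())


def tf_icf_vector(text, corpus_doc_freq, corpus_size):
    # Assumed TF-ICF form: log(1 + tf) * log((N + 1) / (n + 1)), where N is
    # the size of a fixed reference corpus and n is the number of reference
    # documents containing the term.
    counts = Counter(tokenize(text))
    return {
        term: math.log(1 + tf)
        * math.log((corpus_size + 1) / (corpus_doc_freq.get(term, 0) + 1))
        for term, tf in counts.items()
    }


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_keyword_index(docs):
    # Map each term to the set of source document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in set(tokenize(text)):
            index[term].add(doc_id)
    return index


def recommend(seed_text, docs, corpus_doc_freq, corpus_size,
              top_terms=5, min_hits=2, top_k=3):
    # 1. Seed document vector with TF-ICF weights.
    seed_vec = tf_icf_vector(seed_text, corpus_doc_freq, corpus_size)
    # 2. Compare the highest-weighted seed terms against the keyword index.
    keywords = sorted(seed_vec, key=seed_vec.get, reverse=True)[:top_terms]
    index = build_keyword_index(docs)
    hits = Counter()
    for term in keywords:
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    # 3. Filter: documents with enough keyword matches form the initial subset.
    subset = [doc_id for doc_id, n in hits.items() if n >= min_hits]
    # 4. Vectorize the subset and score each document against the seed.
    scored = [
        (cosine(seed_vec, tf_icf_vector(docs[d], corpus_doc_freq, corpus_size)), d)
        for d in subset
    ]
    # 5. Recommend the most similar documents.
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]
```

Because the inverse corpus frequency is taken from a fixed reference corpus rather than from the source documents themselves, each document vector in this sketch can be computed independently of the others, and the keyword filter keeps the number of full vector comparisons small.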

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
Assignee:
UT-Battelle LLC (Oak Ridge, TN)
Patent Number(s):
9,256,649
Application Number:
13/920,803
OSTI ID:
1237854
Resource Relation:
Patent File Date: 2013 Jun 18
Country of Publication:
United States
Language:
English

References (7)

TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams · Conference · December 2006
An implementation of a knowledge recommendation system based on similarity among users' profiles · Conference · January 2002
A vector space model for automatic indexing · Journal · November 1975
Engene: A genetic algorithm classifier for content-based recommender systems that does not require continuous user feedback · Conference · September 2010
A statistical interpretation of term specificity and its application in retrieval · Journal · October 2004
Method and system for optimally searching a document database using a representative semantic space · Patent · January 2009
Classification of clustered documents based on similarity scores · Patent · September 2013

Similar Records

Automatic generation of stop word lists for information retrieval and analysis
Patent · 2013 Jan 08 · OSTI ID:1237854

VisIRR: A Visual Analytics System for Information Retrieval and Recommendation for Large-Scale Document Data
Journal Article · 2018 Jan 31 · ACM Transactions on Knowledge Discovery from Data · OSTI ID:1237854

TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams
Conference · 2006 Jan 01 · OSTI ID:1237854