skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Measuring the Interestingness of Articles in a Limited User Environment

Thesis/Dissertation ·
DOI:https://doi.org/10.2172/945553· OSTI ID:945553
 [1]
  1. Univ. of California, Los Angeles, CA (United States)

Search engines, such as Google, assign scores to news articles based on their relevancy to a query. However, not all relevant articles for the query may be interesting to a user. For example, if the article is old or yields little new information, the article would be uninteresting. Relevancy scores do not take into account what makes an article interesting, which varies from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, in a limited user environment, there are not enough users that would make collaborative filtering effective. A general framework, called iScore, is presented for defining and measuring the 'interestingness' of articles, incorporating user-feedback. iScore addresses various aspects of what makes an article interesting, such as topic relevancy, uniqueness, freshness, source reputation, and writing style. It employs various methods to measure these features and uses a classifier operating on these features to recommend articles. The basic iScore configuration is shown to improve recommendation results by as much as 20%. In addition to the basic iScore features, additional features are presented to address the deficiencies of existing feature extractors, such as one that tracks multiple topics, called MTT, and a version of the Rocchio algorithm that learns its parameters online as it processes documents, called eRocchio. The inclusion of both MTT and eRocchio into iScore is shown to improve iScore recommendation results by as much as 3.1% and 5.6%, respectively. Additionally, in TREC11 Adaptive Filter Task, eRocchio is shown to be 10% better than the best filter in the last run of the task. In addition to these two major topic relevancy measures, other features are also introduced that employ language models, phrases, clustering, and changes in topics to improve recommendation results. These additional features are shown to improve recommendation results by iScore by up to 14%. Due to varying reasons that users hold regarding why an article is interesting, an online feature selection method in naive Bayes is also introduced. Online feature selection can improve recommendation results in iScore by up to 18.9%. In summary, iScore in its best configuration can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated in the news recommendation task using three datasets from Yahoo! News, actual users, and Digg. iScore and its components are also evaluated in the TREC Adaptive Filter task using the Reuters RCV1 corpus.

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48; AC52-07NA27344
OSTI ID:
945553
Report Number(s):
LLNL-TH-407797; TRN: US200903%%641
Country of Publication:
United States
Language:
English

Similar Records

Measuring the Interestingness of Articles in a Limited User Environment
Journal Article · Sat Jan 01 00:00:00 EST 2011 · Information Processing and Management · OSTI ID:945553

Measuring the Interestingness of Articles in a Limited User Environment Prospectus
Thesis/Dissertation · Wed Apr 18 00:00:00 EDT 2007 · OSTI ID:945553

Tracking Multiple Topics for Finding Interesting Articles
Conference · Thu Feb 15 00:00:00 EST 2007 · OSTI ID:945553