Title: VisIRR: A Visual Analytics System for Information Retrieval and Recommendation for Large-Scale Document Data

Journal Article · ACM Transactions on Knowledge Discovery from Data
DOI: https://doi.org/10.1145/3070616 · OSTI ID:1426558
 [1];  [2];  [2];  [3];  [4];  [5];  [4];  [6];  [7];  [2];  [2]
  1. Korea University, Seoul (South Korea)
  2. Georgia Inst. of Technology, Atlanta, GA (United States)
  3. Adobe Research, Seattle, WA (United States)
  4. Google Inc., Mountain View, CA (United States)
  5. Oregon State University, Corvallis, OR (United States)
  6. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  7. Southwestern University, Georgetown, TX (United States)

In this paper, we present an interactive visual information retrieval and recommendation system, called VisIRR, for large-scale document discovery. VisIRR effectively combines the paradigms of (1) a passive pull through query processes for retrieval and (2) an active push that recommends items of potential interest to users based on their preferences. Equipped with an efficient dynamic query interface against a large-scale corpus, VisIRR organizes the retrieved documents into high-level topics and visualizes them in a 2D space, representing the relationships among the topics along with their keyword summary. In addition, based on interactive personalized preference feedback with regard to documents, VisIRR provides document recommendations from the entire corpus, which are beyond the retrieved sets. Such recommended documents are visualized in the same space as the retrieved documents, so that users can seamlessly analyze both existing and newly recommended ones. This article presents novel computational methods, which make these integrated representations and fast interactions possible for a large-scale document corpus. We illustrate how the system works by providing detailed usage scenarios. Finally, we present preliminary user study results for evaluating the effectiveness of the system.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1426558
Journal Information:
ACM Transactions on Knowledge Discovery from Data, Vol. 12, Issue 1; ISSN 1556-4681
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English

Cited By (1)

PaperPoles: Facilitating adaptive visual exploration of scientific publications by citation links journal February 2019

Similar Records

Document Retrieval and Ranking using Similarity Graph Mean Hitting Times
Technical Report · Tue Nov 30 23:00:00 EST 2021 · OSTI ID:1835671

Rapid Exploitation and Analysis of Documents
Technical Report · Thu Dec 01 23:00:00 EST 2011 · OSTI ID:1033748

Search tool plug-in: implements latent topic feedback
Software · Fri Sep 23 00:00:00 EDT 2011 · OSTI ID:1231509