Title: Scalable Visual Analytics of Massive Textual Datasets

Abstract

This paper describes the first scalable implementation of a text processing engine used in visual analytics tools. These tools help information analysts interact with and understand large volumes of textual information through visual interfaces. By developing a parallel implementation of the text processing engine, we enabled visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes the key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte datasets such as PubMed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.

Authors:
Krishnan, Manoj Kumar; Bohn, Shawn J.; Cowley, Wendy E.; Crow, Vernon L.; Nieplocha, Jarek
Publication Date:
2007-04-01
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
908953
Report Number(s):
PNNL-SA-52302
TRN: US200722%%830
DOE Contract Number:  
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: IPDPS 2007. IEEE International Parallel and Distributed Processing Symposium, 26-30 March 2007, Long Beach, CA, USA, 10 pages
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; INFORMATION SYSTEMS; DATA PROCESSING; DOCUMENT TYPES; PARALLEL PROCESSING; Visual Analytics; parallel processing

Citation Formats

Krishnan, Manoj Kumar, Bohn, Shawn J., Cowley, Wendy E., Crow, Vernon L., and Nieplocha, Jarek. Scalable Visual Analytics of Massive Textual Datasets. United States: N. p., 2007. Web. doi:10.1109/IPDPS.2007.370232.
Krishnan, Manoj Kumar, Bohn, Shawn J., Cowley, Wendy E., Crow, Vernon L., & Nieplocha, Jarek. Scalable Visual Analytics of Massive Textual Datasets. United States. doi:10.1109/IPDPS.2007.370232.
Krishnan, Manoj Kumar, Bohn, Shawn J., Cowley, Wendy E., Crow, Vernon L., and Nieplocha, Jarek. 2007. "Scalable Visual Analytics of Massive Textual Datasets". United States. doi:10.1109/IPDPS.2007.370232.
@article{osti_908953,
title = {Scalable Visual Analytics of Massive Textual Datasets},
author = {Krishnan, Manoj Kumar and Bohn, Shawn J. and Cowley, Wendy E. and Crow, Vernon L. and Nieplocha, Jarek},
abstractNote = {This paper describes the first scalable implementation of a text processing engine used in visual analytics tools. These tools help information analysts interact with and understand large volumes of textual information through visual interfaces. By developing a parallel implementation of the text processing engine, we enabled visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes the key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte datasets such as PubMed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.},
doi = {10.1109/IPDPS.2007.370232},
place = {United States},
year = {2007},
month = {4}
}
