OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: ParaText : scalable text analysis and visualization.

Abstract

Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems in areas including decision making, risk assessment, social network analysis, intelligence analysis, scholarly research, and others. However, as data sizes continue to grow in these areas, scalable processing, modeling, and semantic analysis of text collections become essential. In this paper, we present the ParaText text analysis engine, a distributed memory software framework for processing, modeling, and analyzing collections of unstructured text documents. Results on several document collections using hundreds of processors are presented to illustrate the flexibility, extensibility, and scalability of the entire process of text modeling, from raw data ingestion to application analysis.

Authors:
Dunlavy, Daniel M; Stanton, Eric T; Shead, Timothy M
Publication Date:
2010-07-01
Research Org.:
Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1021689
Report Number(s):
SAND2010-4595C
TRN: US201117%%283
DOE Contract Number:  
AC04-94AL85000
Resource Type:
Conference
Resource Relation:
Conference: Proposed for presentation at the SIAM Annual Meeting held July 12-16, 2010 in Pittsburgh, PA.
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; BUSINESS; DECISION MAKING; INGESTION; NETWORK ANALYSIS; PROCESSING; RISK ASSESSMENT; SIMULATION

Citation Formats

Dunlavy, Daniel M, Stanton, Eric T, and Shead, Timothy M. ParaText : scalable text analysis and visualization. United States: N. p., 2010. Web.
Dunlavy, Daniel M, Stanton, Eric T, & Shead, Timothy M. ParaText : scalable text analysis and visualization. United States.
Dunlavy, Daniel M, Stanton, Eric T, and Shead, Timothy M. 2010. "ParaText : scalable text analysis and visualization." United States.
@article{osti_1021689,
title = {ParaText : scalable text analysis and visualization.},
author = {Dunlavy, Daniel M and Stanton, Eric T and Shead, Timothy M},
abstractNote = {Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems in areas including decision making, risk assessment, social network analysis, intelligence analysis, scholarly research, and others. However, as data sizes continue to grow in these areas, scalable processing, modeling, and semantic analysis of text collections become essential. In this paper, we present the ParaText text analysis engine, a distributed memory software framework for processing, modeling, and analyzing collections of unstructured text documents. Results on several document collections using hundreds of processors are presented to illustrate the flexibility, extensibility, and scalability of the entire process of text modeling, from raw data ingestion to application analysis.},
url = {https://www.osti.gov/biblio/1021689},
place = {United States},
year = {2010},
month = jul
}

