ParaText : scalable text analysis and visualization.

Dunlavy, Daniel M; Stanton, Eric T; Shead, Timothy M

Conference · July 1, 2010

OSTI ID:1021689

Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems in areas including decision making, risk assessment, social network analysis, intelligence analysis, and scholarly research. However, as data sizes continue to grow in these areas, scalable processing, modeling, and semantic analysis of text collections become essential. In this paper, we present the ParaText text analysis engine, a distributed-memory software framework for processing, modeling, and analyzing collections of unstructured text documents. Results on several document collections using hundreds of processors are presented to illustrate the flexibility, extensibility, and scalability of the entire process of text modeling, from raw data ingestion to application analysis.

Research Organization: Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC04-94AL85000

OSTI ID: 1021689

Report Number(s): SAND2010-4595C; TRN: US201117%%283

Resource Relation: Conference: Proposed for presentation at the SIAM Annual Meeting held July 12-16, 2010 in Pittsburgh, PA.

Country of Publication: United States

Language: English

Similar Records

ParaText : scalable text modeling and analysis.

Conference · June 1, 2010 · OSTI ID:1021689

Dunlavy, Daniel M; Stanton, Eric T; Shead, Timothy M

TexTonic: Interactive Visualization for Exploration and discovery of Very Large Text Collections

Journal Article · July 1, 2019 · Information Visualization · OSTI ID:1021689

Paul, Celeste; Chang, Jessica; Endert, Alexander; +6 more

ParaText : scalable solutions for processing and searching very large document collections : final LDRD report.

Technical Report · September 1, 2010 · OSTI ID:1021689

Crossno, Patricia Joyce; Dunlavy, Daniel M; Stanton, Eric T; +1 more

Related Subjects

99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
BUSINESS
DECISION MAKING
INGESTION
NETWORK ANALYSIS
PROCESSING
RISK ASSESSMENT
SIMULATION
