Rapid Exploitation and Analysis of Documents

Buttler, David J.; Andrzejewski, David; Stevens, Keith D.; Anastasiu, David; Gao, Byron

doi:10.2172/1033748

Technical Report · Thu Dec 01 19:00:00 EST 2011

DOI: https://doi.org/10.2172/1033748 · OSTI ID: 1033748

Buttler, David J. ^[1]; Andrzejewski, David ^[1]; Stevens, Keith D. ^[1]; Anastasiu, David ^[1]; Gao, Byron ^[1]

Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)

Analysts are overwhelmed with information. They have large archives of historical data, both structured and unstructured, and continuous streams of relevant messages and documents that they need to match to current tasks, digest, and incorporate into their analysis. The purpose of the READ project is to develop technologies that make it easier to catalog, classify, and locate relevant information. We approach this task from multiple angles. First, we tackle the issue of processing large quantities of information in a reasonable time. Second, we provide mechanisms that allow users to customize their queries based on latent topics exposed from corpus statistics. Third, we assist users in organizing query results, adding localized expert structure over results. Fourth, we use word sense disambiguation techniques to increase the precision of matching user-generated keyword lists with terms and concepts in the corpus. Fifth, we enhance co-occurrence statistics with latent topic attribution to aid entity relationship discovery. Finally, we quantitatively analyze the quality of three popular latent modeling techniques to examine the circumstances under which each is useful.

Research Organization: Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)

Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program

DOE Contract Number: W-7405-ENG-48; AC52-07NA27344

OSTI ID: 1033748

Report Number(s): LLNL--TR-517731

Country of Publication: United States

Language: English

Similar Records

VisIRR: A Visual Analytics System for Information Retrieval and Recommendation for Large-Scale Document Data

Journal Article · Tue Jan 30 23:00:00 EST 2018 · ACM Transactions on Knowledge Discovery from Data · OSTI ID:1426558

Search tool plug-in: implements latent topic feedback

Software · Fri Sep 23 00:00:00 EDT 2011 · OSTI ID:1231509

Search tool plug-in: implements latent topic feedback

Software · Wed Sep 21 20:00:00 EDT 2011 · OSTI ID:code-1941

Related Subjects

97 MATHEMATICS AND COMPUTING
ACCURACY
ORGANIZING
PROCESSING
SIMULATION
STATISTICS
