Search for: All records

Creators/Authors contains: "Shead, Timothy M."
  1. Toyplot integrates high-quality interactive, animated plotting capabilities into web browsers using the Python programming language.
  2. Abstract not provided.
  3. Abstract not provided.
  4. Abstract not provided.
  5. Abstract not provided.
  6. The ThreatView project is based on our prior work with the existing ParaView open-source scientific visualization application. Where ParaView provides a graphical client optimized for scientific visualization over the VTK parallel client-server architecture, ThreatView provides a client optimized for more generic visual analytics over the same architecture. Because ThreatView is based on the VTK parallel client-server architecture, data sources can reside on remote hosts, and processing and rendering can be performed in parallel. As seen in Fig. 1, ThreatView provides four main methods for visualizing data: Landscape View, which displays a graph using a landscape metaphor where clusters of graph nodes produce "hills" in the landscape; Graph View, which displays a graph using a traditional "ball-and-stick" style; Table View, which displays tabular data in a standard spreadsheet; and Attribute View, which displays a tabular "histogram" of input data - for a selected table column, the Attribute View displays each unique value within the column, and the number of times that value appears in the data. There are two supplemental view types: Text View, which displays tabular data one record at a time; and the Statistics View, which displays input metadata, such as the number of vertices and edges in a graph, the number of rows in a table, etc.
  7. Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems in areas including decision making, risk assessment, social network analysis, intelligence analysis, scholarly research and others. However, as data sizes continue to grow in these areas, scalable processing, modeling, and semantic analysis of text collections become essential. In this paper, we present the ParaText text analysis engine, a distributed memory software framework for processing, modeling, and analyzing collections of unstructured text documents. Results on several document collections using hundreds of processors are presented to illustrate the flexibility, extensibility, and scalability of the entire process of text modeling from raw data ingestion to application analysis.
  8. Automated processing, modeling, and analysis of unstructured text (news documents, web content, journal articles, etc.) is a key task in many data analysis and decision-making applications. As data sizes grow, scalability is essential for deep analysis. In many cases, documents are modeled as term or feature vectors and latent semantic analysis (LSA) is used to model latent, or hidden, relationships between documents and terms appearing in those documents. LSA supplies conceptual organization and analysis of document collections by modeling high-dimension feature vectors in many fewer dimensions. While past work on the scalability of LSA modeling has focused on the SVD, the goal of our work is to investigate the use of distributed memory architectures for the entire text analysis process, from data ingestion to semantic modeling and analysis. ParaText is a set of software components for distributed processing, modeling, and analysis of unstructured text. The ParaText source code is available under a BSD license, as an integral part of the Titan toolkit. ParaText components are chained together into data-parallel pipelines that are replicated across processes on distributed-memory architectures. Individual components can be replaced or rewired to explore different computational strategies and implement new functionality. ParaText functionality can be embedded in applications on any platform using the native C++ API, Python, or Java. The ParaText MPI Process provides a 'generic' text analysis pipeline in a command-line executable that can be used for many serial and parallel analysis tasks. ParaText can also be deployed as a web service accessible via a RESTful (HTTP) API. In the web service configuration, any client can access the functionality provided by ParaText using commodity protocols ... from standard web browsers to custom clients written in any language.
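The last abstract notes that documents are commonly modeled as term vectors before LSA is applied. As a rough illustration of that first step only, the sketch below builds term-count vectors in plain Python. It is not ParaText or Titan code: the function name `term_vectors`, the letters-only tokenizer, and the sample documents are all assumptions made for this example, and the SVD stage of LSA is not shown.

```python
import re
from collections import Counter


def term_vectors(documents):
    """Return (vocabulary, vectors): one term-count vector per document.

    Hypothetical helper for illustration; not part of any ParaText API.
    """
    # Tokenize: lowercase each document and keep runs of ASCII letters
    # (a simplifying assumption; real pipelines use richer tokenizers).
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]

    # Vocabulary: every distinct term in the collection, in sorted order,
    # so that each vector position refers to the same term in every document.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


# Made-up sample documents, used only to exercise the function.
docs = ["Parallel text analysis", "Text analysis of text collections"]
vocabulary, vectors = term_vectors(docs)
```

Stacking these vectors gives a document-by-term matrix; LSA would then approximate that matrix in far fewer dimensions, typically with a truncated SVD.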