OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: High Performance Multivariate Visual Data Exploration for Extremely Large Data

Abstract

One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.

Authors:
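Rubel, Oliver; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes; Prabhat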
Publication Date:
2008-08-22
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
Accelerator & Fusion Research Division; Computational Research Division
OSTI Identifier:
941717
Report Number(s):
LBNL-716E
TRN: US0807486
DOE Contract Number:
DE-AC02-05CH11231
Resource Type:
Conference
Resource Relation:
Conference: SuperComputing 2008, Austin, Texas, USA, Nov. 15-21, 2008
Country of Publication:
United States
Language:
English
Subject:
43; 97; ACCELERATORS; DATA ANALYSIS; EXPLORATION; LASERS; MANAGEMENT; MINING; PERFORMANCE; SIMULATION; WAKEFIELD ACCELERATORS; data mining, visual data analysis, accelerator modeling, parallel visualization, large data visualization, temporal visualization, temporal data analysis

Citation Formats

Rubel, Oliver, Wu, Kesheng, Childs, Hank, Meredith, Jeremy, Geddes, Cameron G.R., Cormier-Michel, Estelle, Ahern, Sean, Weber, Gunther H., Messmer, Peter, Hagen, Hans, Hamann, Bernd, Bethel, E. Wes, and Prabhat. High Performance Multivariate Visual Data Exploration for Extremely Large Data. United States: N. p., 2008. Web.
Rubel, Oliver, Wu, Kesheng, Childs, Hank, Meredith, Jeremy, Geddes, Cameron G.R., Cormier-Michel, Estelle, Ahern, Sean, Weber, Gunther H., Messmer, Peter, Hagen, Hans, Hamann, Bernd, Bethel, E. Wes, & Prabhat. High Performance Multivariate Visual Data Exploration for Extremely Large Data. United States.
Rubel, Oliver, Wu, Kesheng, Childs, Hank, Meredith, Jeremy, Geddes, Cameron G.R., Cormier-Michel, Estelle, Ahern, Sean, Weber, Gunther H., Messmer, Peter, Hagen, Hans, Hamann, Bernd, Bethel, E. Wes, and Prabhat. 2008. "High Performance Multivariate Visual Data Exploration for Extremely Large Data". United States. https://www.osti.gov/servlets/purl/941717.
@inproceedings{osti_941717,
booktitle = {SuperComputing 2008, Austin, Texas, USA, Nov. 15-21, 2008},
title = {High Performance Multivariate Visual Data Exploration for Extremely Large Data},
author = {Rubel, Oliver and Wu, Kesheng and Childs, Hank and Meredith, Jeremy and Geddes, Cameron G.R. and Cormier-Michel, Estelle and Ahern, Sean and Weber, Gunther H. and Messmer, Peter and Hagen, Hans and Hamann, Bernd and Bethel, E. Wes and {Prabhat}},
abstractNote = {One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.},
url = {https://www.osti.gov/servlets/purl/941717},
place = {United States},
year = {2008},
month = aug
}


  • Our work combines and extends techniques from high-performance scientific data management and visualization to enable scientific researchers to gain insight from extremely large, complex, time-varying laser wakefield particle accelerator simulation data. We extend histogram-based parallel coordinates for use in visual information display as well as an interface for guiding and performing data mining operations, which are based upon multi-dimensional and temporal thresholding and data subsetting operations. To achieve very high performance on parallel computing platforms, we leverage FastBit, a state-of-the-art index/query technology, to accelerate data mining and multi-dimensional histogram computation. We show how these techniques are used in practice by scientific researchers to identify, visualize and analyze a particle beam in a large, time-varying dataset.
  • Data visualization, as well as data analysis and data analytics, are all an integral part of the scientific process. Collectively, these technologies provide the means to gain insight into data of ever-increasing size and complexity. Over the past two decades, a substantial amount of visualization, analysis, and analytics R&D has focused on the challenges posed by increasing data size and complexity, as well as on the increasing complexity of a rapidly changing computational platform landscape. While some of this research focuses solely on technologies, such as indexing and searching or novel analysis or visualization algorithms, other R&D projects focus on applying technological advances to specific application problems. Some of the most interesting and productive results occur when these two activities, R&D and application, are conducted in a collaborative fashion, where application needs drive R&D, and R&D results are immediately applicable to real world problems.
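
The abstracts above mention histogram-based parallel coordinates combined with multi-dimensional thresholding. The following Python sketch illustrates the general idea only and is not the authors' implementation: a plain linear scan stands in for the index/query technology, and all names (`bin_index`, `select`, `pairwise_histograms`, the particle fields `x`, `px`, `y`) and the sample values are hypothetical.

```python
from collections import Counter


def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to a bin index in [0, bins - 1]."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * bins)
    return min(max(i, 0), bins - 1)  # clamp so value == hi lands in the last bin


def select(records, thresholds):
    """Keep records that satisfy every (lo, hi) range condition.

    Assumption: a linear scan replaces the indexed query used in the paper.
    """
    return [
        r for r in records
        if all(lo <= r[k] <= hi for k, (lo, hi) in thresholds.items())
    ]


def pairwise_histograms(records, axes, ranges, bins):
    """Build one 2D histogram for each pair of adjacent parallel axes."""
    hists = []
    for a, b in zip(axes, axes[1:]):
        h = Counter()
        for r in records:
            key = (bin_index(r[a], *ranges[a], bins),
                   bin_index(r[b], *ranges[b], bins))
            h[key] += 1
        hists.append(h)
    return hists


# Hypothetical particle records; the values are made up for illustration.
particles = [
    {"x": 0.0, "px": 1.0, "y": 0.5},
    {"x": 1.0, "px": 9.0, "y": 0.1},
    {"x": 2.0, "px": 8.5, "y": 0.9},
    {"x": 3.0, "px": 2.0, "y": 0.4},
]
axes = ["x", "px", "y"]
ranges = {"x": (0.0, 4.0), "px": (0.0, 10.0), "y": (0.0, 1.0)}

# Context view: histograms over all particles.
hists_all = pairwise_histograms(particles, axes, ranges, bins=2)

# Focus view: threshold on momentum, then histogram only the selected subset.
beam = select(particles, {"px": (8.0, 10.0)})
hists_beam = pairwise_histograms(beam, axes, ranges, bins=2)
```

Drawing each non-empty bin as a band between two adjacent axes, with opacity proportional to its count, keeps the rendering cost tied to the number of bins rather than the number of particles, which is the motivation for binning before display.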