
Title: Scalable Data Management, Analysis, and Visualization (SDAV) Institute

Abstract

As the scale of computation has exploded, the data produced by these simulations has increased in size, complexity, and richness by orders of magnitude, and this trend will continue. Users of scientific computing systems are faced with the daunting task of managing and analyzing their datasets for knowledge discovery, frequently using antiquated tools more appropriate for the teraflop era. While new techniques and tools are available that address these challenges, often application scientists are not aware of these tools, aren't familiar with the tools' use, or the tools are not installed at the appropriate facilities. SDAV deploys, and assists scientists in using, technical solutions addressing challenges in three areas:

1. Data Management – infrastructure that captures the data models used in science codes, efficiently moves, indexes, and compresses this data, enables query of scientific datasets, and provides the underpinnings of in situ data analysis
2. Data Analysis – application-driven, architecture-aware techniques for performing in situ data analysis, filtering, and reduction to optimize downstream I/O and prepare for in-depth post-processing analysis and visualization
3. Data Visualization – exploratory visualization techniques that support understanding ensembles of results, methods of quantifying uncertainty, and identifying and understanding features in multi-scale, multi-physics datasets

The team works directly with application scientists to assist them and in the process learns from the scientists where SDAV tools fall short. Technical solutions to any shortcomings are developed to ensure that our tools address and overcome mission-critical challenges in the scientific discovery process. State-of-the-art techniques in software development and quality assurance are applied so that the software developed and deployed meets the high standards needed to ensure the correctness and performance of science codes.
In addition to connecting with application teams, close ties to leading compute facilities are important for successful deployment and adoption of SDAV tools. The Institute includes facility partners from NERSC, ANL, and ORNL who are responsible for software installation at their respective sites. These partners will also inform SDAV team members of upcoming system architectures, guiding development of SDAV tools to ensure that they will be ready as new systems come online. In addition to one-on-one collaborations between SDAV and science teams, SDAV team members organize tutorials and workshops that aim to help inform the larger community about the tools the Institute makes available, train potential users, and provide opportunities to gather information from other researchers and potential customers. These activities are coordinated with leading conferences (e.g., ACM/IEEE Supercomputing) and DOE computing facility activities (e.g., the ALCF Getting Started Workshop series).

Authors:
Ma, Kwan-Liu [1]
  1. University of California, Davis
Publication Date:
Research Org.:
Univ. of California, Davis, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21). Scientific Discovery through Advanced Computing (SciDAC)
Contributing Org.:
University of California, Davis
OSTI Identifier:
1498620
Report Number(s):
SC0007443-001
5307526958
DOE Contract Number:  
SC0007443
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
Data analysis, data management, visualization, simulations

Citation Formats

Ma, Kwan-Liu. Scalable Data Management, Analysis, and Visualization (SDAV) Institute. United States: N. p., 2019. Web. doi:10.2172/1498620.
Ma, Kwan-Liu. Scalable Data Management, Analysis, and Visualization (SDAV) Institute. United States. doi:10.2172/1498620.
Ma, Kwan-Liu. Fri . "Scalable Data Management, Analysis, and Visualization (SDAV) Institute". United States. doi:10.2172/1498620. https://www.osti.gov/servlets/purl/1498620.
@article{osti_1498620,
title = {Scalable Data Management, Analysis, and Visualization (SDAV) Institute},
author = {Ma, Kwan-Liu},
abstractNote = {As the scale of computation has exploded, the data produced by these simulations has increased in size, complexity, and richness by orders of magnitude, and this trend will continue. Users of scientific computing systems are faced with the daunting task of managing and analyzing their datasets for knowledge discovery, frequently using antiquated tools more appropriate for the teraflop era. While new techniques and tools are available that address these challenges, often application scientists are not aware of these tools, aren't familiar with the tools' use, or the tools are not installed at the appropriate facilities. SDAV deploys, and assists scientists in using, technical solutions addressing challenges in three areas: 1. Data Management – infrastructure that captures the data models used in science codes, efficiently moves, indexes, and compresses this data, enables query of scientific datasets, and provides the underpinnings of in situ data analysis 2. Data Analysis – application-driven, architecture-aware techniques for performing in situ data analysis, filtering, and reduction to optimize downstream I/O and prepare for in-depth post-processing analysis and visualization 3. Data Visualization – exploratory visualization techniques that support understanding ensembles of results, methods of quantifying uncertainty, and identifying and understanding features in multi-scale, multi-physics datasets The team works directly with application scientists to assist them and in the process learns from the scientists where SDAV tools fall short. Technical solutions to any shortcomings are developed to ensure that our tools address and overcome mission-critical challenges in the scientific discovery process. State-of-the-art techniques in software development and quality assurance are applied so that the software developed and deployed meets the high standards needed to ensure the correctness and performance of science codes. 
In addition to connecting with application teams, close ties to leading compute facilities are important for successful deployment and adoption of SDAV tools. The Institute includes facility partners from NERSC, ANL, and ORNL who are responsible for software installation at their respective sites. These partners will also inform SDAV team members of upcoming system architectures, guiding development of SDAV tools to ensure that they will be ready as new systems come online. In addition to one-on-one collaborations between SDAV and science teams, SDAV team members organize tutorials and workshops that aim to help inform the larger community about the tools the Institute makes available, train potential users, and provide opportunities to gather information from other researchers and potential customers. These activities are coordinated with leading conferences (e.g., ACM/IEEE Supercomputing) and DOE computing facility activities (e.g., the ALCF Getting Started Workshop series).},
doi = {10.2172/1498620},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2019},
month = {3}
}