OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: ASCR Workshop on In Situ Data Management: Enabling Scientific Discovery from Diverse Data Sources

Abstract

In January 2019, the U.S. Department of Energy, Office of Science program in Advanced Scientific Computing Research, convened a workshop to identify priority research directions for in situ data management (ISDM). The workshop defined ISDM as the practices, capabilities, and procedures to control the organization of data and enable the coordination and communication among heterogeneous tasks, executing simultaneously in a high-performance computing system, cooperating toward a common objective. The workshop revealed two primary, interdependent motivations for processing and managing data in situ. The first motivation is that the in situ methodology enables scientific discovery from a broad range of data sources over a wide scale of computing platforms: leadership-class systems, clusters, clouds, workstations, and embedded devices at the edge. The successful development of ISDM capabilities will benefit real-time decision-making, design optimization, and data-driven scientific discovery. The second motivation is the need to decrease data volumes. ISDM can make critical contributions to managing large data volumes from computations and experiments to minimize data movement, save storage space, and boost resource efficiency, often while simultaneously increasing scientific precision. A fundamental finding of this workshop is that the methodologies used to manage data among a variety of tasks in situ can be used to facilitate scientific discovery from many different data sources (simulation, experiment, and sensors, for example) and that being able to do so at numerous computing scales will benefit real-time decision-making, design optimization, and data-driven scientific discovery across the Office of Science mission space.
Applications wanting to use the in situ capabilities include those where data analysis feeds back to the simulation, decisions are made autonomously, big data or machine learning is among the tasks to be coordinated, and computations need to be completed in real time. The workshop identified six priority research directions that highlight the components and capabilities needed for ISDM to be successful for the wide variety of applications discussed: making ISDM capabilities more pervasive, controllable, composable, and transparent, with a focus on greater coordination with the software stack and a diversity of fundamentally new data algorithms.

Authors:
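Peterka, Tom [1]; Bard, Deborah [2]; Bennett, Janine [3]; Bethel, E. Wes [4]; Oldfield, Ron [3]; Pouchard, Line [5]; Sweeney, Christine [6]; Wolf, Matthew [7]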
  1. Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
  3. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  4. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  5. Brookhaven National Lab. (BNL), Upton, NY (United States)
  6. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  7. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
February 2019
Research Org.:
USDOE Office of Science (SC) (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1493245
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English

Citation Formats

Peterka, Tom, Bard, Deborah, Bennett, Janine, Bethel, E. Wes, Oldfield, Ron, Pouchard, Line, Sweeney, Christine, and Wolf, Matthew. ASCR Workshop on In Situ Data Management: Enabling Scientific Discovery from Diverse Data Sources. United States: N. p., 2019. Web. doi:10.2172/1493245.
Peterka, Tom, Bard, Deborah, Bennett, Janine, Bethel, E. Wes, Oldfield, Ron, Pouchard, Line, Sweeney, Christine, & Wolf, Matthew. (2019). ASCR Workshop on In Situ Data Management: Enabling Scientific Discovery from Diverse Data Sources. United States. doi:10.2172/1493245.
Peterka, Tom, Bard, Deborah, Bennett, Janine, Bethel, E. Wes, Oldfield, Ron, Pouchard, Line, Sweeney, Christine, and Wolf, Matthew. 2019. "ASCR Workshop on In Situ Data Management: Enabling Scientific Discovery from Diverse Data Sources". United States. doi:10.2172/1493245. https://www.osti.gov/servlets/purl/1493245.
@article{osti_1493245,
title = {ASCR Workshop on In Situ Data Management: Enabling Scientific Discovery from Diverse Data Sources},
author = {Peterka, Tom and Bard, Deborah and Bennett, Janine and Bethel, E. Wes and Oldfield, Ron and Pouchard, Line and Sweeney, Christine and Wolf, Matthew},
abstractNote = {In January 2019, the U.S. Department of Energy, Office of Science program in Advanced Scientific Computing Research, convened a workshop to identify priority research directions for in situ data management (ISDM). The workshop defined ISDM as the practices, capabilities, and procedures to control the organization of data and enable the coordination and communication among heterogeneous tasks, executing simultaneously in a high-performance computing system, cooperating toward a common objective. The workshop revealed two primary, interdependent motivations for processing and managing data in situ. The first motivation is that the in situ methodology enables scientific discovery from a broad range of data sources over a wide scale of computing platforms: leadership-class systems, clusters, clouds, workstations, and embedded devices at the edge. The successful development of ISDM capabilities will benefit real-time decision-making, design optimization, and data-driven scientific discovery. The second motivation is the need to decrease data volumes. ISDM can make critical contributions to managing large data volumes from computations and experiments to minimize data movement, save storage space, and boost resource efficiency, often while simultaneously increasing scientific precision. A fundamental finding of this workshop is that the methodologies used to manage data among a variety of tasks in situ can be used to facilitate scientific discovery from many different data sources—simulation, experiment, and sensors, for example—and that being able to do so at numerous computing scales will benefit real-time decision-making, design optimization, and data-driven scientific discovery across the Office of Science mission space. 
Applications wanting to use the in situ capabilities include those where data analysis feeds back to the simulation, decisions are made autonomously, big data or machine learning is among the tasks to be coordinated, and computations need to be completed in real time. The workshop identified six priority research directions that highlight the components and capabilities needed for ISDM to be successful for the wide variety of applications discussed: making ISDM capabilities more pervasive, controllable, composable, and transparent, with a focus on greater coordination with the software stack and a diversity of fundamentally new data algorithms.},
doi = {10.2172/1493245},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2019},
month = {2}
}