skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Data and Communications in Basic Energy Sciences: Creating a Pathway for Scientific Discovery

Program Document ·
DOI:https://doi.org/10.2172/1291136· OSTI ID:1291136
 [1];  [2]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

This report is based on the Department of Energy (DOE) Workshop on “Data and Communications in Basic Energy Sciences: Creating a Pathway for Scientific Discovery” that was held at the Bethesda Marriott in Maryland on October 24-25, 2011. The workshop brought together leading researchers from the Basic Energy Sciences (BES) facilities and Advanced Scientific Computing Research (ASCR). The workshop was co-sponsored by these two Offices to identify opportunities and needs for data analysis, ownership, storage, mining, provenance and data transfer at light sources, neutron sources, microscopy centers and other facilities. Their charge was to identify current and anticipated issues in the acquisition, analysis, communication and storage of experimental data that could impact the progress of scientific discovery, ascertain what knowledge, methods and tools are needed to mitigate present and projected shortcomings and to create the foundation for information exchanges and collaboration between ASCR and BES supported researchers and facilities. The workshop was organized in the context of the impending data tsunami that will be produced by DOE’s BES facilities. Current facilities, like SLAC National Accelerator Laboratory’s Linac Coherent Light Source, can produce up to 18 terabytes (TB) per day, while upgraded detectors at Lawrence Berkeley National Laboratory’s Advanced Light Source will generate ~10TB per hour. The expectation is that these rates will increase by over an order of magnitude in the coming decade. The urgency to develop new strategies and methods in order to stay ahead of this deluge and extract the most science from these facilities was recognized by all. The four focus areas addressed in this workshop were: Workflow Management - Experiment to Science: Identifying and managing the data path from experiment to publication. Theory and Algorithms: Recognizing the need for new tools for computation at scale, supporting large data sets and realistic theoretical models. Visualization and Analysis: Supporting near-real-time feedback for experiment optimization and new ways to extract and communicate critical information from large data sets. Data Processing and Management: Outlining needs in computational and communication approaches and infrastructure needed to handle unprecedented data volume and information content. It should be noted that almost all participants recognized that there were unlikely to be any turn-key solutions available due to the unique, diverse nature of the BES community, where research at adjacent beamlines at a given light source facility often span everything from biology to materials science to chemistry using scattering, imaging and/or spectroscopy. However, it was also noted that advances supported by other programs in data research, methodologies, and tool development could be implemented on reasonable time scales with modest effort. Adapting available standard file formats, robust workflows, and in-situ analysis tools for user facility needs could pay long-term dividends. Workshop participants assessed current requirements as well as future challenges and made the following recommendations in order to achieve the ultimate goal of enabling transformative science in current and future BES facilities: Theory and analysis components should be integrated seamlessly within experimental workflow. Develop new algorithms for data analysis based on common data formats and toolsets. Move analysis closer to experiment. Move the analysis closer to the experiment to enable real-time (in-situ) streaming capabilities, live visualization of the experiment and an increase of the overall experimental efficiency. Match data management access and capabilities with advancements in detectors and sources. Remove bottlenecks, provide interoperability across different facilities/beamlines and apply forefront mathematical techniques to more efficiently extract science from the experiments. This workshop report examines and reviews the status of several BES facilities and highlights the successes and shortcomings of the current data and communication pathways for scientific discovery. It then ascertains what methods and tools are needed to mitigate present and projected data bottlenecks to science over the next 10 years. The goal of this report is to create the foundation for information exchanges and collaborations among ASCR and BES supported researchers, the BES scientific user facilities, and ASCR computing and networking facilities. To jumpstart these activities, there was a strong desire to see a joint effort between ASCR and BES along the lines of the highly successful Scientific Discovery through Advanced Computing (SciDAC) program in which integrated teams of engineers, scientists and computer scientists were engaged to tackle a complete end-to-end workflow solution at one or more beamlines, to ascertain what challenges will need to be addressed in order to handle future increases in data

Research Organization:
USDOE Office of Science (SC) (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
OSTI ID:
1291136
Country of Publication:
United States
Language:
English