
Title: Boosting Big National Lab Data

Journal Article · Datanami
OSTI ID:1072861
 [1]
  1. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Introduction: Big data. Love it or hate it, solving the world’s most intractable problems requires the ability to make sense of huge and complex sets of data, and to do it quickly. Speeding up the process, from hours to minutes or from weeks to days, is key to our success. One major source of such big data is physical experiments. These experiments are commonly used to solve challenges in fields such as energy security, manufacturing, medicine, pharmacology, environmental protection and national security. Experiments use different instruments and sensor types to research, for example, the validity of new drugs, the root causes of diseases, more efficient energy sources, new materials for everyday goods, effective methods for environmental cleanup, the optimal ingredient composition for chocolate, or how to preserve valuable antiques. This is done by experimentally determining the structure, properties and processes that govern biological systems, chemical processes and materials. The speed and quality with which we can acquire new insights from experiments directly influence the rate of scientific progress, industrial innovation and competitiveness. And gaining new groundbreaking insights, faster, is key to the economic success of our nations.

Recent years have seen incredible advances in sensor technologies, from house-sized detector systems in large experiments such as the Large Hadron Collider and the ‘Eye of Gaia’ billion-pixel camera detector to high-throughput genome sequencing. These developments have led to an exponential increase in the volume, rate and variety of data produced by instruments used for experimental work. This increase coincides with a need to analyze experimental results at the time they are collected. This speed is required to optimize data taking and data quality, and also to enable new adaptive experiments, in which the sample is manipulated as it is observed. For example, a substance is injected into a tissue sample and the gradual effect is observed as more of the substance is injected, providing better insights into the natural processes that are occurring. Analysis at collection time also enables result-driven sampling adjustments that capture particularly interesting features as they emerge.

The Department of Energy’s Pacific Northwest National Laboratory (PNNL) is recognized for its expertise in the development of new measurement techniques and their application to challenges of national importance, so it was obvious to us that we should address the need for in-situ analysis of large-scale experimental data. We have a wide range of experimental instruments on site, in facilities such as DOE’s national scientific user facility, the William R. Wiley Environmental Molecular Sciences Laboratory (EMSL). Commonly, scientists would create an individual analysis pipeline for each of those instruments, and even instruments of the same type would not necessarily share the same analysis tools. With the rapid increase in data volumes and rates, we were facing two key challenges: how to bring a wider set of capabilities to bear to achieve in-situ analysis, and how to do so across a wide range of heterogeneous instruments at affordable cost and in a reasonable timeframe. We decided to take an unconventional approach to the problem: rather than developing customized, one-off solutions for specific instruments, we wanted to explore whether a more common solution could be found that would go beyond shared, basic infrastructure such as data movement and workflow engines.

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
Report Number(s):
PNNL-SA-93842
Journal Information:
Datanami, Related Information: http://www.datanami.com/2013/02/21/boosting_big_national_lab_data/
Country of Publication:
United States
Language:
English
