
Title: Real-time data-intensive computing

Abstract

Today users visit synchrotrons as sources of understanding and discovery—not as sources of just light, and not as sources of data. To achieve this, the synchrotron facilities frequently provide not just light but often the entire end station and increasingly, advanced computational facilities that can reduce terabytes of data into a form that can reveal a new key insight. The Advanced Light Source (ALS) has partnered with high performance computing, fast networking, and applied mathematics groups to create a “super-facility”, giving users simultaneous access to the experimental, computational, and algorithmic resources to make this possible. This combination forms an efficient closed loop, where data—despite its high rate and volume—is transferred and processed immediately and automatically on appropriate computing resources, and results are extracted, visualized, and presented to users or to the experimental control system, both to provide immediate insight and to guide decisions about subsequent experiments during beamtime. We will describe our work at the ALS ptychography, scattering, micro-diffraction, and micro-tomography beamlines.

Authors:
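Parkinson, Dilworth Y. (E-mail: dyparkinson@lbl.gov); Chen, Xian; Hexemer, Alexander; MacDowell, Alastair A.; Padmore, Howard A.; Shapiro, David; Tamura, Nobumichi [1]; Beattie, Keith; Krishnan, Harinarayan; Patton, Simon J.; Perciano, Talita; Stromsness, Rune; Tull, Craig E.; Ushizima, Daniela [2]; Correa, Joaquin; Deslippe, Jack R. [3]; Dart, Eli; Tierney, Brian L. [4]; Daurer, Benedikt J.; Maia, Filipe R. N. C. [5]; and others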
  1. Advanced Light Source, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States)
  2. Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States)
  3. National Energy Research Scientific Computing Center, Berkeley, CA 94720 (United States)
  4. Energy Sciences Network, Berkeley, CA 94720 (United States)
  5. Uppsala University, Uppsala (Sweden)
Publication Date:
2016-07
OSTI Identifier:
22608429
Resource Type:
Journal Article
Resource Relation:
Journal Name: AIP Conference Proceedings; Journal Volume: 1741; Journal Issue: 1; Conference: SRI2015: 12th International Conference on Synchrotron Radiation Instrumentation, New York, NY (United States), 6-10 Jul 2015; Other Information: (c) 2016 Author(s); Country of input: International Atomic Energy Agency (IAEA)
Country of Publication:
United States
Language:
English
Subject:
43 PARTICLE ACCELERATORS; ADVANCED LIGHT SOURCE; CONTROL SYSTEMS; DATA PROCESSING; DIFFRACTION; PERFORMANCE; RESOURCES; SYNCHROTRONS; TOMOGRAPHY

Citation Formats

Parkinson, Dilworth Y., Chen, Xian, Hexemer, Alexander, MacDowell, Alastair A., Padmore, Howard A., Shapiro, David, Tamura, Nobumichi, Beattie, Keith, Krishnan, Harinarayan, Patton, Simon J., Perciano, Talita, Stromsness, Rune, Tull, Craig E., Ushizima, Daniela, Correa, Joaquin, Deslippe, Jack R., Dart, Eli, Tierney, Brian L., Daurer, Benedikt J., Maia, Filipe R. N. C., and others. Real-time data-intensive computing. United States: N. p., 2016. Web. doi:10.1063/1.4952921.
Parkinson, Dilworth Y., Chen, Xian, Hexemer, Alexander, MacDowell, Alastair A., Padmore, Howard A., Shapiro, David, Tamura, Nobumichi, Beattie, Keith, Krishnan, Harinarayan, Patton, Simon J., Perciano, Talita, Stromsness, Rune, Tull, Craig E., Ushizima, Daniela, Correa, Joaquin, Deslippe, Jack R., Dart, Eli, Tierney, Brian L., Daurer, Benedikt J., Maia, Filipe R. N. C., & others. Real-time data-intensive computing. United States. doi:10.1063/1.4952921.
Parkinson, Dilworth Y., Chen, Xian, Hexemer, Alexander, MacDowell, Alastair A., Padmore, Howard A., Shapiro, David, Tamura, Nobumichi, Beattie, Keith, Krishnan, Harinarayan, Patton, Simon J., Perciano, Talita, Stromsness, Rune, Tull, Craig E., Ushizima, Daniela, Correa, Joaquin, Deslippe, Jack R., Dart, Eli, Tierney, Brian L., Daurer, Benedikt J., Maia, Filipe R. N. C., and others. 2016. "Real-time data-intensive computing". United States. doi:10.1063/1.4952921.
@article{osti_22608429,
title = {Real-time data-intensive computing},
author = {Parkinson, Dilworth Y. and Chen, Xian and Hexemer, Alexander and MacDowell, Alastair A. and Padmore, Howard A. and Shapiro, David and Tamura, Nobumichi and Beattie, Keith and Krishnan, Harinarayan and Patton, Simon J. and Perciano, Talita and Stromsness, Rune and Tull, Craig E. and Ushizima, Daniela and Correa, Joaquin and Deslippe, Jack R. and Dart, Eli and Tierney, Brian L. and Daurer, Benedikt J. and Maia, Filipe R. N. C. and others},
note = {Corresponding author e-mail: dyparkinson@lbl.gov},
abstractNote = {Today users visit synchrotrons as sources of understanding and discovery—not as sources of just light, and not as sources of data. To achieve this, the synchrotron facilities frequently provide not just light but often the entire end station and increasingly, advanced computational facilities that can reduce terabytes of data into a form that can reveal a new key insight. The Advanced Light Source (ALS) has partnered with high performance computing, fast networking, and applied mathematics groups to create a “super-facility”, giving users simultaneous access to the experimental, computational, and algorithmic resources to make this possible. This combination forms an efficient closed loop, where data—despite its high rate and volume—is transferred and processed immediately and automatically on appropriate computing resources, and results are extracted, visualized, and presented to users or to the experimental control system, both to provide immediate insight and to guide decisions about subsequent experiments during beamtime. We will describe our work at the ALS ptychography, scattering, micro-diffraction, and micro-tomography beamlines.},
doi = {10.1063/1.4952921},
journal = {AIP Conference Proceedings},
number = 1,
volume = 1741,
place = {United States},
year = 2016,
month = 7
}
  • Finding a different way is the goal of the Data-Intensive Computing for Complex Biological Systems (Biopilot) project, a joint research effort between the Pacific Northwest National Laboratory (PNNL) and Oak Ridge National Laboratory funded by the U.S. Department of Energy’s Office of Advanced Scientific Computing Research. The two national laboratories, both of which are world leaders in computing and computational sciences, are teaming to support areas of biological research in urgent need of data-intensive computing capabilities.
  • High energy physics experiments periodically reprocess data in order to take advantage of improved understanding of the detector and the data processing code. Between February and May 2007, the DZero experiment reprocessed a substantial fraction of its dataset. This consists of half a billion events, corresponding to about 100 TB of data, organized in 300,000 files. The activity utilized resources from sites around the world, including a dozen sites participating in the Open Science Grid consortium (OSG). About 1,500 jobs were run every day across the OSG, consuming and producing hundreds of gigabytes of data. Access to OSG computing and storage resources was coordinated by the SAM-Grid system. This system organized job access to a complex topology of data queues and job scheduling to clusters, using a SAM-Grid to OSG job forwarding infrastructure. For the first time in the lifetime of the experiment, a data-intensive production activity was managed on a general-purpose grid such as OSG. This paper describes the implications of using OSG, where all resources are granted following an opportunistic model, the challenges of operating a data-intensive activity over such a large computing infrastructure, and the lessons learned throughout the project.
  • Biological breakthroughs critical to solving society’s most challenging problems require new and innovative tools and a “different way” to analyze the enormous amounts of data being generated. This article for the Breakthroughs magazine focuses on the Data-Intensive Computing for Complex Biological Systems (Biopilot) project, a joint research effort between the Pacific Northwest National Laboratory (PNNL) and Oak Ridge National Laboratory funded by the U.S. Department of Energy’s Office of Advanced Scientific Computing Research. The two national laboratories, both of which are world leaders in computing and computational sciences, are teaming to support areas of biological research in urgent need of data-intensive computing capabilities.
  • Editorial for IEEE Computer Special edition on Data Intensive Computing
  • The advancement in computing technology has enabled scientists to collect massive amounts of data, taking us a step closer to solving complex problems such as global climate change and uncovering the secrets hidden in genes. The exponential growth in the amount of data collected from experiments, measurements and observations, however, has created an urgent technical challenge. A talented group of computational scientists are leading the effort at PNNL to tackle the challenge through a major initiative on high-performance and data-intensive computing. PNNL’s data-intensive computing initiative will attempt to accelerate the creation of computational solutions to support the study of problems of national scope involving large amounts of data from very complex systems.