OSTI.GOV | U.S. Department of Energy, Office of Scientific and Technical Information

Title: Enabling distributed petascale science.

Abstract

Petascale science is an end-to-end endeavour, involving not only the creation of massive datasets at supercomputers or experimental facilities, but the subsequent analysis of that data by a user community that may be distributed across many laboratories and universities. The new SciDAC Center for Enabling Distributed Petascale Science (CEDPS) is developing tools to support this end-to-end process. These tools include data placement services for the reliable, high-performance, secure, and policy-driven placement of data within a distributed science environment; tools and techniques for the construction, operation, and provisioning of scalable science services; and tools for the detection and diagnosis of failures in end-to-end data placement and distributed application hosting configurations. In each area, we build on a strong base of existing technology and have made useful progress in the first year of the project. For example, we have recently achieved order-of-magnitude improvements in transfer times (for lots of small files) and implemented asynchronous data staging capabilities; demonstrated dynamic deployment of complex application stacks for the STAR experiment; and designed and deployed end-to-end troubleshooting services. We look forward to working with SciDAC application and technology projects to realize the promise of petascale science.
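For illustration only, the asynchronous, fault-tolerant staging of many small files described in the abstract might be sketched as below. This is a minimal sketch using only the Python standard library, not the CEDPS tools: shutil.copy2 stands in for the actual transfer protocol, and the names stage_file and stage_dataset, the retry count, and the worker count are all hypothetical.

# Minimal sketch (not the CEDPS implementation): asynchronous staging of many
# small files with per-file retries, concurrency, and failure reporting.
import shutil
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

def stage_file(src: Path, dest_dir: Path, retries: int = 3) -> Path:
    """Copy one file into the destination directory, retrying on transient failure."""
    dest = dest_dir / src.name
    for attempt in range(1, retries + 1):
        try:
            shutil.copy2(src, dest)  # stand-in for the real transfer mechanism
            return dest
        except OSError:
            if attempt == retries:
                raise
    return dest

def stage_dataset(files: list[Path], dest_dir: Path, workers: int = 8) -> dict:
    """Stage many small files concurrently; report per-file success or failure."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(stage_file, f, dest_dir): f for f in files}
        for fut in as_completed(futures):
            src = futures[fut]
            try:
                results[src] = ("ok", fut.result())
            except OSError as exc:
                results[src] = ("failed", exc)  # surfaced for end-to-end troubleshooting
    return results

In this sketch, concurrency across files is what recovers throughput for "lots of small files", and the per-file status dictionary is the hook where a troubleshooting or diagnosis service could attach.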

Authors:
Baranovski, A.; Bharathi, S.; Bresnahan, J.; Chervenak, A.; Foster, I.; Fraser, D.; Freeman, T.; Gunter, D.; Jackson, K.; Keahey, K.; Kesselman, C.; Konerding, D. E.; Leroy, N.; Link, M.; Livny, M.; Miller, N.; Miller, R.; Oleynik, G.; Pearlman, L.; Schopf, J. M.; Schuler, R.; Tierney, B.
Publication Date:
2007
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
973002
Report Number(s):
ANL/MCS/CP-59797
TRN: US201005%%399
DOE Contract Number:
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Journal Name: J. Phys.: Conf. Ser.; Journal Volume: 78; Journal Issue: 2007; Conference: Scientific Discovery through Advanced Computing (SciDAC 2007), Jun. 24-26, 2007, Boston, MA
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICAL METHODS AND COMPUTING; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; SUPERCOMPUTERS; SIZE; DATA ANALYSIS; PARALLEL PROCESSING; DATA TRANSMISSION; PERFORMANCE

Citation Formats

Baranovski, A., Bharathi, S., Bresnahan, J., Chervenak, A., Foster, I., Fraser, D., Freeman, T., Gunter, D., Jackson, K., Keahey, K., Kesselman, C., Konerding, D. E., Leroy, N., Link, M., Livny, M., Miller, N., Miller, R., Oleynik, G., Pearlman, L., Schopf, J. M., Schuler, R., Tierney, B., Mathematics and Computer Science, FNL, Univ. of Southern California, Univ. of Chicago, LBNL, and Univ. of Wisconsin. Enabling distributed petascale science. United States: N. p., 2007. Web. doi:10.1088/1742-6596/78/1/012020.
Baranovski, A., Bharathi, S., Bresnahan, J., Chervenak, A., Foster, I., Fraser, D., Freeman, T., Gunter, D., Jackson, K., Keahey, K., Kesselman, C., Konerding, D. E., Leroy, N., Link, M., Livny, M., Miller, N., Miller, R., Oleynik, G., Pearlman, L., Schopf, J. M., Schuler, R., Tierney, B., Mathematics and Computer Science, FNL, Univ. of Southern California, Univ. of Chicago, LBNL, & Univ. of Wisconsin. Enabling distributed petascale science. United States. doi:10.1088/1742-6596/78/1/012020.
Baranovski, A., Bharathi, S., Bresnahan, J., Chervenak, A., Foster, I., Fraser, D., Freeman, T., Gunter, D., Jackson, K., Keahey, K., Kesselman, C., Konerding, D. E., Leroy, N., Link, M., Livny, M., Miller, N., Miller, R., Oleynik, G., Pearlman, L., Schopf, J. M., Schuler, R., Tierney, B., Mathematics and Computer Science, FNL, Univ. of Southern California, Univ. of Chicago, LBNL, and Univ. of Wisconsin. 2007. "Enabling distributed petascale science." United States. doi:10.1088/1742-6596/78/1/012020.
@article{osti_973002,
title = {Enabling distributed petascale science.},
author = {Baranovski, A. and Bharathi, S. and Bresnahan, J. and Chervenak, A. and Foster, I. and Fraser, D. and Freeman, T. and Gunter, D. and Jackson, K. and Keahey, K. and Kesselman, C. and Konerding, D. E. and Leroy, N. and Link, M. and Livny, M. and Miller, N. and Miller, R. and Oleynik, G. and Pearlman, L. and Schopf, J. M. and Schuler, R. and Tierney, B. and Mathematics and Computer Science and FNL and Univ. of Southern California and Univ. of Chicago and LBNL and Univ. of Wisconsin},
abstractNote = {Petascale science is an end-to-end endeavour, involving not only the creation of massive datasets at supercomputers or experimental facilities, but the subsequent analysis of that data by a user community that may be distributed across many laboratories and universities. The new SciDAC Center for Enabling Distributed Petascale Science (CEDPS) is developing tools to support this end-to-end process. These tools include data placement services for the reliable, high-performance, secure, and policy-driven placement of data within a distributed science environment; tools and techniques for the construction, operation, and provisioning of scalable science services; and tools for the detection and diagnosis of failures in end-to-end data placement and distributed application hosting configurations. In each area, we build on a strong base of existing technology and have made useful progress in the first year of the project. For example, we have recently achieved order-of-magnitude improvements in transfer times (for lots of small files) and implemented asynchronous data staging capabilities; demonstrated dynamic deployment of complex application stacks for the STAR experiment; and designed and deployed end-to-end troubleshooting services. We look forward to working with SciDAC application and technology projects to realize the promise of petascale science.},
doi = {10.1088/1742-6596/78/1/012020},
journal = {J. Phys.: Conf. Ser.},
number = 2007,
volume = 78,
place = {United States},
year = {2007},
month = {jan}
}

Other availability:
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
