OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Building an infrastructure for urgent computing.

Abstract

Large-scale scientific computing is playing an ever-increasing role in critical decision-making and dynamic, event-driven systems. While some computation can simply wait in a job queue until resources become available, key decisions concerning life-threatening situations must be made quickly. A computation to predict the flow of airborne contaminants from a ruptured railcar must guide evacuation rapidly, before the results are meaningless. Although not as urgent, on-demand computing is often required to leverage a scientific opportunity. For example, a burst of data from seismometers could trigger urgent computation that then redirects instruments to focus data collection on specific regions, before the opportunity is lost. This paper describes the challenges of building an infrastructure to support urgent computing. We focus on three areas: the needs and requirements of an urgent computing system, a prototype urgent computing system called SPRUCE currently deployed on the TeraGrid, and future technologies and directions for urgent computing. Since the days of the first supercomputers, computation has been playing an ever-increasing role in science.

Authors:
Beckman, P.; Beschastnikh, I.; Nadella, S.; Trebon, N.
Publication Date:
2007
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
982614
Report Number(s):
ANL/MCS/CP-59363
TRN: US201015%%1224
DOE Contract Number:
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: International Advanced Research Workshop on High Performance Computing and Grids; Jul. 3-6, 2007; Cetraro, Italy
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; BUILDINGS; DATA; DECISION MAKING; EVACUATION; GRIDS; PERFORMANCE; QUEUES; RESOURCES; SPRUCES; SUPERCOMPUTERS; SUPPORTS

Citation Formats

Beckman, P., Beschastnikh, I., Nadella, S., and Trebon, N. Building an infrastructure for urgent computing. United States: N. p., 2007. Web.
Beckman, P., Beschastnikh, I., Nadella, S., & Trebon, N. Building an infrastructure for urgent computing. United States. 2007.
Beckman, P., Beschastnikh, I., Nadella, S., and Trebon, N. 2007. "Building an infrastructure for urgent computing". United States.
@article{osti_982614,
  title = {Building an infrastructure for urgent computing},
  author = {Beckman, P. and Beschastnikh, I. and Nadella, S. and Trebon, N.},
  abstractNote = {Large-scale scientific computing is playing an ever-increasing role in critical decision-making and dynamic, event-driven systems. While some computation can simply wait in a job queue until resources become available, key decisions concerning life-threatening situations must be made quickly. A computation to predict the flow of airborne contaminants from a ruptured railcar must guide evacuation rapidly, before the results are meaningless. Although not as urgent, on-demand computing is often required to leverage a scientific opportunity. For example, a burst of data from seismometers could trigger urgent computation that then redirects instruments to focus data collection on specific regions, before the opportunity is lost. This paper describes the challenges of building an infrastructure to support urgent computing. We focus on three areas: the needs and requirements of an urgent computing system, a prototype urgent computing system called SPRUCE currently deployed on the TeraGrid, and future technologies and directions for urgent computing. Since the days of the first supercomputers, computation has been playing an ever-increasing role in science.},
  place = {United States},
  year = {2007}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

Similar Records:
  • The solution of Grand Challenge Problems will require computations that are too large to fit in the memories of even the largest machines. Inevitably, new designs of I/O systems will be necessary to support them. This report describes the work in investigating I/O subsystems for massively parallel computers. Specifically, the authors investigated out-of-core algorithms for common scientific calculations and present several theoretical results. They also describe several approaches to parallel I/O, including partitioned secondary storage and choreographed I/O, and the implications of each for massively parallel computing.
  • The Federal HPCC Program is a 10-agency billion dollar per year effort to accelerate the development of high performance computing and communications technologies and the diffusion of these technologies to improve U.S. competitiveness and the well-being of citizens. The panelists will describe accomplishments and plans in: (1) developing scalable computing systems and associated software that scales to sustained teraflops (trillions of floating point operations per second) performance; (2) developing gigabit (billions of bits) per second networking technologies; (3) broadening network connectivity to the research and education communities; (4) demonstrating prototype solutions to "Grand Challenge" applications; (5) developing "National Challenge" applications in areas such as education and lifelong learning, health care, manufacturing processes and products, and public access to government information, as well as crisis and emergency management, electronic commerce, energy management, and environmental monitoring; (6) supporting research, training, and education in HPCC technologies and applications; (7) implementing outreach activities including open meetings, workshops, and conferences; and (8) disseminating information about accomplishments, plans, and funding opportunities using mechanisms such as Mosaic/WWW servers.
  • High-speed wide area networks are expected to enable innovative applications that integrate geographically distributed, high-performance computing, database, graphics, and networking resources. However, there is as yet little understanding of the higher-level services required to support these applications, or of the techniques required to implement these services in a scalable, secure manner. We report on a large-scale prototyping effort that has yielded some insights into these issues. Building on the hardware base provided by the I-WAY, a national-scale Asynchronous Transfer Mode (ATM) network, we developed an integrated management and application programming system, called I-Soft. This system was deployed at most of the 17 I-WAY sites and used by many of the 60 applications demonstrated on the I-WAY network. In this article, we describe the I-Soft design and report on lessons learned from application experiments.
  • Applications that use high-speed networks to connect geographically distributed supercomputers, databases, and scientific instruments may operate over open networks and access valuable resources. Hence, they can require mechanisms for ensuring integrity and confidentiality of communications and for authenticating both users and resources. Security solutions developed for traditional client-server applications do not provide direct support for the program structures, programming tools, and performance requirements encountered in these applications. The authors address these requirements via a security-enhanced version of the Nexus communication library, which they use to provide secure versions of parallel libraries and languages, including the Message Passing Interface. These tools permit a fine degree of control over what, where, and when security mechanisms are applied. In particular, a single application can mix secure and nonsecure communication, allowing the programmer to make fine-grained security/performance tradeoffs. The authors present performance results that quantify the performance of their infrastructure.
  • Real-time computing has traditionally been considered largely in the context of single-processor and embedded systems, and indeed, the terms real-time computing, embedded systems, and control systems are often mentioned in closely related contexts. However, real-time computing in the context of multinode systems, specifically high-performance, cluster-computing systems, remains relatively unexplored, largely due to the fact that until now, there has not been a need for such an environment. In this paper, we motivate the need for a cluster computing infrastructure capable of supporting computation over large datasets in real-time. Our motivating example is an analytical framework to support the next generation North American power grid, which is growing both in size and complexity. With streaming sensor data in the future power grid potentially reaching rates on the order of terabytes per day, analyzing this data subject to real-time guarantees becomes a daunting task which will require the power of high-performance cluster computing capable of functioning under real-time constraints. One specific challenge that such an environment presents is the need for real-time networked communication between cluster nodes. In this paper, we discuss the need for real-time high-performance cluster computation, along with our work-in-progress towards an infrastructure which will ultimately enable such an environment.