OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Final report

Technical Report · DOI: https://doi.org/10.2172/1007978 · OSTI ID: 1007978

High performance computational science and engineering simulations have become an increasingly important part of the scientist's problem-solving toolset. A key reason is the development of widely used codes and libraries that support these applications, for example Netlib, a collection of numerical libraries [33]. The term community codes refers to libraries or applications that have achieved a critical level of acceptance within a user community. Many of these applications are high-end in terms of required resources: computation, storage, and communication. Recently, there has been considerable interest in putting such applications on-line and packaging them as network services to make them available to a wider user base. Applications such as data mining [22], theorem proving and logic [14], and parallel numerical computation [8][32] are examples of services going on-line. Transforming applications into services has been made possible by advances in packaging and interface technologies, including component systems [2][6][13][28][37], proposed communication standards [34], and newer Web technologies such as Web Services [38]. Network services allow users to focus on their application and obtain remote service when needed by simply invoking the service across the network. Users can be assured that the most recent version of the code or service is always provided, and they do not need to install, maintain, and manage significant infrastructure to access the service. For high performance applications in particular, the user is still often required to install a code base (e.g. MPI) and therefore becomes involved with the tedious details of infrastructure management. In the network service model, the service provider, not the user, is responsible for all of these activities. The user need not become an expert in high performance computing.
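The network-service model described above can be illustrated with a minimal sketch: a numerical routine is exposed as a remote service, and the client invokes it across the network without installing or managing the underlying code. This is only an illustration of the general model, not the report's actual infrastructure; the `solve` routine, host, and RPC mechanism (Python's standard-library XML-RPC) are all assumptions chosen for brevity.

```python
# Sketch of the network-service model: the provider hosts the code,
# the user only invokes it over the network.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def solve(values):
    """Stand-in for a hosted numerical kernel (here, just a sum)."""
    return sum(values)

# Service provider: registers the routine and serves it on a local port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(solve, "solve")
threading.Thread(target=server.serve_forever, daemon=True).start()

# User: sees only the service interface, never the implementation,
# and never installs or tunes the code behind it.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.solve([1, 2, 3, 4])
print(result)  # 10
server.shutdown()
```

In a production high-end service, the provider would additionally handle scheduling, fault tolerance, and performance tuning behind this same invocation boundary, which is precisely the middleware layer the report targets.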
An additional advantage of high-end network services is that the user need not have specialized computational resources to use the service, nor be concerned with performance tuning; both can be handled by the service provider. We believe that the next dominant paradigm for high performance computing will be based on high-end network services. Putting high performance applications on-line will create a new generation of community services. Community services have several features that make their deployment challenging: (i) they must provide high performance, (ii) they are resource intensive, and (iii) they may be built upon a large existing code base. Many groups have built significant infrastructure for providing domain-specific high-end services [6][8][12][14][22][24][27][31][32]. However, this process is labor-intensive and time-consuming, as evidenced by the development time required to build many of these systems. The reason is that these systems were all built from the ground up, with little existing infrastructure to utilize. Providing efficient, reliable, secure, and scalable services requires significant run-time infrastructure and middleware (Figure 1). The goal of this project is to develop general-purpose middleware to support the rapid deployment of high-end community services. In this proposal, we focus on scalable middleware in support of resource management and reliability. We also propose a system architecture that integrates the middleware components. Our middleware and system architecture will be designed to accommodate and integrate middleware solutions for security and user interface developed by other groups. We will produce middleware that can be leveraged by community services running in clusters, supercomputers, and Grids. One of the novel aspects of our approach is that it significantly reduces the tension between resource sharing for the 'common good' and resource monopolization for the 'individual good'.
To increase the impact of this project, the middleware will be integrated into a widely used implementation of the Message-Passing Interface (MPI), MPICH from Argonne National Laboratory, and the Condor system from the University of Wisconsin. The middleware will be evaluated by applying it to high-end network services of interest to DOE.

Research Organization:
Univ. of Minnesota, Minneapolis, MN (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
FG02-03ER25554
OSTI ID:
1007978
Report Number(s):
1719-521-6365- Final Report; TRN: US201107%%35
Country of Publication:
United States
Language:
English