
Title: Reliability analysis and optimization in the design of distributed systems

Book
OSTI ID:7204061

Reliability measures and efficient evaluation algorithms are presented to aid in designing reliable distributed systems. The terminal reliability between a pair of computers is a good reliability measure for computer networks. For distributed systems, two new reliability measures are introduced to capture more effectively the redundancy in resources such as programs and files: Distributed Program Reliability (DPR) and Distributed System Reliability (DSR). A simple and efficient algorithm, SYREL, is developed to evaluate the reliability between two computing centers. This algorithm combines conditional probability, set theory, and Boolean algebra in a distinct approach to achieve fast execution times and obtain compact expressions. An elegant and unified approach based on graph-theoretic techniques is used in developing algorithms to evaluate the DPR and DSR measures. This approach performs a breadth-first search on the graph representing a given distributed system to enumerate all the subgraphs that guarantee the proper accessibility for executing the given task(s). These subgraphs are then used to evaluate the desired reliabilities. Several optimization algorithms are developed for designing reliable systems under a cost constraint.
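
As an illustration of what terminal reliability between two computing centers means, the following minimal sketch computes it by conditioning on the state of one link at a time (the standard factoring identity R = p * R(link up) + (1 - p) * R(link down)) and uses a breadth-first search to test connectivity. This is not SYREL, whose details are not given in the abstract. The function names, the edge-dictionary input format, the assumption of perfectly reliable nodes, and the assumption of independent link failures are all illustrative choices made here.

```python
from collections import deque


def connected(links, source, target):
    """Breadth-first search: is target reachable from source over links?"""
    adjacency = {}
    for u, v in links:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def terminal_reliability(edges, source, target, up=(), undecided=None):
    """Probability that source and target can communicate.

    edges maps an undirected link (u, v) to the probability that it works.
    Assumptions (not from the source): links fail independently and
    nodes never fail.
    """
    if undecided is None:
        undecided = list(edges)
    # Links already known to work connect the pair: success is certain.
    if connected(up, source, target):
        return 1.0
    # Even if every undecided link worked, the pair stays disconnected.
    if not connected(list(up) + undecided, source, target):
        return 0.0
    # Condition on the next undecided link being up or down.
    link, rest = undecided[0], undecided[1:]
    p = edges[link]
    return (p * terminal_reliability(edges, source, target, up + (link,), rest)
            + (1 - p) * terminal_reliability(edges, source, target, up, rest))
```

For example, with three centers joined in a triangle, `terminal_reliability({("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.9}, "a", "c")` gives approximately 0.981, which matches 1 - (1 - 0.9)(1 - 0.9 * 0.9) for a direct link in parallel with a two-link path. The recursion is exponential in the number of links in the worst case, so it is suitable only for small illustrative networks.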

Resource Relation:
Other Information: Thesis
Country of Publication:
United States
Language:
English