Reliability analysis and optimization in the design of distributed systems
Reliability measures and efficient evaluation algorithms are presented to aid in designing reliable distributed systems. The terminal reliability between a pair of computers is a good measure in computer networks. For distributed systems, two new reliability measures are introduced to capture more effectively the redundancy in resources such as programs and files: Distributed Program Reliability (DPR) and Distributed System Reliability (DSR). A simple and efficient algorithm, SYREL, is developed to evaluate the reliability between two computing centers. This algorithm combines conditional probability, set theory, and Boolean algebra in a distinct approach to achieve fast execution times and obtain compact expressions. An elegant and unified approach based on graph-theoretic techniques is used in developing algorithms to evaluate the DPR and DSR measures. The approach performs a breadth-first search on the graph representing a given distributed system to enumerate all the subgraphs that guarantee the proper accessibility for executing the given task(s). These subgraphs are then used to evaluate the desired reliabilities. Several optimization algorithms are developed for designing reliable systems under a cost constraint.
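As an illustration of the terminal reliability measure mentioned in the abstract, the following is a minimal sketch that computes the probability that two nodes remain connected when each link fails independently. It is not SYREL and not the thesis's method: it uses plain exhaustive enumeration of link states, which is exponential in the number of links and suitable only for tiny examples. The function name `two_terminal_reliability`, the edge-dictionary input format, and the sample link probabilities are all assumptions made for this example.

```python
from itertools import product


def two_terminal_reliability(edges, source, target):
    """Brute-force terminal reliability (illustrative only, not SYREL).

    edges maps an undirected link (u, v) to the probability that the
    link is working; links are assumed to fail independently and nodes
    are assumed to be perfectly reliable.
    """
    links = list(edges)
    total = 0.0
    # Enumerate every combination of working / failed links.
    for state in product([True, False], repeat=len(links)):
        prob = 1.0
        adjacency = {}
        for (u, v), up in zip(links, state):
            p = edges[(u, v)]
            prob *= p if up else 1.0 - p
            if up:
                adjacency.setdefault(u, set()).add(v)
                adjacency.setdefault(v, set()).add(u)
        # Breadth-first search over the working links only.
        seen = {source}
        frontier = [source]
        while frontier:
            next_frontier = []
            for node in frontier:
                for neighbor in adjacency.get(node, ()):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        # Count this state's probability if the terminals are connected.
        if target in seen:
            total += prob
    return total


# Hypothetical three-node network: a direct link A-B plus a detour via C.
triangle = {("A", "B"): 0.9, ("A", "C"): 0.9, ("C", "B"): 0.9}
print(round(two_terminal_reliability(triangle, "A", "B"), 3))  # 0.981
```

The printed value agrees with the hand calculation 1 - (1 - 0.9)(1 - 0.9 × 0.9) = 0.981, since the terminals are disconnected only when both the direct link and the two-link detour fail.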
- OSTI ID: 7204061
- Resource Relation: Other Information: Thesis
- Country of Publication: United States
- Language: English