OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Report for CS 698-95 "Directed Research - Performance Modeling:" Using Queueing Network Modeling to Analyze the University of San Francisco Keck Cluster Supercomputer

Technical Report · DOI: https://doi.org/10.2172/883809 · OSTI ID: 883809

The need for computing power grows more pressing daily. Our need to process, analyze, and store data is quickly exceeding the capabilities of small, self-contained serial machines such as the modern desktop PC. Initially, this gap was filled by supercomputers: large-scale, self-contained parallel machines. However, current markets, together with the cost of developing and maintaining such machines, are quickly making them a rarity, used only in highly specialized environments. A third type of machine exists, however. This relatively new type, known as a cluster, is built from common, and often inexpensive, commodity desktop machines.

But how well do these clustered machines work? There have been many attempts to quantify the performance of clustered computers. One approach, Queueing Network Modeling (QNM), appears to be a potentially useful and rarely tried method of modeling such systems. QNM, which has its beginnings in the modeling of traffic patterns, has since expanded and is now used to model everything from CPU and disk services, to computer systems, to service rates in store checkout lines. This history of successful use, together with the correspondence of QNM components to commodity clusters, suggests that QNM can be a useful tool both for the cluster designer, interested in the best value for the cost, and for the user of existing machines, interested in performance rates and time-to-solution.

So, what is QNM? Queueing Network Modeling is an approach to computer system modeling in which the computer is represented as a network of queues and evaluated analytically. How does this correspond to clusters? There is a neat one-to-one relationship between the components of a QNM model and those of a cluster. For example, a cluster is made from a combination of computational nodes and network switches, both of which fit nicely with the QNM descriptions of service centers (delay, queueing, and load-dependent).
Other examples include the relationship between different classes of customers in QNM and different types of messages passed on clustered systems, and the obvious relationship between QNM model queues and message queueing in switches and network cards.

Even the parameterization of QNM components lends itself well to cluster modeling. The number of service centers (computational nodes and switches) is generally well known for existing systems and can be estimated for potential systems. The number of customers in the system can be estimated from application call traces or profiles. Timing rates and service demands, too, can be estimated from device specifications or through application tracing or profiling. Typical results reported include throughputs, queue lengths, and response times, all of which are important in determining how well a system is utilized.

In this research, QNM is applied to the Keck Cluster as a strong scaling problem. In strong scaling, the size of the problem to be solved remains constant even as the number of processors allocated to the solution increases. QNM could also be applied in a weak scaling manner, in which the problem size increases with the number of allocated processors, but that application is not investigated here.
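
To make the idea of evaluating a network of queues analytically more concrete, the following is a minimal sketch of exact Mean Value Analysis (MVA) for a closed, single-class queueing network. MVA is a standard QNM solution technique, but this abstract does not say which solution method the report uses, so treat it as an illustration only. The names `Center` and `exact_mva`, and all service demands in the usage lines, are hypothetical and are not measurements of the Keck Cluster. Only delay and queueing centers are covered; load-dependent centers are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Center:
    """One service center in the model (hypothetical helper type)."""
    name: str
    demand: float           # service demand per customer, in seconds
    is_delay: bool = False  # True: delay center (no queueing); False: queueing center


def exact_mva(centers: List[Center], n_customers: int,
              think_time: float = 0.0) -> Tuple[float, float, List[float]]:
    """Exact MVA for a closed, single-class network.

    Returns (throughput, response time, mean queue length per center).
    """
    queue_len = [0.0] * len(centers)
    throughput, response = 0.0, 0.0
    # Build the solution up one customer at a time.
    for n in range(1, n_customers + 1):
        # Residence time: a delay center never queues; at a queueing center an
        # arriving customer waits behind the mean queue seen with n - 1 customers.
        residence = [
            c.demand if c.is_delay else c.demand * (1.0 + q)
            for c, q in zip(centers, queue_len)
        ]
        response = sum(residence)
        throughput = n / (think_time + response)  # Little's law on the whole system
        queue_len = [throughput * r for r in residence]  # Little's law per center
    return throughput, response, queue_len


# Illustrative model: one compute node and one switch, with made-up demands.
model = [Center("node", 0.050), Center("switch", 0.010)]
x, r, q = exact_mva(model, n_customers=8)
print(f"throughput={x:.2f}/s response={r:.3f}s queue lengths={q}")
```

The three returned values correspond to the throughputs, response times, and queue lengths named above as typical QNM results. The inputs correspond to the parameters the abstract lists: the set of service centers, the number of customers, and the service demands.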

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
883809
Report Number(s):
UCRL-TR-216011; TRN: US200615%%260
Country of Publication:
United States
Language:
English