
Title: Estimation of RTT and Loss Rate of Wide-Area Connections Using MPI Measurements

Conference

Scientific computations are expected to be increasingly distributed across wide-area networks, and the Message Passing Interface (MPI) has been shown to scale to support their communications over long distances. To execute effectively, these computations should account for certain network parameters, for example, by avoiding highly congested and long connections. The execution times of basic MPI operations reflect the connection parameters, including the Round Trip Time (RTT) and loss rate. We describe five machine learning methods to estimate the connection RTT and loss rate from the execution times of basic MPI operations. We use execution time measurements of MPI Sendrecv operations collected over emulated 10 Gbps connections with 0-366 ms round-trip times, the longest of which spans the globe, under periodic losses of up to 20%. These methods provide disparate estimates of RTT and loss rate: linear and non-linear, and smooth and non-smooth. Our results show that accurate estimates can be generated at low loss rates, but that the estimates become inaccurate at loss rates of 10% and higher. Overall, these results constitute a case study of the strengths and limitations of machine learning methods in inferring network-level parameters from application-level measurements.
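As an illustration of the general idea of estimating RTT from execution times, the following is a minimal sketch of a linear estimator fitted by ordinary least squares. It is not one of the five methods described in the paper, whose details this record does not give. The data are synthetic: the assumed relation (execution time equals a fixed overhead plus one RTT plus Gaussian noise), the noise level, and all function names are hypothetical, and packet loss is ignored entirely.

```python
import random


def make_synthetic_data(n=200, seed=0):
    """Generate hypothetical (execution time, RTT) pairs in milliseconds.

    Assumption, not from the source: one exchange costs a fixed 0.5 ms
    overhead plus one RTT plus Gaussian noise with a 2 ms standard deviation.
    """
    rng = random.Random(seed)
    rtts = [rng.uniform(0.0, 366.0) for _ in range(n)]  # RTT range from the abstract
    exec_times = [0.5 + rtt + rng.gauss(0.0, 2.0) for rtt in rtts]
    return exec_times, rtts


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_rtt(model, exec_time_ms):
    """Apply the fitted line to one measured execution time."""
    intercept, slope = model
    return intercept + slope * exec_time_ms


exec_times, rtts = make_synthetic_data()
model = fit_line(exec_times, rtts)

# Root-mean-square error of the estimator on the synthetic training data.
errors = [estimate_rtt(model, t) - r for t, r in zip(exec_times, rtts)]
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
print(f"intercept={model[0]:.3f} ms, slope={model[1]:.4f}, RMSE={rmse:.2f} ms")
```

With real measurements, the synthetic generator would be replaced by recorded execution times labelled with the emulated RTT. A non-linear or non-smooth estimator could be substituted for `fit_line` without changing the rest of the sketch.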

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1659573
Resource Relation:
Conference: Workshop Innovating the Network for Data-Intensive Science (INDIS) - Denver, Colorado, United States of America - 11/17/2019
Country of Publication:
United States
Language:
English