OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performance Prediction of Big Data Transfer Through Experimental Analysis and Machine Learning

Conference · OSTI ID:1668342

Big data transfer in next-generation scientific applications is now commonly carried out over connections with guaranteed bandwidth provisioned in High-performance Networks (HPNs) through advance bandwidth reservation. To use HPN resources efficiently, provisioning agents need to carefully schedule data transfer requests and allocate appropriate bandwidths. Such reserved bandwidths, if not fully utilized by the requesting user, could be simply wasted or cause extra overhead and complexity in management due to exclusive access. This calls for the capability of performance prediction to reserve bandwidth resources that match actual needs. Towards this goal, we employ machine learning algorithms to predict big data transfer performance based on extensive performance measurements, which are collected over a span of several years from a large number of data transfer tests using different protocols and toolkits between various end sites on several real-life physical or emulated HPN testbeds. We first identify a comprehensive list of attributes involved in a typical big data transfer process, including end host system configurations, network connection properties, and control parameters of data transfer methods. We then conduct an in-depth exploratory analysis of their impacts on application-level throughput, which provides insights into big data transfer performance and motivates the use of machine learning. We also investigate the applicability of machine learning algorithms and derive their general performance bounds for performance prediction of big data transfer in HPNs. Experimental results show that, with appropriate data preprocessing, the proposed machine learning-based approach achieves 95% or higher prediction accuracy in up to 90% of the cases with very noisy real-life performance measurements.

Research Organization:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Organization:
Harrisburg University of Science & Technology; National Science Foundation (NSF)
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1668342
Resource Relation:
Conference: 19th International Federation for Information Processing Networking Conference, 06/22/20 - 06/25/20, Paris, FR
Country of Publication:
United States
Language:
English

Similar Records

Performance Prediction of Big Data Transfer Through Experimental Analysis and Machine Learning
Conference · June 1, 2020 · OSTI ID:1668342

Exploratory analysis and performance prediction of big data transfer in High-performance Networks
Journal Article · May 4, 2021 · Engineering Applications of Artificial Intelligence · OSTI ID:1668342

On Performance Prediction of Big Data Transfer in High-performance Networks
Conference · June 1, 2020 · OSTI ID:1668342
