
Title: Learning concave-convex profiles of data transport over dedicated connections

Abstract

Dedicated data transport infrastructures are increasingly being deployed to support distributed big-data and high-performance computing scenarios. These infrastructures employ data transfer nodes that use sophisticated software stacks to support network transport among sites, which often house distributed file and storage systems. Throughput measurements collected over such infrastructures for a range of round trip times (RTTs) reflect the underlying complex end-to-end connections, and have revealed dichotomous throughput profiles as functions of RTT. In particular, concave regions of throughput profiles at lower RTTs indicate near-optimal performance, and convex regions at higher RTTs indicate bottlenecks due to factors such as buffer or credit limits. We present a machine learning method that explicitly infers these concave and convex regions and transitions between them using sigmoid functions. We also provide distribution-free confidence estimates for the generalization error of these concave-convex profile estimates. Throughput profiles for data transfers over 10 Gbps connections with 0–366 ms RTT provide important performance insights, including the near optimality of transfers performed with the XDD tool between XFS filesystems, and the performance limits of wide-area Lustre extensions using LNet routers. A direct application of generic machine learning packages does not adequately highlight these critical performance regions or provide as precise confidence estimates.
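
The concave-then-convex shape described in the abstract can be illustrated with a single decreasing sigmoid, which is concave before its midpoint and convex after it. The sketch below is only an illustration of that property, not the authors' method: the abstract says only that sigmoid functions are used, and the function names and the parameter values (`tau_t`, `a`, `scale`) here are hypothetical and not fitted to any measurements.

```python
import math


def flipped_sigmoid(tau, tau_t, a, scale):
    """Decreasing sigmoid in RTT `tau` (ms), with midpoint `tau_t`.

    Hypothetical profile shape: `scale` is the peak throughput and `a`
    controls how sharply throughput falls off around `tau_t`.
    """
    return scale / (1.0 + math.exp(a * (tau - tau_t)))


def second_difference(f, x, h=1.0):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def curvature_labels(rtts_ms, tau_t, a, scale):
    """Label each RTT as lying in the concave or convex part of the profile."""
    profile = lambda t: flipped_sigmoid(t, tau_t, a, scale)
    return [
        "concave" if second_difference(profile, tau) < 0 else "convex"
        for tau in rtts_ms
    ]


# Assumed, illustrative parameters: transition at 100 ms, 10 Gbps peak.
rtts = [0, 20, 50, 150, 250, 366]
for tau, label in zip(rtts, curvature_labels(rtts, tau_t=100.0, a=0.05, scale=10.0)):
    print(f"RTT {tau:>3} ms: {label}")
```

Under these assumed parameters, RTTs below the midpoint fall in the concave region and RTTs above it fall in the convex region, which mirrors the dichotomy the abstract describes.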

Authors:
Rao, Nageswara S. [1]; Sen, Satyabrata [1]; Liu, Zhengchun [2]; Kettimuthu, R. [2]; Foster, Ian [2]
  1. ORNL
  2. Argonne National Laboratory (ANL)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21); USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1513419
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Journal Volume: 11407; Conference: International Conference on Machine Learning for Networking (MLN) - Paris, France - 11/27/2018 10:00:00 AM-11/29/2018 10:00:00 AM
Country of Publication:
United States
Language:
English

Citation Formats

Rao, Nageswara S., Sen, Satyabrata, Liu, Zhengchun, Kettimuthu, R., and Foster, Ian. Learning concave-convex profiles of data transport over dedicated connections. United States: N. p., 2019. Web. doi:10.1007/978-3-030-19945-6_1.
Rao, Nageswara S., Sen, Satyabrata, Liu, Zhengchun, Kettimuthu, R., & Foster, Ian. Learning concave-convex profiles of data transport over dedicated connections. United States. doi:10.1007/978-3-030-19945-6_1.
Rao, Nageswara S., Sen, Satyabrata, Liu, Zhengchun, Kettimuthu, R., and Foster, Ian. 2019. "Learning concave-convex profiles of data transport over dedicated connections". United States. doi:10.1007/978-3-030-19945-6_1. https://www.osti.gov/servlets/purl/1513419.
@article{osti_1513419,
title = {Learning concave-convex profiles of data transport over dedicated connections},
author = {Rao, Nageswara S. and Sen, Satyabrata and Liu, Zhengchun and Kettimuthu, R. and Foster, Ian},
abstractNote = {Dedicated data transport infrastructures are increasingly being deployed to support distributed big-data and high-performance computing scenarios. These infrastructures employ data transfer nodes that use sophisticated software stacks to support network transport among sites, which often house distributed file and storage systems. Throughput measurements collected over such infrastructures for a range of round trip times (RTTs) reflect the underlying complex end-to-end connections, and have revealed dichotomous throughput profiles as functions of RTT. In particular, concave regions of throughput profiles at lower RTTs indicate near-optimal performance, and convex regions at higher RTTs indicate bottlenecks due to factors such as buffer or credit limits. We present a machine learning method that explicitly infers these concave and convex regions and transitions between them using sigmoid functions. We also provide distribution-free confidence estimates for the generalization error of these concave-convex profile estimates. Throughput profiles for data transfers over 10 Gbps connections with 0–366 ms RTT provide important performance insights, including the near optimality of transfers performed with the XDD tool between XFS filesystems, and the performance limits of wide-area Lustre extensions using LNet routers. A direct application of generic machine learning packages does not adequately highlight these critical performance regions or provide as precise confidence estimates.},
doi = {10.1007/978-3-030-19945-6_1},
journal = {},
issn = {0302-9743},
number = {},
volume = 11407,
place = {United States},
year = {2019},
month = {5}
}

