U.S. Department of Energy
Office of Scientific and Technical Information

Optimizing End-to-End Big Data Transfers over Terabits Network Infrastructure

Journal Article · IEEE Transactions on Parallel and Distributed Systems
 [1];  [2];  [2];  [2];  [3]
  1. Sogang Univ., Seoul (Korea, Republic of). Dept. of Computer Science and Engineering
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  3. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

While future terabit networks hold the promise of significantly improving big-data motion among geographically distributed data centers, significant challenges must be overcome even on today's 100 gigabit networks to realize end-to-end performance. Multiple bottlenecks exist along the end-to-end path from source to sink; for instance, the data storage infrastructure at both the source and sink, and its interplay with the wide-area network, is increasingly the bottleneck to achieving high performance. In this study, we identify the issues that lead to congestion on the path of an end-to-end data transfer in the terabit network environment, and we present a new bulk data movement framework for terabit networks, called LADS. LADS exploits the underlying storage layout at each endpoint to maximize throughput without negatively impacting the performance of shared storage resources for other users. LADS also uses the Common Communication Interface (CCI) in lieu of the sockets interface to benefit from hardware-level zero-copy and operating system bypass capabilities when available. It can further improve data transfer performance under congestion on the end systems by buffering data at the source in flash storage. Our evaluations show that LADS can avoid congested storage elements within the shared storage resource, improving input/output bandwidth and data transfer rates across high-speed networks. We also investigate the performance degradation of LADS due to I/O contention on the parallel file system (PFS) when multiple LADS tools share the PFS. We design and evaluate a meta-scheduler that coordinates multiple I/O streams sharing the PFS to minimize I/O contention. Finally, we observe that LADS with meta-scheduling can further improve performance by up to 14 percent relative to LADS without meta-scheduling.
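
As a rough illustration of the layout-aware idea described in the abstract (serving data from storage elements that are not congested before touching the congested ones), the following is a minimal Python sketch. It is not the LADS implementation: the function name `schedule_chunks`, the `(chunk_id, target_id)` layout pairs, and the static `congested` set are all assumptions made for this example. A real system would detect congestion dynamically rather than take it as an input.

```python
from collections import deque


def schedule_chunks(chunks, congested):
    """Order chunks so that those on non-congested storage targets go first.

    chunks:    list of (chunk_id, target_id) pairs (hypothetical layout info)
    congested: set of target_ids currently treated as congested (hypothetical)
    """
    # Group chunks into one FIFO queue per storage target.
    queues = {}
    for chunk_id, target_id in chunks:
        queues.setdefault(target_id, deque()).append(chunk_id)

    order = []
    # Pass 1 serves non-congested targets, pass 2 serves the congested ones.
    for want_congested in (False, True):
        active = [t for t in queues if (t in congested) == want_congested]
        # Round-robin across targets so no single target takes all requests.
        while active:
            for t in list(active):
                order.append(queues[t].popleft())
                if not queues[t]:
                    active.remove(t)
    return order


if __name__ == "__main__":
    layout = [("a0", 0), ("a1", 0), ("b0", 1), ("b1", 1), ("c0", 2)]
    print(schedule_chunks(layout, congested={1}))
```

In this sketch, chunks on targets 0 and 2 are interleaved and issued first, and the chunks on the congested target 1 are deferred to the end. Chunks are deferred rather than dropped, so every chunk is still transferred.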

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Sogang Univ., Seoul (Korea, Republic of)
Sponsoring Organization:
DOE Office of Science; USDOE; Ministry of Science, ICT and Future Planning (MSIP) (Korea, Republic of); National Research Foundation of Korea (NRF) (Korea, Republic of)
Contributing Organization:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1361284
Alternate ID(s):
OSTI ID: 1407899
Journal Information:
IEEE Transactions on Parallel and Distributed Systems, Vol. 28, Issue 1; ISSN 1045-9219
Publisher:
IEEE
Country of Publication:
United States
Language:
English

Cited By (3)

Optimizing communication performance in scale-out storage system · Journal · July 2018
Async-LCAM: a lock contention aware messenger for Ceph distributed storage system · Journal · July 2018
New Bargaining Game Model for Collaborative Vehicular Network Services · Journal · March 2019

Similar Records

LADS: Optimizing Data Transfers using Layout-Aware Data Scheduling
Conference · Wed Dec 31 23:00:00 EST 2014 · OSTI ID:1265306

NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure
Journal Article · Mon May 07 00:00:00 EDT 2018 · Scientific Programming · OSTI ID:1565699

Layout-Aware I/O Scheduling for Terabits Data Movement
Conference · Mon Dec 31 23:00:00 EST 2012 · OSTI ID:1097485