Title: mdtmFTP and its evaluation on ESNET SDN testbed

Abstract

In this paper, to address the high-performance challenges of data transfer in the big data era, we are developing and implementing mdtmFTP: a high-performance data transfer tool for big data. mdtmFTP has four salient features. First, it adopts an I/O centric architecture to execute data transfer tasks. Second, it more efficiently utilizes the underlying multicore platform through optimized thread scheduling. Third, it implements a large virtual file mechanism to address the lots-of-small-files (LOSF) problem. Finally, mdtmFTP integrates multiple optimization mechanisms, including zero copy, asynchronous I/O, pipelining, batch processing, and pre-allocated buffer pools, to enhance performance. mdtmFTP has been extensively tested and evaluated within the ESNET 100G testbed. Evaluations show that mdtmFTP can achieve higher performance than existing data transfer tools, such as GridFTP, FDT, and BBCP.

Authors:
Zhang, Liang [1]; Wu, Wenji [1]; DeMar, Phil [1]; Pouyoul, Eric [2]
  1. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
  2. ESnet, Berkeley, CA (United States)
Publication Date:
2017-04-21
Research Org.:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); ESnet, Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1358099
Report Number(s):
FERMILAB-CONF-17-123-CD
Journal ID: ISSN 0167-739X; 1599032
Grant/Contract Number:
AC02-07CH11359
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Future Generations Computer Systems
Additional Journal Information:
Journal Volume: 79; Journal Issue: 1; Journal ID: ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Multicore; Data transfer; High-speed networking

Citation Formats

Zhang, Liang, Wu, Wenji, DeMar, Phil, and Pouyoul, Eric. mdtmFTP and its evaluation on ESNET SDN testbed. United States: N. p., 2017. Web. doi:10.1016/j.future.2017.04.024.
Zhang, Liang, Wu, Wenji, DeMar, Phil, & Pouyoul, Eric. mdtmFTP and its evaluation on ESNET SDN testbed. United States. doi:10.1016/j.future.2017.04.024.
Zhang, Liang, Wu, Wenji, DeMar, Phil, and Pouyoul, Eric. 2017. "mdtmFTP and its evaluation on ESNET SDN testbed". United States. doi:10.1016/j.future.2017.04.024. https://www.osti.gov/servlets/purl/1358099.
@article{osti_1358099,
title = {mdtmFTP and its evaluation on ESNET SDN testbed},
author = {Zhang, Liang and Wu, Wenji and DeMar, Phil and Pouyoul, Eric},
abstractNote = {In this paper, to address the high-performance challenges of data transfer in the big data era, we are developing and implementing mdtmFTP: a high-performance data transfer tool for big data. mdtmFTP has four salient features. First, it adopts an I/O centric architecture to execute data transfer tasks. Second, it more efficiently utilizes the underlying multicore platform through optimized thread scheduling. Third, it implements a large virtual file mechanism to address the lots-of-small-files (LOSF) problem. Finally, mdtmFTP integrates multiple optimization mechanisms, including zero copy, asynchronous I/O, pipelining, batch processing, and pre-allocated buffer pools, to enhance performance. mdtmFTP has been extensively tested and evaluated within the ESNET 100G testbed. Evaluations show that mdtmFTP can achieve higher performance than existing data transfer tools, such as GridFTP, FDT, and BBCP.},
doi = {10.1016/j.future.2017.04.024},
journal = {Future Generations Computer Systems},
number = 1,
volume = 79,
place = {United States},
year = {2017},
month = {4}
}

Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science

  • We consider a scenario of two sites connected over a dedicated, long-haul connection that must quickly fail over in response to degradations in host-to-host application performance. The traditional layer-2/3 hot stand-by fail-over solutions do not adequately address the variety of application degradations, and more recent single-controller Software Defined Networks (SDN) solutions are not effective for long-haul connections. We present two methods for such a path fail-over using OpenFlow-enabled switches: (a) a lightweight method that utilizes host scripts to monitor application performance and the dpctl API for switching, and (b) a generic method that uses two OpenDaylight (ODL) controllers and REST interfaces. For both methods, the restoration dynamics of applications contain significant statistical variations due to the complexities of controllers, northbound interfaces and switches; they, together with the wide variety of vendor implementations, complicate the choice among such solutions. We develop the impulse-response method based on regression functions of performance parameters to provide a rigorous and objective comparison of different solutions. We describe testing results of the two proposed methods, using TCP throughput and connection RTT as main parameters, over a testbed consisting of HP and Cisco switches connected over long-haul connections emulated in hardware by ANUE devices. Lastly, the combination of analytical and experimental results demonstrates that the dpctl method responds seconds faster than the ODL method on average, even though both methods eventually restore original TCP throughput.
  • Well-controlled experiments that directly compare seasonal algal productivities across geographically distinct locations have not been reported before. To fill this gap, six cultivation testbed facilities were chosen across the United States to evaluate different climatic zones with respect to algal biomass productivity potential. The geographical locations and climates were as follows: Southwest, desert; Western, coastal; Southeast, inland; Southeast, coastal; Pacific, tropical; and Midwest, greenhouse. The testbed facilities were equipped with identical systems for inoculum production and open pond operation, and methods were standardized across all testbeds to ensure accurate measurement of physical and biological variables. The ability of the testbed sites to culture and analyze the same algal species, Nannochloropsis oceanica KA32, using identical pond operational and data collection procedures was evaluated during the same seasonal timeframe. This manuscript describes the results of a first-of-its-kind coordinated testbed validation field study while providing critical details on how geographical variations in temperature, light, and weather variables influenced algal productivity, nitrate consumption, and biomass composition. We found distinct differences in growth characteristics due to the geographic location and the resulting climatic and seasonal conditions across the sites, with the highest productivities observed at the desert Southwest and tropical Pacific regions, followed by the Western coastal region. The lowest productivities were observed at the Southeast inland and Midwest greenhouse locations. These differences in productivities among the sites correlated with the differences in pond water temperature and available solar radiation. In addition, two sites, the tropical Pacific and Southeast inland, experienced unusual events (spontaneous flocculation, and unusually cold and wet (rainfall) conditions, respectively) that negatively affected outdoor algal growth. Minor variability in productivity was also observed between the different experimental treatments at each site, much smaller than the differences due to geographic location. Finally, the successful demonstration of the coordinated and standardized operation of the testbed sites established a rigorous basis for future validation of algal strains and operational conditions and protocols across a geographically diverse testbed network.