
Title: GridFTP pipelining.

Abstract

GridFTP is an exceptionally fast transfer protocol for large volumes of data. Because it scales to network speeds, its implementations are widely deployed and used in well-connected Grid environments such as the TeraGrid. However, when the data is partitioned into many small files instead of a few large files, transfer rates drop. The latency between the serialized transfer requests for each file directly reduces the amount of time the data pathways are active, lowering achieved throughput. Further, when a data pathway is inactive, the TCP window closes, and TCP must go through the slow-start algorithm again. The performance penalty can be severe. This situation is known as the 'lots of small files' problem. In this paper we introduce a solution to this problem, called pipelining, which allows many transfer requests to be sent to the server before any one completes. Pipelining thus hides the latency of each transfer request by sending the requests while a data transfer is in progress. We present an implementation and performance study of the pipelining solution.
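The latency argument in the abstract can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the paper: the function names, the parameter values, and the simplification that each serialized request costs exactly one round trip are all assumptions made for illustration, and the model ignores the TCP slow-start penalty that the abstract also mentions.

```python
def serialized_time(n_files, file_bytes, rtt_s, bandwidth_bps):
    """Estimated total time when each request waits for the previous transfer.

    Assumption: every file pays one full request round trip (rtt_s)
    during which the data pathway sits idle.
    """
    per_file = rtt_s + (file_bytes * 8) / bandwidth_bps
    return n_files * per_file


def pipelined_time(n_files, file_bytes, rtt_s, bandwidth_bps):
    """Estimated total time when requests are sent ahead of the data.

    Assumption: only the first request's round trip is exposed; later
    requests arrive at the server while earlier data is still flowing.
    """
    return rtt_s + n_files * (file_bytes * 8) / bandwidth_bps


if __name__ == "__main__":
    # Hypothetical workload: 1000 files of 1 MB over a 1 Gbit/s path, 50 ms RTT.
    args = (1000, 1_000_000, 0.05, 1e9)
    print("serialized:", serialized_time(*args), "s")
    print("pipelined: ", pipelined_time(*args), "s")
```

With these hypothetical numbers the serialized estimate is dominated by the 50 ms of idle time per file (about 58 s in total), while the pipelined estimate stays close to the raw data time (about 8 s). In this simplified model the gap grows as files get smaller or the round-trip time gets longer.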

Authors:
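Bresnahan, J.; Link, M.; Kettimuthu, R.; Fraser, D.; Foster, I.; Mathematics and Computer Science; Univ. of Chicago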
Publication Date:
2007
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC); SciDAC-2 CEDPS
OSTI Identifier:
971459
Report Number(s):
ANL/MCS/CP-58820
TRN: US201004%%19
DOE Contract Number:
DE-AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: TeraGrid '07; Jun. 6, 2007 - Jun. 8, 2007; Madison, WI
Country of Publication:
United States
Language:
ENGLISH
Subject:
97 MATHEMATICAL METHODS AND COMPUTING; 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; G CODES; DATA TRANSMISSION; IMPLEMENTATION; PERFORMANCE

Citation Formats

Bresnahan, J., Link, M., Kettimuthu, R., Fraser, D., Foster, I., Mathematics and Computer Science, and Univ. of Chicago. GridFTP pipelining. United States: N. p., 2007. Web.
Bresnahan, J., Link, M., Kettimuthu, R., Fraser, D., Foster, I., Mathematics and Computer Science, & Univ. of Chicago. GridFTP pipelining. United States.
Bresnahan, J., Link, M., Kettimuthu, R., Fraser, D., Foster, I., Mathematics and Computer Science, and Univ. of Chicago. 2007. "GridFTP pipelining." United States.
@article{osti_971459,
title = {GridFTP pipelining.},
author = {Bresnahan, J. and Link, M. and Kettimuthu, R. and Fraser, D. and Foster, I. and {Mathematics and Computer Science} and {Univ. of Chicago}},
place = {United States},
year = {2007}
}

