GridFTP pipelining.
GridFTP is an exceptionally fast transfer protocol for large volumes of data. Implementations of it are widely deployed and used on well-connected Grid environments such as those of the TeraGrid because of its ability to scale to network speeds. However, when the data is partitioned into many small files instead of a few large files, GridFTP achieves lower transfer rates. The latency between the serialized transfer requests for each file reduces the amount of time the data pathways are active, thus lowering achieved throughput. Further, when a data pathway is inactive, the TCP window closes, and TCP must go through the slow-start algorithm again. The performance penalty can be severe. This situation is known as the 'lots of small files' problem. In this paper we introduce a solution to this problem. This solution, called pipelining, allows many transfer requests to be sent to the server before any one of them completes. Thus, pipelining hides the latency of each transfer request by sending the requests while a data transfer is in progress. We present an implementation and performance study of the pipelining solution.
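The latency argument above can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the paper: the function names, the parameters, and the cost formulas are assumptions made for illustration. It charges one round trip per file in the serialized case and a single round trip in the pipelined case, and it ignores TCP slow-start, which would make the serialized case worse still.

```python
def serialized_time(num_files, file_bytes, bandwidth_bytes_per_s, rtt_s):
    """Toy model (assumption): each request is sent only after the previous
    transfer completes, so every file pays one round trip of idle time."""
    return num_files * (rtt_s + file_bytes / bandwidth_bytes_per_s)


def pipelined_time(num_files, file_bytes, bandwidth_bytes_per_s, rtt_s):
    """Toy model (assumption): requests are sent while data is still flowing,
    so only the first request's round trip is exposed."""
    return rtt_s + num_files * file_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Hypothetical workload: 1000 files of 1 MB over a 1 Gbit/s link
    # (125,000,000 bytes/s) with a 50 ms round-trip time.
    args = (1000, 1_000_000, 125_000_000, 0.05)
    print(f"serialized: {serialized_time(*args):.2f} s")
    print(f"pipelined:  {pipelined_time(*args):.2f} s")
```

Under these assumed numbers, the serialized transfer spends most of its time waiting on request latency (roughly 58 s in total), while the pipelined transfer is dominated by the data movement itself (roughly 8 s).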
- Research Organization: Argonne National Laboratory (ANL)
- Sponsoring Organization: SC; SciDAC-2 CEDPS
- DOE Contract Number: AC02-06CH11357
- OSTI ID: 971459
- Report Number(s): ANL/MCS/CP-58820
- Country of Publication: United States
- Language: English