skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: High-performance End-to-End Integrity Verification on Big Data Transfer

Journal Article · · IEICE Transactions on Information and Systems

The scale of scientific data generated by experimental facilities and simulations in high-performance computing facilities has been proliferating with the emergence of IoT-based big data. In many cases, this data must be transmitted rapidly and reliably to remote facilities for storage, analysis, or sharing, for the Internet of Things (IoT) applications. Simultaneously, IoT data can be verified using a checksum after the data has been written to the disk at the destination to ensure its integrity. However, this end-to-end integrity verification inevitably creates overheads (extra disk I/O and more computation). Thus, the overall data transfer time increases. In this article, we evaluate strategies to maximize the overlap between data transfer and checksum computation for astronomical observation data. Specifically, we examine file-level and block-level (with various block sizes) pipelining to overlap data transfer and checksum computation. We analyze these pipelining approaches in the context of GridFTP, a widely used protocol for scientific data transfers. Theoretical analysis and experiments are conducted to evaluate our methods. The results show that block-level pipelining is effective in maximizing the overlap mentioned above, and can improve the overall data transfer time with end-to-end integrity verification by up to 70% compared to the sequential execution of transfer and checksum, and by up to 60% compared to file-level pipelining.

Research Organization:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES); National Science Foundation (NSF); Hongik University
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1573256
Journal Information:
IEICE Transactions on Information and Systems, Vol. E102D, Issue 8
Country of Publication:
United States
Language:
English