Distributing File-Based Data to Remote Sites Within the BABAR Collaboration
BABAR [1] uses two formats for its data: Objectivity database and ROOT [2] files. This poster concerns the distribution of the latter; for Objectivity data see [3]. The BABAR analysis data is stored in ROOT files (one per physics run and analysis selection channel) maintained in a large directory tree. Currently BABAR has more than 4.5 TBytes in 200,000 ROOT files. This data is (mostly) produced at SLAC, but is required for analysis at universities and research centers throughout the US and Europe. Two basic problems confront us when we seek to import bulk data from SLAC to an institute's local storage via the network. We must determine which files must be imported (depending on the local site requirements and which files have already been imported), and we must make the optimum use of the network when transferring the data. Basic ftp-like tools (ftp, scp, etc.) do not attempt to solve the first problem. More sophisticated tools like rsync [4], the widely used mirror/synchronization program, compare local and remote file systems, checking for changes (based on file date, size and, if desired, an elaborate checksum) in order to copy only new or modified files. However, rsync allows for only limited file selection. Also, when an extremely large directory structure must be scanned, as in BABAR, rsync can take several hours just to determine which files need to be copied. Although rsync (and scp) provides on-the-fly compression, it does not allow us to optimize the network transfer by using multiple streams, adjusting the TCP window size, or separating encrypted authentication from unencrypted data channels.
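The first problem described above (deciding which files a site still needs) can be illustrated with a minimal Python sketch. It is not the tool the poster presents. It assumes that a listing of the remote tree is already available as a mapping from path to (size, modification time), so no directory scan is needed at transfer time. The function name `files_to_import`, the listing layout, the file names, and the glob-style selection patterns are all hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List, Tuple

# Hypothetical catalogue entry: (size in bytes, modification time in seconds).
FileInfo = Tuple[int, float]


def files_to_import(
    remote: Dict[str, FileInfo],
    local: Dict[str, FileInfo],
    patterns: Iterable[str],
) -> List[str]:
    """Return remote paths the site wants and does not yet hold unchanged."""
    wanted = list(patterns)
    selected = []
    for path, info in sorted(remote.items()):
        # Site requirement: keep only paths matching a selection pattern.
        if not any(fnmatchcase(path, pattern) for pattern in wanted):
            continue
        # Quick check on size and date: copy if missing locally or different.
        if local.get(path) != info:
            selected.append(path)
    return selected


# Illustrative listings; the run and channel names are made up.
remote_listing = {
    "run0001/tau.root": (100, 1.0),
    "run0001/bbar.root": (200, 1.0),
    "run0002/tau.root": (150, 2.0),
}
local_listing = {"run0001/tau.root": (100, 1.0)}

to_copy = files_to_import(remote_listing, local_listing, ["*/tau.root"])
```

Comparing two pre-built listings in memory avoids walking a very large directory tree on every run, which is the cost the abstract attributes to rsync. The second problem (multiple streams, TCP window size) is not addressed by this sketch.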
- Research Organization: SLAC National Accelerator Lab., Menlo Park, CA (United States)
- Sponsoring Organization: USDOE Office of Energy Research (ER) (US)
- DOE Contract Number: AC03-76SF00515
- OSTI ID: 799060
- Report Number(s): SLAC-PUB-9180; TRN: US0204378
- Resource Relation: Other Information: PBD: 2 May 2002
- Country of Publication: United States
- Language: English