Orchestrating Bulk Data Movement in Grid Environments
Data Grids provide a convenient environment for researchers to manage and access massively distributed bulk data by addressing several system and transfer challenges inherent to these environments. This work addresses issues involved in the efficient selection and access of replicated data in Grid environments in the context of the Globus Toolkit{trademark}, building middleware that (1) selects datasets in highly replicated environments, enabling efficient scheduling of data transfer requests; (2) predicts transfer times of bulk wide-area data transfers using extensive statistical analysis; and (3) co-allocates bulk data transfer requests, enabling parallel downloads from mirrored sites. These efforts have demonstrated a decentralized data scheduling architecture, a set of forecasting tools that predict bandwidth availability within 15% error and co-allocation architecture, and heuristics that expedites data downloads by up to 2 times.
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- DE-AC05-00OR22725
- OSTI ID:
- 885937
- Report Number(s):
- ORNL/TM-2004/121; TRN: US200617%%469
- Country of Publication:
- United States
- Language:
- English
Similar Records
Application experiences with the Globus toolkit.
Globus : a metacomputing infrastructure toolkit.