Optimal file-bundle caching algorithms for data-grids
Conference
·
OSTI ID:824286
- LBNL Library
The file-bundle caching problem arises frequently in scientific applications where jobs need to process several files simultaneously. Consider a host system in a data-grid that maintains a staging disk, or disk cache, for servicing jobs, each of which requests a set of files. In this environment, a job can be serviced only if all the files it requests are present in the disk cache. Files must therefore be admitted into the cache or replaced as file-bundles, i.e., sets of files that must all be processed simultaneously. In this paper we show that traditional caching algorithms based on file popularity measures do not perform well in such caching environments, since they are not sensitive to inter-file dependencies and may retain irrelevant combinations of files in the cache. We present and analyze a new caching algorithm for maximizing the throughput of jobs and minimizing data replacement costs for such data-grid hosts. We tested the new algorithm using a disk cache simulation model under a wide range of conditions, varying the file request distribution, relative cache size, and file size distribution, among others. In all these cases, the results show significant improvement compared with traditional caching algorithms.
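The servicing rule described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not specify; it is a plain per-file LRU baseline, assumed here only to show how a file-level policy can keep a combination of files that serves no pending bundle. The class name `BundleCache`, the `request` method, and the unit file sizes are all hypothetical.

```python
from collections import OrderedDict


class BundleCache:
    """Toy disk cache: a job is a hit only if every file in its bundle is cached.

    Replacement is ordinary per-file LRU (an assumed baseline, not the
    algorithm proposed in the paper).
    """

    def __init__(self, capacity):
        self.capacity = capacity      # total space available
        self.used = 0                 # space currently occupied
        self.files = OrderedDict()    # file name -> size, least recent first

    def request(self, bundle):
        """bundle maps file name -> size. Returns True on a full-bundle hit."""
        if sum(bundle.values()) > self.capacity:
            raise ValueError("bundle is larger than the cache")

        # The job can be serviced from cache only if all its files are present.
        hit = all(name in self.files for name in bundle)

        # Mark the files that are already cached as most recently used.
        for name in bundle:
            if name in self.files:
                self.files.move_to_end(name)

        # Load missing files, evicting least recently used files that are
        # not part of the bundle being serviced.
        for name, size in bundle.items():
            if name in self.files:
                continue
            while self.used + size > self.capacity:
                victim = next(n for n in self.files if n not in bundle)
                self.used -= self.files.pop(victim)
            self.files[name] = size
            self.used += size
        return hit


if __name__ == "__main__":
    cache = BundleCache(capacity=3)
    jobs = [{"a": 1, "b": 1}, {"a": 1, "b": 1}, {"c": 1, "d": 1}, {"a": 1, "b": 1}]
    for job in jobs:
        print(sorted(job), "hit" if cache.request(job) else "miss")
```

In this run the third job evicts only `a`, so `b` stays cached even though it is useless without `a`, and the fourth job misses. This is the kind of irrelevant file combination the abstract attributes to policies that ignore inter-file dependencies.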
- Research Organization:
- Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
- Sponsoring Organization:
- USDOE Director. Office of Science. Office of Advanced Scientific Computing Research, Office of Laboratory Policy and Infrastructure Management (US)
- DOE Contract Number:
- AC03-76SF00098
- OSTI ID:
- 824286
- Report Number(s):
- LBNL--54881
- Country of Publication:
- United States
- Language:
- English
Similar Records
File caching in data intensive scientific applications
Conference
·
Sun Jul 18 00:00:00 EDT 2004
·
OSTI ID:882745
Efficient algorithms for multi-file caching
Conference
·
Sun Mar 14 23:00:00 EST 2004
·
OSTI ID:824285
Impact of admission and cache replacement policies on response times of jobs on data grids
Conference
·
Mon Apr 21 00:00:00 EDT 2003
·
OSTI ID:813393