skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: /Scratch as a Cache: Rethinking HPC Center Scratch Storage

Conference ·
OSTI ID:1004447
 [1];  [1];  [2]
  1. Virginia Polytechnic Institute and State University (Virginia Tech)
  2. ORNL

To sustain emerging data-intensive scientific applications, High Performance Computing (HPC) centers invest a notable fraction of their operating budget on a specialized fast storage system, scratch space, which is designed for storing the data of currently running and soon-to-run HPC jobs. Instead, it is often used as a standard file system, wherein users arbitrarily store their data, without any consideration to the center's overall performance. To remedy this, centers periodically scan the scratch in an attempt to purge transient and stale data. This practice of supporting a cache workload using a file system and disjoint tools for staging and purging results in suboptimal use of the scratch space. In this paper, we address the above issues by proposing a new perspective, where the HPC scratch space is treated as a cache, and data population, retention, and eviction tools are integrated with scratch management. In our approach, data is moved to the scratch space only when it needed, and unneeded data is removed as soon as possible. We also design a new job-workflow-aware caching policy that leverages user-supplied hints for managing the cache. Our evaluation using three-year job logs from the Jaguar supercomputer, shows that compared to the widely-used purge approach, workflow-aware caching optimizes scratch utilization by reducing the average amount of data read by 9.3%, and by reducing job scheduling delays associated with data staging, on average, by 282.0%.

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1004447
Resource Relation:
Conference: Proceedings of the 23rd ACM Int l Conference on Supercomputing (ICS 09), Yorktown Heights, NY, USA, 20090608, 20090608
Country of Publication:
United States
Language:
English