U.S. Department of Energy
Office of Scientific and Technical Information

Storage of sparse files using parallel log-structured file system

Patent · OSTI ID: 1407704
A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system and generating an index entry for the data portion. The index entry comprises the logical offset, physical offset, and length of the data portion. The holes can be restored when the sparse file is read. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern across a plurality of the data portions and generating a single patterned index entry for those patterned data portions; and/or (ii) storing the patterned index entries for a plurality of sparse files in a single directory, wherein each entry in the directory comprises an identifier of the corresponding sparse file.
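The indexing scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation or the PLFS API. The names `IndexEntry`, `PatternedEntry`, `SparseLog`, and `detect_pattern` are hypothetical, the model is single-writer and in-memory, and the fixed-stride form of the "write pattern" is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexEntry:
    logical_offset: int   # position of the data in the sparse file
    physical_offset: int  # position of the data in the hole-free log
    length: int           # number of bytes in the data portion


@dataclass
class PatternedEntry:
    """One entry standing in for `count` equally sized, equally spaced writes."""
    logical_start: int
    physical_start: int
    length: int
    stride: int
    count: int


class SparseLog:
    """Hypothetical in-memory model: data is appended to a log, holes are never stored."""

    def __init__(self) -> None:
        self.log = bytearray()
        self.index: List[IndexEntry] = []
        self.logical_size = 0

    def write(self, logical_offset: int, data: bytes) -> None:
        # Append the data to the log and record where it belongs logically.
        self.index.append(IndexEntry(logical_offset, len(self.log), len(data)))
        self.log += data
        self.logical_size = max(self.logical_size, logical_offset + len(data))

    def read_all(self) -> bytes:
        # Start from a zero-filled buffer so that holes are restored on read.
        out = bytearray(self.logical_size)
        for e in self.index:  # later entries overwrite earlier ones
            chunk = self.log[e.physical_offset:e.physical_offset + e.length]
            out[e.logical_offset:e.logical_offset + e.length] = chunk
        return bytes(out)


def detect_pattern(entries: List[IndexEntry]) -> Optional[PatternedEntry]:
    """Return a single patterned entry if the writes are fixed-length and fixed-stride."""
    if len(entries) < 2:
        return None
    first = entries[0]
    stride = entries[1].logical_offset - first.logical_offset
    for i, e in enumerate(entries):
        if (e.length != first.length
                or e.logical_offset != first.logical_offset + i * stride
                or e.physical_offset != first.physical_offset + i * first.length):
            return None
    return PatternedEntry(first.logical_offset, first.physical_offset,
                          first.length, stride, len(entries))


f = SparseLog()
for k, payload in enumerate([b"AAAA", b"BBBB", b"CCCC"]):
    f.write(k * 1000, payload)   # three 4-byte writes, 1000 bytes apart

restored = f.read_all()
pattern = detect_pattern(f.index)
```

In this sketch the log holds only the 12 bytes that were written, while `read_all` returns the full 2004-byte logical file with zero-filled holes. The three index entries reduce to one `PatternedEntry`.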
Research Organization: Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization: USDOE
DOE Contract Number: AC52-06NA25396
Assignee: EMC IP Holding Company LLC
Patent Number(s): 9,811,545
Application Number: 13/921,719
OSTI ID: 1407704
Country of Publication: United States
Language: English
