OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Storing files in a parallel computing system based on user-specified parser function

Patent · OSTI ID: 1160235

Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system is stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage, and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprises one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files, and the extracted metadata can be stored with one or more of the plurality of files and used to search for files.
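
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`StorageNode`, `store_files`, `search`, `checkpoint_parser`), the in-memory "storage nodes", the CRC-based placement, and the `CKPT key=value` header format are assumptions made up for illustration. The sketch shows only the three ideas the abstract names: the application supplies a parser, files that fail the parser's semantic check are not stored, and metadata extracted by the parser is stored with the file and used for searching.

```python
import zlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical contract: a parser receives a file (or sub-file) name and its
# bytes, and returns a metadata dict to accept it or None to reject it.
Metadata = Dict[str, str]
Parser = Callable[[str, bytes], Optional[Metadata]]


@dataclass
class StorageNode:
    """Toy in-memory stand-in for one storage node (an assumption)."""
    files: Dict[str, bytes] = field(default_factory=dict)
    metadata: Dict[str, Metadata] = field(default_factory=dict)


def store_files(files: Dict[str, bytes], parser: Parser,
                nodes: List[StorageNode]) -> List[str]:
    """Run the application-supplied parser on each file before storage.

    Only files the parser accepts are stored; the extracted metadata is
    kept next to the file. Returns the names that were stored.
    """
    stored = []
    for name, data in files.items():
        meta = parser(name, data)
        if meta is None:
            continue  # semantic requirement not met: skip this file
        # Illustrative placement policy: deterministic hash of the name.
        node = nodes[zlib.crc32(name.encode("utf-8")) % len(nodes)]
        node.files[name] = data
        node.metadata[name] = meta
        stored.append(name)
    return stored


def search(nodes: List[StorageNode], key: str, value: str) -> List[str]:
    """Find stored files whose extracted metadata has key == value."""
    return sorted(
        name
        for node in nodes
        for name, meta in node.metadata.items()
        if meta.get(key) == value
    )


def checkpoint_parser(name: str, data: bytes) -> Optional[Metadata]:
    """Example parser for a made-up format: first line 'CKPT k=v k=v ...'."""
    header, _, _ = data.partition(b"\n")
    if not header.startswith(b"CKPT "):
        return None  # not a checkpoint in the assumed format
    meta: Metadata = {}
    for token in header[5:].decode("ascii", errors="replace").split():
        key, sep, value = token.partition("=")
        if sep:
            meta[key] = value
    return meta
```

Calling `store_files(files, checkpoint_parser, nodes)` with a mix of well-formed and malformed inputs stores only the well-formed ones, and `search(nodes, "rank", "1")` then looks files up by the metadata the parser extracted. Because the parser is a plain callable, each application can supply its own acceptance rule and metadata schema without changing the storage code.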

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC52-06NA25396
Assignee:
EMC Corporation (Hopkinton, MA)
Patent Number(s):
8,868,576
Application Number:
13/536,369
OSTI ID:
1160235
Resource Relation:
Patent File Date: 2012 Jun 28
Country of Publication:
United States
Language:
English
