DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Storing files in a parallel computing system based on user-specified parser function

Abstract

Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.
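The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: `StorageNode`, `store_files`, `search`, `checkpoint_parser`, the `CKPT key=value` header format, and placement by CRC32 of the file name are all hypothetical choices made for the example. Each entry in `files` stands in for either a complete file or a sub-file.

```python
import zlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical parser contract: return a metadata dict to accept a file,
# or None to reject it (the file fails the parser's semantic requirements).
Parser = Callable[[str, bytes], Optional[Dict[str, str]]]


@dataclass
class StorageNode:
    """In-memory stand-in for one storage node (assumption for this sketch)."""
    files: Dict[str, bytes] = field(default_factory=dict)
    metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)


def store_files(files: Dict[str, bytes], parser: Parser,
                nodes: List[StorageNode]) -> List[str]:
    """Run the application-supplied parser on each file before storage.

    Only accepted files are stored; extracted metadata is kept beside them.
    Returns the names of the files that were stored, in input order.
    """
    stored = []
    for name, data in files.items():
        meta = parser(name, data)
        if meta is None:
            continue  # rejected by the parser, so never written to a node
        # Deterministic placement by CRC32 of the name (an assumption).
        node = nodes[zlib.crc32(name.encode("utf-8")) % len(nodes)]
        node.files[name] = data
        node.metadata[name] = meta
        stored.append(name)
    return stored


def search(nodes: List[StorageNode], key: str, value: str) -> List[str]:
    """Find stored files whose extracted metadata has key == value."""
    return sorted(
        name
        for node in nodes
        for name, meta in node.metadata.items()
        if meta.get(key) == value
    )


def checkpoint_parser(name: str, data: bytes) -> Optional[Dict[str, str]]:
    """Example user-specified parser for a made-up checkpoint format.

    Accepts only files whose first line looks like b"CKPT step=10 rank=0"
    and returns the key=value pairs from that line as metadata.
    """
    header, _, _ = data.partition(b"\n")
    if not header.startswith(b"CKPT "):
        return None
    text = header[5:].decode("utf-8", errors="replace")
    return dict(item.split("=", 1) for item in text.split() if "=" in item)


if __name__ == "__main__":
    nodes = [StorageNode(), StorageNode()]
    files = {
        "ckpt.0": b"CKPT step=10 rank=0\npayload",
        "ckpt.1": b"CKPT step=10 rank=1\npayload",
        "scratch.tmp": b"not a checkpoint",
    }
    print(store_files(files, checkpoint_parser, nodes))
    print(search(nodes, "rank", "1"))
```

Passing the parser in as a plain callable mirrors the idea that the distributed application, rather than the storage layer, decides which files are semantically valid and which metadata is worth indexing.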

Inventors:
Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron
Issue Date:
2014 Oct
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1160235
Patent Number(s):
8868576
Application Number:
13/536,369
Assignee:
EMC Corporation (Hopkinton, MA)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
AC52-06NA25396
Resource Type:
Patent
Resource Relation:
Patent File Date: 2012 Jun 28
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Faibish, Sorin, Bent, John M, Tzelnic, Percy, Grider, Gary, Manzanares, Adam, and Torres, Aaron. Storing files in a parallel computing system based on user-specified parser function. United States: N. p., 2014. Web.
Faibish, Sorin, Bent, John M, Tzelnic, Percy, Grider, Gary, Manzanares, Adam, & Torres, Aaron. (2014). Storing files in a parallel computing system based on user-specified parser function. United States.
Faibish, Sorin, Bent, John M, Tzelnic, Percy, Grider, Gary, Manzanares, Adam, and Torres, Aaron. 2014. "Storing files in a parallel computing system based on user-specified parser function". United States. https://www.osti.gov/servlets/purl/1160235.
@article{osti_1160235,
title = {Storing files in a parallel computing system based on user-specified parser function},
author = {Faibish, Sorin and Bent, John M and Tzelnic, Percy and Grider, Gary and Manzanares, Adam and Torres, Aaron},
abstractNote = {Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.},
url = {https://www.osti.gov/servlets/purl/1160235},
place = {United States},
year = {2014},
month = {10}
}

Works referenced in this record:

Method and system for a batch parser
patent, May 2010


Systems and methods for managing portions of files in multi-tier storage systems
patent, January 2013


Storage of row-column data
patent-application, February 2003


Multi-Model Access To Data
patent-application, September 2008


Query Execution Systems and Methods
patent-application, December 2011


Data Loading Systems and Methods
patent-application, December 2011


System and Method for Data Stream Processing
patent-application, March 2012


User Defined Functions for Data Loading
patent-application, September 2012


PLFS: a checkpoint filesystem for parallel applications
conference, January 2009


Works referencing / citing this record:

Method and system for data transfer between compute clusters and file system
patent, April 2017


Multi-tier caching
patent, May 2016


Architecture and method for a burst buffer using flash technology
patent, March 2016