skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Parallel checksumming of data chunks of a shared data object using a log-structured file system

Abstract

Checksum values are generated and used to verify the data integrity. A client executing in a parallel computing system stores a data chunk to a shared data object on a storage node in the parallel computing system. The client determines a checksum value for the data chunk; and provides the checksum value with the data chunk to the storage node that stores the shared object. The data chunk can be stored on the storage node with the corresponding checksum value as part of the shared object. The storage node may be part of a Parallel Log-Structured File System (PLFS), and the client may comprise, for example, a Log-Structured File System client on a compute node or burst buffer. The checksum value can be evaluated when the data chunk is read from the storage node to verify the integrity of the data that is read.

Inventors:
; ;
Publication Date:
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1320885
Patent Number(s):
9,436,722
Application Number:
13/799,264
Assignee:
Los Alamos National Security, LLC (Los Alamos, NM) LANL
DOE Contract Number:
AC52-06NA25396
Resource Type:
Patent
Resource Relation:
Patent File Date: 2013 Mar 13
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Bent, John M., Faibish, Sorin, and Grider, Gary. Parallel checksumming of data chunks of a shared data object using a log-structured file system. United States: N. p., 2016. Web.
Bent, John M., Faibish, Sorin, & Grider, Gary. Parallel checksumming of data chunks of a shared data object using a log-structured file system. United States.
Bent, John M., Faibish, Sorin, and Grider, Gary. 2016. "Parallel checksumming of data chunks of a shared data object using a log-structured file system". United States. doi:. https://www.osti.gov/servlets/purl/1320885.
@article{osti_1320885,
title = {Parallel checksumming of data chunks of a shared data object using a log-structured file system},
author = {Bent, John M. and Faibish, Sorin and Grider, Gary},
abstractNote = {Checksum values are generated and used to verify the data integrity. A client executing in a parallel computing system stores a data chunk to a shared data object on a storage node in the parallel computing system. The client determines a checksum value for the data chunk; and provides the checksum value with the data chunk to the storage node that stores the shared object. The data chunk can be stored on the storage node with the corresponding checksum value as part of the shared object. The storage node may be part of a Parallel Log-Structured File System (PLFS), and the client may comprise, for example, a Log-Structured File System client on a compute node or burst buffer. The checksum value can be evaluated when the data chunk is read from the storage node to verify the integrity of the data that is read.},
doi = {},
journal = {},
number = ,
volume = ,
place = {United States},
year = 2016,
month = 9
}

Patent:

Save / Share:
  • Techniques are provided for parallel compression of data chunks being written to a shared object. A client executing on a compute node or a burst buffer node in a parallel computing system stores a data chunk generated by the parallel computing system to a shared data object on a storage node by compressing the data chunk; and providing the data compressed data chunk to the storage node that stores the shared object. The client and storage node may employ Log-Structured File techniques. The compressed data chunk can be de-compressed by the client when the data chunk is read. A storagemore » node stores a data chunk as part of a shared object by receiving a compressed version of the data chunk from a compute node; and storing the compressed version of the data chunk to the shared data object on the storage node.« less
  • A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern for a plurality of the data portions and generating a singlemore » patterned index entry for the plurality of the patterned data portions; and/or (ii) storing the patterned index entries for a plurality of the sparse files in a single directory, wherein each entry in the single directory comprises an identifier of a corresponding sparse file.« less
  • Collective buffering and data pattern solutions are provided for storage, retrieval, and/or analysis of data in a collective parallel processing environment. For example, a method can be provided for data storage in a collective parallel processing environment. The method comprises receiving data to be written for a plurality of collective processes within a collective parallel processing environment, extracting a data pattern for the data to be written for the plurality of collective processes, generating a representation describing the data pattern, and saving the data and the representation.
  • Interactive requests are processed from users of log-in nodes. A metadata server node is provided for use in a file system shared by one or more interactive nodes and one or more batch nodes. The interactive nodes comprise interactive clients to execute interactive tasks and the batch nodes execute batch jobs for one or more batch clients. The metadata server node comprises a virtual machine monitor; an interactive client proxy to store metadata requests from the interactive clients in an interactive client queue; a batch client proxy to store metadata requests from the batch clients in a batch client queue;more » and a metadata server to store the metadata requests from the interactive client queue and the batch client queue in a metadata queue based on an allocation of resources by the virtual machine monitor. The metadata requests can be prioritized, for example, based on one or more of a predefined policy and predefined rules.« less
  • The present invention provides a data de-noising system utilizing processors and wavelet denoising techniques. Data is read and displayed in different formats. The data is partitioned into regions and the regions are distributed onto the processors. Communication requirements are determined among the processors according to the wavelet denoising technique and the partitioning of the data. The data is transforming onto different multiresolution levels with the wavelet transform according to the wavelet denoising technique, the communication requirements, and the transformed data containing wavelet coefficients. The denoised data is then transformed into its original reading and displaying data format.