DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Accelerating shared file checkpoint with local burst buffers

Abstract

A data management system and method for accelerating shared file checkpointing. Written application data is aggregated in an application data file created in a local burst buffer memory at a compute node, and an associated data mapping index is built to maintain information related to the offsets into a shared file at which segments of the application data are to be stored in a parallel file system, and where in the buffer those segments are located. The node asynchronously transfers a data file containing the application data and the associated data mapping index to a file server for shared file storage. The data management system and method further accelerate shared file checkpointing in which a shared file, together with a map file that specifies how the shared file is to be distributed, is asynchronously transferred to local burst buffer memories at the nodes to accelerate reading of the shared file.
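The write path described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `BurstBufferWriter`, `write_at`, `packed_index`, and `drain`, as well as the three-field index record layout, are hypothetical, in-memory `io.BytesIO` objects stand in for the node-local burst buffer file and the shared file, and the asynchronous transfer to the file server is not modeled.

```python
import io
import struct

# Hypothetical index record: (shared_offset, length, buffer_offset),
# stored as three little-endian unsigned 64-bit integers.
RECORD = struct.Struct("<QQQ")


class BurstBufferWriter:
    """Aggregates writes aimed at a shared file into one local, append-only data file."""

    def __init__(self, data_file):
        self.data_file = data_file  # stand-in for a file on the local burst buffer
        self.index = []             # one (shared_offset, length, buffer_offset) per segment

    def write_at(self, shared_offset, payload):
        # Append the segment locally and remember where it belongs in the shared file.
        buffer_offset = self.data_file.seek(0, io.SEEK_END)
        self.data_file.write(payload)
        self.index.append((shared_offset, len(payload), buffer_offset))

    def packed_index(self):
        # Serialize the mapping index so it can travel alongside the data file.
        return b"".join(RECORD.pack(*entry) for entry in self.index)


def drain(data_file, packed_index, shared_file):
    """Replays the index, copying each buffered segment to its offset in the shared file."""
    for shared_offset, length, buffer_offset in RECORD.iter_unpack(packed_index):
        data_file.seek(buffer_offset)
        shared_file.seek(shared_offset)
        shared_file.write(data_file.read(length))
```

In this sketch the application calls `write_at` for each segment in whatever order it produces them, so local writes stay sequential regardless of where the segments land in the shared file. A later call to `drain` with the data file and its packed index places every segment at its recorded shared-file offset.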

Inventors:
Gooding, Thomas; Lemarinier, Pierre; Rosenburg, Bryan S.
Issue Date:
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1892825
Patent Number(s):
11301165
Application Number:
15/963,700
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
B604142
Resource Type:
Patent
Resource Relation:
Patent File Date: 04/26/2018
Country of Publication:
United States
Language:
English

Citation Formats

Gooding, Thomas, Lemarinier, Pierre, and Rosenburg, Bryan S. Accelerating shared file checkpoint with local burst buffers. United States: N. p., 2022. Web.
Gooding, Thomas, Lemarinier, Pierre, & Rosenburg, Bryan S. Accelerating shared file checkpoint with local burst buffers. United States.
Gooding, Thomas, Lemarinier, Pierre, and Rosenburg, Bryan S. Tue . "Accelerating shared file checkpoint with local burst buffers". United States. https://www.osti.gov/servlets/purl/1892825.
@article{osti_1892825,
title = {Accelerating shared file checkpoint with local burst buffers},
author = {Gooding, Thomas and Lemarinier, Pierre and Rosenburg, Bryan S.},
abstractNote = {A data management system and method for accelerating shared file checkpointing. Written application data is aggregated in an application data file created in a local burst buffer memory at a compute node, and an associated data mapping index is built to maintain information related to the offsets into a shared file at which segments of the application data are to be stored in a parallel file system, and where in the buffer those segments are located. The node asynchronously transfers a data file containing the application data and the associated data mapping index to a file server for shared file storage. The data management system and method further accelerate shared file checkpointing in which a shared file, together with a map file that specifies how the shared file is to be distributed, is asynchronously transferred to local burst buffer memories at the nodes to accelerate reading of the shared file.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2022},
month = {4}
}

Works referenced in this record:

PLFS: a checkpoint filesystem for parallel applications
conference, January 2009


A User-Level InfiniBand-Based File System and Checkpoint Strategy for Burst Buffers
conference, May 2014


Architecture and method for a burst buffer using flash technology
patent, March 2016


Method and System For Data Transfer Between Compute Clusters And File System
patent-application, November 2014


Minimizing Micro-Interruptions in High-Performance Computing
patent-application, November 2014


Burst buffer appliance with small file aggregation
patent, March 2015


Integrated in-system storage architecture for high performance computing
conference, June 2012


Centralized Parallel Burst Engine for High Performance Computing
patent-application, May 2015


How Much SSD Is Useful for Resilience in Supercomputers
conference, January 2015


Metadata compression
patent, February 2020


BurstMem: A high-performance burst buffer system for scientific applications
conference, October 2014