OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Architecture and method for a burst buffer using flash technology

Abstract

A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, and solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from the respective compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion, and the solid-state storage node asynchronously migrates the checkpoint data from the shared file in the solid-state storage to the magnetic disk storage and writes the checkpoint data to the magnetic disk storage in a sequential fashion.
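
The data path in the abstract (strided writes by many processes into one shared file on the fast tier, then an asynchronous, sequential drain to the slow tier) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the patented implementation: threads stand in for MPI processes, two temporary files stand in for the solid-state and magnetic-disk tiers, and the names `strided_offset`, `write_checkpoint`, `migrate_sequential`, and `run_demo` are hypothetical.

```python
import os
import tempfile
import threading

BLOCK = 4            # bytes per block (tiny, for illustration only)
RANKS = 3            # number of simulated MPI processes
BLOCKS_PER_RANK = 2  # blocks each process contributes to the checkpoint


def strided_offset(rank, i, ranks=RANKS, block=BLOCK):
    """Byte offset of the i-th block of `rank` in the shared file."""
    return (i * ranks + rank) * block


def write_checkpoint(path, rank):
    """One simulated process writes its blocks at strided offsets."""
    payload = bytes([ord("A") + rank]) * BLOCK
    # Unbuffered handle per writer; each writer touches disjoint regions.
    with open(path, "r+b", buffering=0) as f:
        for i in range(BLOCKS_PER_RANK):
            f.seek(strided_offset(rank, i))
            f.write(payload)


def migrate_sequential(src, dst, chunk=8):
    """Copy the shared file front to back, so the slow tier sees one stream."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            data = fin.read(chunk)
            if not data:
                break
            fout.write(data)


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        flash = os.path.join(tmp, "flash_ckpt")  # stand-in for the flash tier
        disk = os.path.join(tmp, "disk_ckpt")    # stand-in for the disk tier

        # Pre-size the shared file so every strided offset already exists.
        with open(flash, "wb") as f:
            f.truncate(RANKS * BLOCKS_PER_RANK * BLOCK)

        writers = [threading.Thread(target=write_checkpoint, args=(flash, r))
                   for r in range(RANKS)]
        for t in writers:
            t.start()
        for t in writers:
            t.join()

        # The drain runs in the background; in a real job the compute
        # processes would resume work here instead of waiting on the join.
        drain = threading.Thread(target=migrate_sequential, args=(flash, disk))
        drain.start()
        drain.join()

        with open(disk, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the blocks of the three writers interleave in the shared file (A, B, C, A, B, C), while the drain reads and writes strictly in order, which is the access pattern that suits rotating disks.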

Inventors:
Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing-bung
Publication Date:
2016-03-15
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1243041
Patent Number(s):
9,286,261
Application Number:
13/676,000
Assignee:
EMC Corporation (Hopkinton, MA); Los Alamos National Security, LLC (Los Alamos, NM); LANL
DOE Contract Number:  
AC52-06NA25396
Resource Type:
Patent
Resource Relation:
Patent File Date: 2012 Nov 13
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Tzelnic, Percy, Faibish, Sorin, Gupta, Uday K., Bent, John, Grider, Gary Alan, and Chen, Hsing-bung. Architecture and method for a burst buffer using flash technology. United States: N. p., 2016. Web.
Tzelnic, Percy, Faibish, Sorin, Gupta, Uday K., Bent, John, Grider, Gary Alan, & Chen, Hsing-bung. Architecture and method for a burst buffer using flash technology. United States.
Tzelnic, Percy, Faibish, Sorin, Gupta, Uday K., Bent, John, Grider, Gary Alan, and Chen, Hsing-bung. 2016. "Architecture and method for a burst buffer using flash technology". United States. https://www.osti.gov/servlets/purl/1243041.
@article{osti_1243041,
title = {Architecture and method for a burst buffer using flash technology},
author = {Tzelnic, Percy and Faibish, Sorin and Gupta, Uday K. and Bent, John and Grider, Gary Alan and Chen, Hsing-bung},
abstractNote = {A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, and solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from the respective compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion, and the solid-state storage node asynchronously migrates the checkpoint data from the shared file in the solid-state storage to the magnetic disk storage and writes the checkpoint data to the magnetic disk storage in a sequential fashion.},
place = {United States},
year = {2016},
month = {3}
}

Works referenced in this record:

DASH-IO: an empirical study of flash-based IO for HPC
conference, January 2010

  • He, Jiahua; Bennett, Jeffrey; Snavely, Allan
  • Proceedings of the 2010 TeraGrid Conference on - TG '10, Article No. 10
  • DOI: 10.1145/1838574.1838584

PLFS: a checkpoint filesystem for parallel applications
conference, January 2009


A comprehensive study of energy efficiency and performance of flash-based SSD
journal, April 2011

  • Park, Seonyeong; Kim, Youngjae; Urgaonkar, Bhuvan
  • Journal of Systems Architecture, Vol. 57, Issue 4, p. 354-365
  • DOI: 10.1016/j.sysarc.2011.01.005

Storage challenges at Los Alamos National Lab
conference, April 2012


A higher order estimate of the optimum checkpoint interval for restart dumps
journal, February 2006