
Title: Architecture and method for a burst buffer using flash technology

A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from those compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion. The solid-state storage node then asynchronously migrates the checkpoint data from the shared file to the magnetic disk storage, writing it to the magnetic disk storage in a sequential fashion.
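
The write-then-migrate flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: threads stand in for MPI processes, two temporary files stand in for the solid-state and magnetic disk tiers, and the round-robin block layout is only one possible reading of "strided". All function names, file names, and the block size are hypothetical.

```python
import os
import tempfile
import threading


def strided_offset(rank, block_index, num_ranks, block_size):
    """Byte offset of one block in the shared file (assumed round-robin layout)."""
    return (block_index * num_ranks + rank) * block_size


def write_strided(shared_path, rank, num_ranks, data, block_size):
    """One process writes its checkpoint blocks at strided offsets."""
    with open(shared_path, "r+b") as f:
        for start in range(0, len(data), block_size):
            block_index = start // block_size
            f.seek(strided_offset(rank, block_index, num_ranks, block_size))
            f.write(data[start:start + block_size])


def migrate_sequential(flash_path, disk_path, chunk_size=1 << 20):
    """Copy the shared file front to back, so the disk tier sees sequential writes."""
    with open(flash_path, "rb") as src, open(disk_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)


def run_demo(num_ranks=3, blocks_per_rank=2, block_size=4):
    """Write a checkpoint from several stand-in ranks, migrate it, return the disk bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        flash_path = os.path.join(tmp, "flash_shared.ckpt")  # stand-in for the flash tier
        disk_path = os.path.join(tmp, "disk.ckpt")           # stand-in for the disk tier
        open(flash_path, "wb").close()  # create the empty shared file

        # Threads stand in for MPI processes; rank r writes the letter chr(65 + r).
        writers = []
        for rank in range(num_ranks):
            payload = bytes([65 + rank]) * (blocks_per_rank * block_size)
            writers.append(threading.Thread(
                target=write_strided,
                args=(flash_path, rank, num_ranks, payload, block_size),
            ))
        for t in writers:
            t.start()
        for t in writers:
            t.join()

        # Migration runs on its own thread, decoupled from the writers.
        # The join below exists only so the demo can inspect the result.
        migrator = threading.Thread(target=migrate_sequential,
                                    args=(flash_path, disk_path))
        migrator.start()
        migrator.join()

        with open(disk_path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(run_demo())
```

In a real MPI job the strided writes would more plausibly go through MPI-IO or a checkpoint file system layer rather than plain file seeks. The sketch only shows why the two access patterns differ: writers land at interleaved offsets in the shared file, while the migration step reads and writes it in a single sequential pass.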
Issue Date:
OSTI Identifier:
EMC Corporation (Hopkinton, MA); Los Alamos National Security, LLC (Los Alamos, NM); LANL
Patent Number(s):
Application Number:
Contract Number:
Resource Relation:
Patent File Date: 2012 Nov 13
Research Org: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org:
Country of Publication: United States
