Method and apparatus for offloading compute resources to a flash co-processing appliance
Abstract
Solid-State Drive (SSD) burst buffer nodes are interposed into a parallel supercomputing cluster to enable fast burst checkpoint of cluster memory to or from nearby interconnected solid-state storage, with asynchronous migration between the burst buffer nodes and slower, more distant disk storage. The SSD nodes also perform tasks offloaded from the compute nodes or associated with the checkpoint data. For example, the data for the next job is preloaded in the SSD node and uploaded very quickly to the respective compute node just before the next job starts. During a job, the SSD nodes perform fast visualization and statistical analysis upon the checkpoint data. The SSD nodes can also perform data reduction and encryption of the checkpoint data.
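The checkpoint flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the class name `BurstBufferNode`, its methods, and the use of in-memory dictionaries for the SSD and disk tiers are all assumptions made for illustration, and zlib compression merely stands in for the "data reduction" step. The sketch shows a burst write that returns immediately while a background thread migrates the data to the slower tier.

```python
import queue
import threading
import zlib


class BurstBufferNode:
    """Toy model of one SSD burst buffer node (all names are hypothetical)."""

    def __init__(self):
        self.ssd = {}    # fast nearby tier: checkpoint id -> raw bytes
        self.disk = {}   # slow distant tier: checkpoint id -> reduced bytes
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def checkpoint(self, ckpt_id, memory_image):
        """Absorb a burst write and return without waiting for the disk tier.

        Assumes each checkpoint id is written only once.
        """
        self.ssd[ckpt_id] = bytes(memory_image)
        self._pending.put(ckpt_id)

    def restore(self, ckpt_id):
        """Serve a restart from SSD when possible, else from the disk tier."""
        if ckpt_id in self.ssd:
            return self.ssd[ckpt_id]
        return zlib.decompress(self.disk[ckpt_id])

    def evict(self, ckpt_id):
        """Free SSD space; refused until the checkpoint has been migrated."""
        if ckpt_id not in self.disk:
            raise RuntimeError("checkpoint not yet migrated")
        del self.ssd[ckpt_id]

    def flush(self):
        """Block until all queued migrations have completed."""
        self._pending.join()

    def _drain(self):
        # Background migration: data reduction (here, zlib compression)
        # runs on the buffer node, off the compute nodes' critical path.
        while True:
            ckpt_id = self._pending.get()
            self.disk[ckpt_id] = zlib.compress(self.ssd[ckpt_id])
            self._pending.task_done()


node = BurstBufferNode()
image = bytes(range(256)) * 16       # stand-in for a compute node's memory
node.checkpoint("step-100", image)   # returns at once; migration is queued
node.flush()                         # wait for the asynchronous migration
node.evict("step-100")               # SSD copy is no longer needed
restored = node.restore("step-100")  # now served from the disk tier
```

Separating `checkpoint` from the background `_drain` loop is the point of the design: the compute job is stalled only for the duration of the fast write, while the slow transfer and any reduction work happen on the buffer node.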
- Inventors:
- Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing-bung
- Issue Date:
- Research Org.:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1223101
- Patent Number(s):
- 9158540
- Application Number:
- 13/676,019
- Assignee:
- EMC Corporation (Hopkinton, MA)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- AC52-06NA25396
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 2012 Nov 13
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Tzelnic, Percy, Faibish, Sorin, Gupta, Uday K., Bent, John, Grider, Gary Alan, and Chen, Hsing-bung. Method and apparatus for offloading compute resources to a flash co-processing appliance. United States: N. p., 2015. Web.
Tzelnic, Percy, Faibish, Sorin, Gupta, Uday K., Bent, John, Grider, Gary Alan, & Chen, Hsing-bung. Method and apparatus for offloading compute resources to a flash co-processing appliance. United States.
Tzelnic, Percy, Faibish, Sorin, Gupta, Uday K., Bent, John, Grider, Gary Alan, and Chen, Hsing-bung. "Method and apparatus for offloading compute resources to a flash co-processing appliance". United States. https://www.osti.gov/servlets/purl/1223101.
@article{osti_1223101,
title = {Method and apparatus for offloading compute resources to a flash co-processing appliance},
author = {Tzelnic, Percy and Faibish, Sorin and Gupta, Uday K. and Bent, John and Grider, Gary Alan and Chen, Hsing-bung},
abstractNote = {Solid-State Drive (SSD) burst buffer nodes are interposed into a parallel supercomputing cluster to enable fast burst checkpoint of cluster memory to or from nearby interconnected solid-state storage, with asynchronous migration between the burst buffer nodes and slower, more distant disk storage. The SSD nodes also perform tasks offloaded from the compute nodes or associated with the checkpoint data. For example, the data for the next job is preloaded in the SSD node and uploaded very quickly to the respective compute node just before the next job starts. During a job, the SSD nodes perform fast visualization and statistical analysis upon the checkpoint data. The SSD nodes can also perform data reduction and encryption of the checkpoint data.},
place = {United States},
year = {2015},
month = {10}
}
Works referenced in this record:
Data storage system having separate data transfer section and message network
patent, October 2006
- Ofek, Yuval; Black, David; MacArthur, Stephen D.
- US Patent Document 7,117,275
Distributed maintenance of snapshot copies by a primary processor managing metadata and a secondary processor providing read-write access to a production dataset
patent, March 2010
- Faibish, Sorin; Fridella, Stephen; Gupta, Uday
- US Patent Document 7,676,514
Methods, systems, and computer program products for providing access to shared storage by computing grids and clusters with large numbers of nodes
patent, March 2010
- Compton, James T.; Gupta, Uday; Faibish, Sorin
- US Patent Document 7,676,628
Network file server sharing local caches of file access information in data processors assigned to respective file systems
patent, June 2010
- Vahalia, Uresh K.; Gupta, Uday; Porat, Betti
- US Patent Document 7,739,379
Techniques for using flash-based memory as a write cache and a vault
patent, September 2010
- Gupta, Uday; Hopkins, Charles H.; Evans, Michael B.
- US Patent Document 7,793,061
Pre-allocation and hierarchical mapping of data blocks distributed from a first processor to a second processor for use in a file system
patent, May 2011
- Faibish, Sorin; Fridella, Stephen; Jiang, Xiaoye
- US Patent Document 7,945,726
Efficient read/write algorithms and associated mapping for block-level data reduction processes
patent, March 2012
- Raizen, Helen S.; Bappe, Michael E.; Nikolaevich, Agarkov Vadim
- US Patent Document 8,140,821
Techniques for using flash-based memory in recovery processing
patent, October 2012
- Gupta, Uday; Hopkins, Charles H.; Evans, Michael B.
- US Patent Document 8,296,534
Apparatus For Enhancing Performance Of A Parallel Processing Environment, And Associated Methods
patent-application, July 2010
- Howard, Kevin D.
- US Patent Application 12/750338; 20100185719
PLFS: a checkpoint filesystem for parallel applications
conference, January 2009
- Bent, John; Gibson, Garth; Grider, Gary
A Cost-Effective, High Bandwidth Server I/O network Architecture for Cluster Systems
conference, March 2007
- Chen, Hsing-bung; Grider, Gary; Fields, Parks
- 2007 IEEE International Parallel and Distributed Processing Symposium
PaScal-- A New Parallel and Scalable Server IO Networking Infrastructure for Supporting Global Storage/File Systems in Large-size Linux Clusters
conference, January 2006
- Grider, G.; Nunez, J.
- 2006 IEEE International Performance Computing and Communications Conference
Distributed-and-split data-control extension to SCSI for scalable storage area networks
conference, January 2002
- Birk, Y.; Bishara, N.
- Proceedings 10th Symposium on High Performance Interconnects
Hybrid checkpointing using emerging nonvolatile memories for future exascale systems
journal, July 2011
- Dong, Xiangyu; Xie, Yuan; Muralimanohar, Naveen
- ACM Transactions on Architecture and Code Optimization, Vol. 8, Issue 2
Evaluation of active storage strategies for the lustre parallel file system
conference, January 2007
- Piernas, Juan; Nieplocha, Jarek; Felix, Evan J.
- Proceedings of the 2007 ACM/IEEE conference on Supercomputing - SC '07
Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging?
book, January 2012
- Rajachandrasekar, Raghunath; Ouyang, Xiangyong; Besseron, Xavier
- Euro-Par 2011: Parallel Processing Workshops
Evaluating the benefits of an extended memory hierarchy for parallel streamline algorithms
conference, October 2011
- Camp, David; Childs, Hank; Chourasia, Amit
- 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV)
Managing storage space in a flash and disk hybrid storage system
conference, September 2009
- Wu, Xiaojian; Reddy, A. L. N.
- 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS)
Exploiting Concurrency to Improve Latency and throughput in a Hybrid Storage System
conference, August 2010
- Wu, Xiaojian; Reddy, A. L. Narasimha
- 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)
Incorporating Network RAM and Flash into Fast Backing Store for Clusters
conference, September 2011
- Newhall, Tia; Woos, Douglas
- 2011 IEEE International Conference on Cluster Computing (CLUSTER)
The Conquest file system: Better performance through a disk/persistent-RAM hybrid design
journal, August 2006
- Wang, An-I Andy; Kuenning, Geoff; Reiher, Peter
- ACM Transactions on Storage, Vol. 2, Issue 3
Azor: Using Two-Level Block Selection to Improve SSD-Based I/O Caches
conference, July 2011
- Klonatos, Yannis; Makatos, Thanos; Marazakis, Manolis
- 2011 6th IEEE International Conference on Networking, Architecture, and Storage (NAS)
Using Active NVRAM for Cloud I/O
conference, October 2011
- Kannan, Sudarsun; Milojicic, Dejan; Talwar, Vanish
- 2011 6th Open Cirrus Summit (OCS)
A comprehensive study of energy efficiency and performance of flash-based SSD
journal, April 2011
- Park, Seonyeong; Kim, Youngjae; Urgaonkar, Bhuvan
- Journal of Systems Architecture, Vol. 57, Issue 4, p. 354-365
Making a case for distributed file systems at Exascale
conference, January 2011
- Raicu, Ioan; Foster, Ian T.; Beckman, Pete
- Proceedings of the third international workshop on Large-scale system and application performance - LSAP '11
Jitter-free co-processing on a prototype exascale storage stack
conference, April 2012
- Bent, John; Faibish, Sorin; Ahrens, Jim
- 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)
Verifying Scientific Simulations via Comparative and Quantitative Visualization
journal, November 2010
- Ahrens, James; Heitmann, Katrin; Petersen, Mark
- IEEE Computer Graphics and Applications, Vol. 30, Issue 6
Design issues for a shingled write disk system
conference, May 2010
- Amer, Ahmed; Long, Darrell D. E.; Miller, Ethan L.
- 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)
Managing Variability in the IO Performance of Petascale Storage Systems
conference, November 2010
- Lofstead, Jay; Zheng, Fang; Liu, Qing
- 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS)
conference, January 2008
- Lofstead, Jay F.; Klasky, Scott; Schwan, Karsten
- Proceedings of the 6th international workshop on Challenges of large applications in distributed environments - CLADE '08
Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System
conference, November 2010
- Moody, Adam; Bronevetsky, Greg; Mohror, Kathryn
- 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
GIGA+: scalable directories for shared file systems
conference, January 2007
- Patil, Swapnil V.; Gibson, Garth A.; Lang, Sam
- Proceedings of the 2nd international workshop on Petascale data storage held in conjunction with Supercomputing '07 - PDSW '07
Scalable parallel building blocks for custom data analysis
conference, October 2011
- Peterka, Tom; Ross, Robert; Gyulassy, Attila
- 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV)
The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q
conference, January 2003
- Petrini, Fabrizio; Kerbyson, Darren J.; Pakin, Scott
- Proceedings of the 2003 ACM/IEEE conference on Supercomputing - SC '03
Visualization by Proxy: A Novel Framework for Deferred Interaction with Volume Data
journal, November 2010
- Tikhonova, A.; Correa, C. D.
- IEEE Transactions on Visualization and Computer Graphics, Vol. 16, Issue 6
Toward simulation-time data analysis and I/O acceleration on leadership-class systems
conference, October 2011
- Vishwanath, Venkatram; Hereld, Mark; Papka, Michael E.
- 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV)
In-situ Sampling of a Large-Scale Particle Simulation for Interactive Visualization and Analysis
journal, June 2011
- Woodring, J.; Ahrens, J.; Figg, J.
- Computer Graphics Forum, Vol. 30, Issue 3
Remote Large Data Visualization in the ParaView Framework
January 2006
- Cedilnik, Andy; Geveci, Berk; Moreland, Kenneth
- The Eurographics Association
On the role of burst buffers in leadership-class storage systems
conference, April 2012
- Liu, Ning; Cope, Jason; Carns, Philip
- 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)
DataStager: scalable data staging services for petascale applications
journal, June 2010
- Abbasi, Hasan; Wolf, Matthew; Eisenhauer, Greg
- Cluster Computing, Vol. 13, Issue 3
Modeling a Leadership-Scale Storage System
book, January 2012
- Liu, Ning; Carothers, Christopher; Cope, Jason
- Parallel Processing and Applied Mathematics
Storage challenges at Los Alamos National Lab
conference, April 2012
- Bent, John; Grider, Gary; Kettering, Brett
Scalable I/O forwarding framework for high-performance computing systems
conference, August 2009
- Ali, Nawab; Carns, Philip; Iskra, Kamil
- 2009 IEEE International Conference on Cluster Computing and Workshops
Pageserver: High-Performance SSD-Based Checkpointing of Transactional Distributed Memory
conference, March 2010
- Gerhold, Steffen; Kaemmer, Nico; Weggerle, Alexander
- 2010 Second International Conference on Computer Engineering and Applications
Enhancing Checkpoint Performance with Staging IO and SSD
conference, May 2010
- Ouyang, Xiangyong; Marcarelli, Sonya; Panda, Dhabaleswar K.
- 2010 International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI)