Title: AnalyzeThis: an analysis workflow-aware storage system

Conference · Proceedings of SC15: The International Conference for High Performance Computing, Networking, Storage and Analysis

The need for novel data analysis is urgent in the face of a data deluge from modern applications. Traditional approaches to data analysis incur significant data movement costs, moving data back and forth between the storage system and the processor. Emerging Active Flash devices enable processing on the flash, where the data already resides. An array of such Active Flash devices allows us to revisit how analysis workflows interact with storage systems. By seamlessly blending together the flash storage and data analysis, we create an analysis workflow-aware storage system, AnalyzeThis. Our guiding principle is that analysis-awareness be deeply ingrained in each and every layer of the storage, elevating data analyses as first-class citizens, and transforming AnalyzeThis into a potent analytics-aware appliance. We implement the AnalyzeThis storage system atop an emulation platform of the Active Flash array. Our results indicate that AnalyzeThis is viable, expediting workflow execution and minimizing data movement.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725; AC02-05CH11231
OSTI ID:
1567399
Journal Information:
Proceedings of SC15: The International Conference for High Performance Computing, Networking, Storage and Analysis, Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, Texas, November 15-20, 2015
Country of Publication:
United States
Language:
English
