U.S. Department of Energy
Office of Scientific and Technical Information

Active Flash: Out-of-core Data Analytics on Flash Storage

Conference
OSTI ID:1041431

Next-generation science will increasingly rely on efficient, on-the-fly analytics of data generated by high-performance computing (HPC) simulations that model complex physical phenomena. Scientific computing workflows are stymied by the traditional chaining of simulation and data analysis, which creates multiple rounds of redundant reads and writes to the storage system; the cost of this I/O grows with the ever-increasing gap between compute and storage speeds in HPC clusters. Recent HPC acquisitions have introduced compute node-local flash storage as a means to alleviate this I/O bottleneck. We propose a novel approach, Active Flash, to expedite data analysis pipelines by migrating analysis to the location of the data, the flash device itself. We argue that Active Flash has the potential to enable true out-of-core data analytics by freeing up both the compute core and the associated main memory. Performing analysis locally reduces dependence on the limited bandwidth to a central storage system and allows the analysis to proceed in parallel with the main application. In addition, offloading work from the host to the more power-efficient controller reduces peak system power usage, which is already in the megawatt range and poses a major barrier to HPC system scalability. We propose an architecture for Active Flash, explore energy and performance trade-offs in moving computation from host to storage, demonstrate that appropriate embedded controllers can perform data analysis and reduction tasks at speeds sufficient for this application, and present a simulation study of Active Flash scheduling policies. These results show the viability of the Active Flash model and its potential to have a transformative impact on scientific data analysis.

Research Organization:
Oak Ridge National Laboratory (ORNL)
Sponsoring Organization:
SC USDOE - Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1041431
Country of Publication:
United States
Language:
English

Similar Records

Active Flash: Performance-Energy Tradeoffs for Out-of-Core Processing on Non-Volatile Memory Devices
Conference · Sat Dec 31 23:00:00 EST 2011 · OSTI ID:1037136

An Analysis Workflow-Aware Storage System for Multi-Core Active Flash Arrays
Journal Article · Tue Aug 14 00:00:00 EDT 2018 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1493140

Performance Debugging and Tuning of Flash-X with Data Analysis Tools
Conference · Tue Nov 01 00:00:00 EDT 2022 · OSTI ID:2000270