skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Active Flash: Out-of-core Data Analytics on Flash Storage

Conference ·
OSTI ID:1041431

Next generation science will increasingly come to rely on the ability to perform efficient, on-the-fly analytics of data generated by high-performance computing (HPC) simulations, modeling complex physical phenomena. Scientific computing workflows are stymied by the traditional chaining of simulation and data analysis, creating multiple rounds of redundant reads and writes to the storage system, which grows in cost with the ever-increasing gap between compute and storage speeds in HPC clusters. Recent HPC acquisitions have introduced compute node-local flash storage as a means to alleviate this I/O bottleneck. We propose a novel approach, Active Flash, to expedite data analysis pipelines by migrating to the location of the data, the flash device itself. We argue that Active Flash has the potential to enable true out-of-core data analytics by freeing up both the compute core and the associated main memory. By performing analysis locally, dependence on limited bandwidth to a central storage system is reduced, while allowing this analysis to proceed in parallel with the main application. In addition, offloading work from the host to the more power-efficient controller reduces peak system power usage, which is already in the megawatt range and poses a major barrier to HPC system scalability. We propose an architecture for Active Flash, explore energy and performance trade-offs in moving computation from host to storage, demonstrate the ability of appropriate embedded controllers to perform data analysis and reduction tasks at speeds sufficient for this application, and present a simulation study of Active Flash scheduling policies. These results show the viability of the Active Flash model, and its capability to potentially have a transformative impact on scientific data analysis.

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
DE-AC05-00OR22725
OSTI ID:
1041431
Resource Relation:
Conference: Proceedings of the 28th IEEE Conference on Mass Storage Systems and Technologies (MSST 2012), Monterey, CA, USA, 20120416, 20120420
Country of Publication:
United States
Language:
English