Parallel Index and Query for Large Scale Data Analysis

Chou, Jerry; Wu, Kesheng; Ruebel, Oliver; Howison, Mark; Qiang, Ji; Prabhat; Austin, Brian; Bethel, E Wes; Ryne, Rob D; Shoshani, Arie

Conference · Jul 18, 2011

OSTI ID:1056552

Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in designing a system for processing general scientific datasets. The system needs to run on distributed multi-core platforms, efficiently utilize the underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that addresses these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process massive datasets on modern supercomputing platforms. We apply FastQuery to a massive 50 TB dataset generated by a large-scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for interesting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.

Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Sponsoring Organization: Computational Research Division

DOE Contract Number: DE-AC02-05CH11231

OSTI ID: 1056552

Report Number(s): LBNL-5317E

Resource Relation: Conference: SC11, Seattle, WA, USA, November 12-18, 2011

Country of Publication: United States

Language: English

Similar Records

FastQuery: A Parallel Indexing System for Scientific Data

Conference · Jul 29, 2011 · OSTI ID:1056552

Chou, Jerry; Wu, Kesheng; Prabhat

Design of FastQuery: How to Generalize Indexing and Querying System for Scientific Data

Technical Report · Apr 18, 2011 · OSTI ID:1056552

Chou, Jerry; Wu, Kesheng

MOSIQS: Persistent Memory Object Storage With Metadata Indexing and Querying for Scientific Computing

Journal Article · Jun 08, 2021 · IEEE Access · OSTI ID:1056552

Khan, Awais; Sim, Hyogi; Vazhkudai, Sudharshan S.; +1 more

Related Subjects

97 MATHEMATICS AND COMPUTING
