Scaling SQL to the Supercomputer for Interactive Analysis of Simulation Data
- ORNL
- Voltron Data
AI and simulation workloads consume and generate large amounts of data that need to be searched, transformed, and merged with other data. With the goal of treating data as a first-class citizen inside a traditionally compute-centric HPC environment, we explore how accelerators and high-speed interconnects can speed up tasks that would otherwise be bottlenecks in computational discovery workflows. BlazingSQL is a SQL engine that runs natively on NVIDIA GPUs and supports internode communication for fast analytics on terabyte-scale tabular data sets. We show how a fast interconnect improves query performance when leveraged through the Unified Communication X (UCX) middleware. We envision that future computing platforms will integrate accelerated database query capabilities for immediate and interactive analysis of large simulation data.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1856710
- Resource Relation:
- Journal Volume: 1512; Conference: Smoky Mountains Computational Sciences and Engineering Conference (SMC) - Kingsport, Tennessee, United States of America - 10/18/2021-10/20/2021
- Country of Publication:
- United States
- Language:
- English