
Analyzing inference workloads for spatiotemporal modeling

Journal Article · Future Generations Computer Systems
Ensuring power grid resiliency, forecasting climate conditions, and optimizing transportation infrastructure are a few of the many application areas where data is collected in both space and time. Spatiotemporal modeling captures these patterns, typically with machine learning/deep learning, to forecast future trends and support critical decision-making. Once a model is trained offline, deploying it in the field for near real-time inference can be challenging because performance varies significantly with the environment, the available compute resources, and the tolerance for ambiguity in the results. Users deploying spatiotemporal models to solve complex problems can benefit from analytical studies that consider a broad range of system adaptations and quantify the associated performance-quality trade-offs. To facilitate the co-design of next-generation hardware architectures for field deployment of trained models, it is critical to characterize the workloads of these deep learning (DL) applications during inference and assess their computational patterns at different levels of the execution stack. In this paper, we develop several variants of deep learning applications that use spatiotemporal data from dynamical systems. We study the associated computational patterns for inference workloads at different levels, considering relevant models (Long Short-Term Memory, Convolutional Neural Network, and Spatio-Temporal Graph Convolutional Network), DL frameworks (TensorFlow and PyTorch), precisions (FP16, FP32, AMP, INT16, and INT8), inference runtimes (ONNX Runtime and AITemplate), post-training quantization (TensorRT), and platforms (NVIDIA DGX A100 and SambaNova SN10 RDU). Overall, our findings indicate that although mixed-precision models and post-training quantization show promise for spatiotemporal modeling, extracting efficiency from contemporary GPU systems can be challenging. Instead, co-designing custom accelerators with optimized high-level synthesis frameworks (such as the SODA High-Level Synthesizer for customized FPGA/ASIC targets) allows workload-specific adjustments that improve efficiency.
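To make the kinds of configurations studied here concrete, the following is a minimal sketch in Python (assuming PyTorch with optional CUDA support and the onnxruntime package; the LSTMForecaster model, tensor shapes, and file name are illustrative placeholders, not the paper's actual models or datasets) showing how one spatiotemporal forecaster variant could be run at FP32, exported to ONNX for an inference runtime, and evaluated under mixed-precision (AMP) autocast:

# Minimal sketch: a toy LSTM forecaster run at FP32, exported to ONNX, and
# evaluated under AMP autocast. All names and shapes are illustrative
# assumptions, not taken from the paper.
import torch
import torch.nn as nn
import onnxruntime as ort

class LSTMForecaster(nn.Module):
    """Single-layer LSTM mapping a window of past observations to the next step."""
    def __init__(self, n_features: int = 8, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):                 # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # predict one step ahead

model = LSTMForecaster().eval()
window = torch.randn(32, 12, 8)           # synthetic spatiotemporal batch

# FP32 baseline inference
with torch.no_grad():
    y_fp32 = model(window)

# Export to ONNX so the same model can be benchmarked under an inference runtime
torch.onnx.export(model, window, "lstm_forecaster.onnx",
                  input_names=["window"], output_names=["prediction"],
                  dynamic_axes={"window": {0: "batch"}})

# Run the exported model with ONNX Runtime (CPU provider shown for portability)
sess = ort.InferenceSession("lstm_forecaster.onnx",
                            providers=["CPUExecutionProvider"])
y_onnx = sess.run(None, {"window": window.numpy()})[0]

# Mixed-precision (AMP) inference on GPU, if one is available
if torch.cuda.is_available():
    model_gpu, window_gpu = model.cuda(), window.cuda()
    with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
        y_amp = model_gpu(window_gpu)

Post-training quantization (e.g., with TensorRT) and deployment on platforms such as the NVIDIA DGX A100 or SambaNova SN10 RDU would follow analogous export-and-benchmark steps; the paper characterizes the resulting performance-quality trade-offs across these variants.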
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-76RL01830
OSTI ID:
2513464
Report Number(s):
PNNL-SA-187612
Journal Information:
Future Generations Computer Systems, Vol. 163; ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

