Analyzing inference workloads for spatiotemporal modeling
Journal Article · Future Generations Computer Systems
- Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Ensuring power grid resiliency, forecasting climate conditions, and optimizing transportation infrastructure are among the many application areas where data is collected in both space and time. Spatiotemporal modeling leverages machine learning and deep learning to capture these patterns, forecast future trends, and support critical decision-making. Once a model is trained offline, deploying it in the field for near real-time inference can be challenging, because performance varies significantly with the environment, the available compute resources, and the tolerance for ambiguity in results. Users deploying spatiotemporal models to solve complex problems can benefit from analytical studies that consider a plethora of system adaptations and expose the associated performance-quality trade-offs. To facilitate the co-design of next-generation hardware architectures for field deployment of trained models, it is critical to characterize the workloads of these deep learning (DL) applications during inference and to assess their computational patterns at different levels of the execution stack. In this paper, we develop several variants of deep learning applications that use spatiotemporal data from dynamical systems. We study the associated computational patterns for inference workloads at different levels, considering relevant models (Long Short-Term Memory, Convolutional Neural Network, and Spatio-Temporal Graph Convolutional Network), DL frameworks (TensorFlow and PyTorch), numerical precisions (FP16, FP32, AMP, INT16, and INT8), inference runtimes (ONNX and AITemplate), post-training quantization (TensorRT), and platforms (NVIDIA DGX A100 and SambaNova SN10 RDU). Overall, our findings indicate that although mixed-precision models and post-training quantization hold promise for spatiotemporal modeling, extracting efficiency from contemporary GPU systems can be challenging. Instead, co-designing custom accelerators with optimized high-level synthesis frameworks (such as the SODA High-Level Synthesizer for customized FPGA/ASIC targets) allows workload-specific adjustments that enhance efficiency.
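As a concrete illustration of the kinds of inference-workload variants the abstract enumerates, the sketch below (not taken from the paper; model width, window length, and file names are illustrative assumptions) runs a small PyTorch LSTM forecaster in FP32 and under automatic mixed precision (AMP), exports it to ONNX for a cross-framework runtime, and applies post-training quantization to INT8 weights. The paper evaluates TensorRT for post-training quantization; onnxruntime's dynamic quantizer is used here only as one accessible alternative.

```python
# Minimal sketch of the inference-workload variants described in the abstract.
# Sizes and file names are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """Predict the next time step from a window of multivariate observations."""
    def __init__(self, n_features: int = 8, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)            # (batch, time, hidden)
        return self.head(out[:, -1, :])  # forecast for the next step

model = LSTMForecaster().eval()
window = torch.randn(32, 24, 8)  # 32 sequences, 24 time steps, 8 sensors

with torch.no_grad():
    fp32_pred = model(window)  # FP32 baseline inference
    # AMP variant: eligible ops run in reduced precision, the rest in FP32.
    # (The paper runs AMP on an NVIDIA A100 GPU; CPU/bfloat16 is used here
    # so the sketch runs anywhere.)
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        amp_pred = model(window)

# Export to ONNX so a cross-framework runtime (e.g., ONNX Runtime) can serve it.
torch.onnx.export(model, window, "lstm_forecaster.onnx",
                  input_names=["window"], output_names=["forecast"],
                  dynamic_axes={"window": {0: "batch"}})

# Post-training quantization of the exported graph to INT8 weights.
# (The paper uses TensorRT; onnxruntime's dynamic quantizer is shown instead.)
from onnxruntime.quantization import quantize_dynamic, QuantType
quantize_dynamic("lstm_forecaster.onnx", "lstm_forecaster.int8.onnx",
                 weight_type=QuantType.QInt8)
```

Comparing fp32_pred, amp_pred, and the quantized model's outputs on held-out windows is one way to quantify the performance-quality trade-offs the abstract refers to.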
- Research Organization:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- AC05-76RL01830
- OSTI ID:
- 2513464
- Report Number(s):
- PNNL-SA-187612
- Journal Information:
- Future Generations Computer Systems, Vol. 163; ISSN 0167-739X
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Similar Records
Analyzing Deep Learning Model Inferences for Image Classification using OpenVINO
Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing · Conference · December 31, 2019 · OSTI ID: 1804060
Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing · Conference · July 3, 2017 · OSTI ID: 1373860
Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing · Journal Article · May 4, 2018 · Future Generations Computer Systems · OSTI ID: 1617450