
Large language model evaluation for high-performance computing software development

Journal Article · Concurrency and Computation: Practice and Experience
DOI: https://doi.org/10.1002/cpe.8269 · OSTI ID: 2474767

We apply the AI-assisted large language model (LLM) capabilities of GPT-3 to high-performance computing (HPC) kernels for (i) code generation and (ii) auto-parallelization of serial code in C++, Fortran, Python, and Julia. Our scope includes the following fundamental numerical kernels: AXPY, GEMV, GEMM, SpMV, Jacobi stencil, and CG, and the following languages/programming models: (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SYCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., NumPy, Numba, CuPy, and PyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl). Kernel implementations are generated using GitHub Copilot capabilities, powered by the GPT-based OpenAI Codex available in Visual Studio Code, given simple <kernel> + <programming model> + <optional hints> prompt variants. To quantify and compare the generated results, we propose a proficiency metric based on the initial 10 suggestions given for each prompt. For auto-parallelization, we use ChatGPT interactively, giving simple prompts as in a dialogue with another human, including simple "prompt engineering" follow-ups. Results suggest that correct outputs for C++ correlate with the adoption and maturity of programming models: for example, OpenMP and CUDA score highly, whereas HIP still lags. We found that prompts in either a targeted language such as Fortran or the more general-purpose Python can benefit from adding language keywords, while Julia prompts perform acceptably well for its Threads and CUDA.jl programming models. Finally, we expect this work to provide an initial quantifiable point of reference for code generation in each programming model using a state-of-the-art LLM. Overall, understanding the convergence of LLMs, AI, and HPC is crucial, given its rapidly evolving nature and how it is redefining human-computer interactions.
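To make the prompt format concrete, the sketch below shows the kind of output the study evaluates: a plausible Copilot-style suggestion for an AXPY prompt targeting C++ with OpenMP, the simplest kernel/model pair in the scope above. The prompt wording, function name, and signature here are illustrative assumptions, not outputs reproduced from the paper.

    // Hypothetical prompt: <AXPY> + <C++> + <OpenMP>  (illustrative, not from the paper)
    #include <cstddef>
    #include <vector>

    // AXPY: y = a*x + y over vectors of equal length.
    void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
        // Each element update is independent, so the loop parallelizes directly.
        #pragma omp parallel for
        for (std::size_t i = 0; i < x.size(); ++i) {
            y[i] += a * x[i];
        }
    }

Compiled with an OpenMP-enabled compiler (e.g., g++ -fopenmp), a suggestion like this would count as a correct output if it compiles and computes y = a*x + y. The paper's proficiency metric, whose exact weighting is not reproduced here, scores each prompt over its first 10 such suggestions.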

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR), Scientific Discovery through Advanced Computing (SciDAC)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
2474767
Journal Information:
Concurrency and Computation: Practice and Experience, Vol. 36, Issue 26; ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
