Towards provably efficient quantum algorithms for large-scale machine-learning models
- University of Chicago, IL (United States); Chicago Quantum Exchange, IL (United States); qBraid Co., Chicago, IL (United States); SeQure, Chicago, IL (United States)
- University of Chicago, IL (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
- University of California, Berkeley, CA (United States); Massachusetts Institute of Technology (MIT), Cambridge, MA (United States)
- University of Chicago, IL (United States)
- Brandeis University, Waltham, MA (United States)
- University of Chicago, IL (United States); Chicago Quantum Exchange, IL (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
- Free University of Berlin (Germany)
- University of Chicago, IL (United States); Chicago Quantum Exchange, IL (United States)
Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include the huge computational expense, power, and time consumed in both the pre-training and fine-tuning processes. In this work, we show that fault-tolerant quantum computing could provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$, where $n$ is the size of the model and $T$ is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Building on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm of machine learning. In practice, we benchmark instances of large machine learning models ranging from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work solidly shows that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
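For orientation, below is a minimal classical sketch of the regime the abstract describes: (stochastic) gradient descent restricted to a magnitude-pruned, sparse parameter set with a small learning rate and dissipative (well-conditioned, contracting) dynamics. The helper names (`magnitude_prune`, `sparse_sgd`) and the toy quadratic loss are illustrative assumptions, not the paper's implementation; the paper's claim is that a fault-tolerant quantum algorithm can reproduce $T$ such steps in time scaling as $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$ when these conditions hold.

```python
import numpy as np

def magnitude_prune(theta, sparsity):
    """Keep only the largest-magnitude entries of theta (illustrative magnitude pruning)."""
    k = max(1, int((1.0 - sparsity) * theta.size))
    thresh = np.partition(np.abs(theta), -k)[-k]
    return np.abs(theta) >= thresh  # boolean mask of surviving parameters

def sparse_sgd(theta, grad_fn, mask, eta, T):
    """Run T gradient-descent steps, updating only the unpruned coordinates."""
    for _ in range(T):
        theta = theta - eta * (grad_fn(theta) * mask)
    return theta

# Toy dissipative loss L(theta) = 0.5 * theta^T A theta with gradient A @ theta;
# a positive-definite A makes every step contract toward the minimum, which is
# the "sufficiently dissipative" regime the abstract assumes.
rng = np.random.default_rng(0)
n = 1_000
A = np.diag(rng.uniform(0.5, 2.0, size=n))    # well-conditioned, positive definite
theta0 = rng.normal(size=n)
mask = magnitude_prune(theta0, sparsity=0.9)  # prune 90% of parameters
theta_T = sparse_sgd(theta0, lambda t: A @ t, mask, eta=1e-2, T=200)
```

In the hybrid scheme the abstract motivates, only the surviving sparse coordinates (here, the entries selected by `mask`) would need to be downloaded from and re-uploaded to the quantum device between stages, which is why sparsity enters the cost of the proposed algorithm.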
- Research Organization:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Organization:
- David and Lucile Packard Foundation; National Science Foundation (NSF); Simons Foundation; US Air Force Office of Scientific Research (AFOSR); US Army Research Office (ARO); USDOE Office of Science (SC), Basic Energy Sciences (BES), Scientific User Facilities (SUF)
- Grant/Contract Number:
- AC02-06CH11357
- OSTI ID:
- 2469541
- Journal Information:
- Nature Communications, Vol. 15, Issue 1; ISSN 2041-1723
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United States
- Language:
- English