K-Spin Hamiltonian for Quantum-Resolvable Markov Decision Processes
The Markov decision process is the mathematical formalization underlying the modern field of reinforcement learning when transition and reward functions are unknown. We derive a pseudo-Boolean cost function that is equivalent to a K-spin Hamiltonian representation of the discrete, finite, discounted Markov decision process with infinite horizon. This K-spin Hamiltonian furnishes a starting point from which to solve for an optimal policy using heuristic quantum algorithms, such as adiabatic quantum annealing and the quantum approximate optimization algorithm, on near-term quantum hardware. In arguing that the variational minimization of our Hamiltonian is approximately equivalent to the Bellman optimality condition for a prevalent class of environments, we establish an interesting analogy with classical field theory. We corroborate our formulation with proof-of-concept calculations that compare simulated and quantum annealing against classical Q-learning, and we analyze the scaling of the physical resources required to solve our Hamiltonian on quantum hardware.
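As a rough illustration of the idea in the abstract, the sketch below encodes a deterministic policy for a tiny discounted MDP as bits x[s, a] and minimizes a pseudo-Boolean cost over them by brute force, then checks the result against value iteration on the Bellman optimality condition. It is not the K-spin Hamiltonian derived in the paper: the 2-state, 2-action environment, the numbers in `P` and `R`, the discount `GAMMA`, the penalty weight, and the helper names are all assumptions made up for this example.

```python
import itertools
import numpy as np

# Hypothetical 2-state, 2-action MDP (all numbers are illustrative assumptions).
# P[s, a, s2] is a transition probability, R[s, a] an expected reward.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.1, 0.9]],
])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
GAMMA = 0.9
N_STATES, N_ACTIONS = R.shape


def policy_value(policy):
    """Exact state values of a deterministic policy: solve (I - gamma P_pi) V = R_pi."""
    idx = np.arange(N_STATES)
    P_pi = P[idx, policy]  # shape (N_STATES, N_STATES)
    R_pi = R[idx, policy]  # shape (N_STATES,)
    return np.linalg.solve(np.eye(N_STATES) - GAMMA * P_pi, R_pi)


def cost(bits, penalty=100.0):
    """Toy pseudo-Boolean cost: bits x[s, a] in {0, 1} map to a real number.

    Assignments that are not one-hot per state pay a positive penalty;
    valid assignments cost minus the summed state values of the policy.
    """
    x = np.asarray(bits).reshape(N_STATES, N_ACTIONS)
    violation = np.sum((x.sum(axis=1) - 1) ** 2)
    if violation > 0:
        return penalty * violation
    return -policy_value(x.argmax(axis=1)).sum()


# Brute-force minimization over all 2**(N_STATES * N_ACTIONS) bit strings,
# standing in for an annealer on this very small instance.
best_bits = min(itertools.product([0, 1], repeat=N_STATES * N_ACTIONS), key=cost)
best_policy = np.asarray(best_bits).reshape(N_STATES, N_ACTIONS).argmax(axis=1)

# Classical reference: value iteration on the Bellman optimality equation.
V = np.zeros(N_STATES)
for _ in range(1000):
    V = (R + GAMMA * P @ V).max(axis=1)
greedy_policy = (R + GAMMA * P @ V).argmax(axis=1)

print("policy from cost minimization:", best_policy.tolist())
print("policy from value iteration:  ", greedy_policy.tolist())
```

In this toy setting the two policies agree because an optimal policy maximizes the value of every state at once, so it also maximizes their sum. The brute-force search scales exponentially in the number of bits, which is the step a heuristic annealer would replace.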
- Research Organization: National Renewable Energy Laboratory (NREL), Golden, CO (United States)
- Sponsoring Organization: USDOE National Renewable Energy Laboratory (NREL), Laboratory Directed Research and Development (LDRD) Program
- DOE Contract Number: AC36-08GO28308
- OSTI ID: 1885979
- Report Number(s): NREL/JA-2C00-83527; MainId:84300; UUID:a9fde509-222a-4c32-86e8-e8c664189bdf; MainAdminID:65284
- Journal Information: Quantum Machine Intelligence, Vol. 2
- Country of Publication: United States
- Language: English
Similar Records
Mean-Variance Problems for Finite Horizon Semi-Markov Decision Processes
Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity