Non-Stationary Policy Learning for Multi-Timescale Multi-Agent Reinforcement Learning
In multi-timescale multi-agent reinforcement learning (MARL), agents interact across different timescales. In general, policies for time-dependent behaviors, such as those induced by multiple timescales, are non-stationary. Learning non-stationary policies is challenging and typically requires sophisticated or inefficient algorithms. Motivated by the prevalence of this control problem in real-world complex systems, we introduce a simple framework for learning non-stationary policies for multi-timescale MARL. Our approach uses available information about agent timescales to define and learn periodic multi-agent policies. Specifically, we show theoretically that the effects of non-stationarity introduced by multiple timescales can be learned by a periodic multi-agent policy. To learn such policies, we propose a policy gradient algorithm that parameterizes the actor and critic with phase-functioned neural networks, which provide an inductive bias for periodicity. We validate the framework's ability to learn multi-timescale policies on a gridworld environment and a building energy management environment.
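To make the idea of a phase-functioned, periodic policy concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single linear layer with a softmax output, a known integer `period`, and linear (rather than spline) interpolation between a small set of control weights. All function names, sizes, and the interpolation scheme are hypothetical choices made for illustration.

```python
import math
import random


def make_control_weights(num_control, n_in, n_out, seed=0):
    """Create `num_control` weight matrices (n_out x n_in) and bias vectors.

    Each set acts as a control point placed evenly around the phase cycle.
    """
    rng = random.Random(seed)
    weights = [
        [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        for _ in range(num_control)
    ]
    biases = [[0.0] * n_out for _ in range(num_control)]
    return weights, biases


def blend_weights(weights, biases, phase):
    """Cyclically interpolate the control weights at `phase` in [0, 1)."""
    k = len(weights)
    pos = (phase % 1.0) * k
    i0 = int(pos) % k          # control point at or before the phase
    i1 = (i0 + 1) % k          # next control point, wrapping around
    a = pos - int(pos)         # interpolation coefficient in [0, 1)
    w = [
        [(1 - a) * x0 + a * x1 for x0, x1 in zip(r0, r1)]
        for r0, r1 in zip(weights[i0], weights[i1])
    ]
    b = [(1 - a) * x0 + a * x1 for x0, x1 in zip(biases[i0], biases[i1])]
    return w, b


def periodic_policy(obs, t, period, weights, biases):
    """Return action probabilities for observation `obs` at time step `t`."""
    phase = (t % period) / period  # assumed mapping from time step to phase
    w, b = blend_weights(weights, biases, phase)
    logits = [sum(wi * oi for wi, oi in zip(row, obs)) + bi
              for row, bi in zip(w, b)]
    m = max(logits)                # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


weights, biases = make_control_weights(num_control=4, n_in=3, n_out=2)
obs = [1.0, 0.5, -0.2]
p0 = periodic_policy(obs, t=0, period=12, weights=weights, biases=biases)
p12 = periodic_policy(obs, t=12, period=12, weights=weights, biases=biases)
```

Because the weights depend on time only through the phase, the policy is periodic by construction: here `p0` and `p12` are identical, while intermediate time steps can yield different action probabilities for the same observation. In a policy gradient setting, the control weights would be the trainable parameters of both actor and critic.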
- Research Organization: National Renewable Energy Laboratory (NREL), Golden, CO (United States)
- Sponsoring Organization: USDOE National Renewable Energy Laboratory (NREL), Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Energy Efficiency and Renewable Energy (EERE)
- DOE Contract Number: AC36-08GO28308
- OSTI ID: 2319191
- Report Number(s): NREL/CP-2C00-83437; MainId:84210; UUID:7df43834-a844-4067-9d35-bef0f5435ee8; MainAdminId:71345
- Country of Publication: United States
- Language: English
Similar Records
PowerGridworld: A Framework for Multi-Agent Reinforcement Learning in Power Systems
Conference · Tue Jun 28 00:00:00 EDT 2022 · OSTI ID:1881415
PowerNet: Multi-agent Deep Reinforcement Learning for Scalable Powergrid Control
Journal Article · Fri Jul 29 20:00:00 EDT 2022 · IEEE Transactions on Power Systems · OSTI ID:1877584
Distributed Power Allocation for 6-GHz Unlicensed Spectrum Sharing via Multi-agent Deep Reinforcement Learning
Conference · Wed Apr 05 00:00:00 EDT 2023 · OSTI ID:1975104