Towards Off-policy Evaluation as a Prerequisite for Real-world Reinforcement Learning in Building Control

Chen, B; Jin, M; Wang, Z; Hong, T; Bergés, M

doi:10.1145/3427773.3427871

Conference · November 17, 2020

DOI: https://doi.org/10.1145/3427773.3427871 · OSTI ID: 1783135

We present an initial study of off-policy evaluation (OPE), a problem prerequisite to real-world reinforcement learning (RL), in the context of building control. OPE is the problem of estimating a policy's performance without running it on the actual system, using historical data from the existing controller. It enables the control engineers to ensure a new, pretrained policy satisfies the performance requirements and safety constraints of a real-world system, prior to interacting with it. While many methods have been developed for OPE, no study has evaluated which ones are suitable for building operational data, which are generated by deterministic policies and have limited coverage of the state-action space. After reviewing existing works and their assumptions, we adopted the approximate model (AM) method. Furthermore, we used bootstrapping to quantify uncertainty and correct for bias. In a simulation study, we evaluated the proposed approach on 10 policies pretrained with imitation learning. On average, the AM method estimated the energy and comfort costs with 1.84% and 14.1% error, respectively.
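
The approximate-model idea with bootstrapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear dynamics model, the function names (`fit_linear_model`, `rollout_cost`, `am_bootstrap`), the percentile interval, and the bias-correction form `2 * point - mean(bootstrap)` are all assumptions chosen for illustration. The sketch fits a model to logged transitions from the existing controller, rolls out a candidate policy inside that model to estimate its cost, and refits on resampled data to gauge uncertainty.

```python
import numpy as np

def fit_linear_model(S, A, S_next):
    """Least-squares fit of s' ~ [s, a, 1] @ W (assumed linear dynamics)."""
    X = np.hstack([S, A, np.ones((len(S), 1))])
    W, *_ = np.linalg.lstsq(X, S_next, rcond=None)
    return W

def rollout_cost(W, policy, cost, s0, horizon):
    """Simulate the candidate policy in the fitted model and sum its cost."""
    s, total = np.asarray(s0, dtype=float), 0.0
    for _ in range(horizon):
        a = np.atleast_1d(policy(s))
        total += cost(s, a)
        s = np.concatenate([s, a, [1.0]]) @ W  # one-step model prediction
    return total

def am_bootstrap(S, A, S_next, policy, cost, s0, horizon, n_boot=200, seed=0):
    """Point estimate, bias-corrected estimate, and 95% percentile interval."""
    rng = np.random.default_rng(seed)
    point = rollout_cost(fit_linear_model(S, A, S_next), policy, cost, s0, horizon)
    n = len(S)
    boots = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample transitions with replacement
        W_b = fit_linear_model(S[idx], A[idx], S_next[idx])
        boots[b] = rollout_cost(W_b, policy, cost, s0, horizon)
    corrected = 2.0 * point - boots.mean()  # standard bootstrap bias correction
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return point, corrected, (lo, hi)
```

Note that if the logged actions are an exact linear function of the state, the state and action columns of the regression become collinear and the model cannot separate their effects. This is one concrete form of the limited state-action coverage that the abstract attributes to data from deterministic policies.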

Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Energy Efficiency and Renewable Energy (EERE), Energy Efficiency Office. Building Technologies Office

DOE Contract Number: AC02-05CH11231

OSTI ID: 1783135

Resource Relation: Conference: RLEM 2020 - Proceedings of the 1st International Workshop on Reinforcement Learning for Energy Management in Buildings and Cities

Country of Publication: United States

Language: English

Similar Records

Towards Off-policy Evaluation as a Prerequisite for Real-world Reinforcement Learning in Building Control

Conference · November 17, 2020 · OSTI ID: 1783135

Chen, Bingqing; Jin, Ming; Wang, Zhe; +2 more

Gnu-RL: A Precocial Reinforcement Learning Solution for Building HVAC Control Using a Differentiable MPC Policy

Conference · November 13, 2019 · OSTI ID: 1783135

Chen, Bingqing; Bergés, Mario; Cai, Zicheng

Reinforcement learning building control approach harnessing imitation learning

Journal Article · March 14, 2023 · Energy and AI · OSTI ID: 1783135

Dey, Sourav; Marzullo, Thibault; Zhang, Xiangyu; +1 more
