Constrained Deep Reinforcement Learning for Energy Sustainable Multi-UAV Based Random Access IoT Networks With NOMA

Khairy, Sami; Balaprakash, Prasanna; Cai, Lin X.; Cheng, Yu

doi:10.1109/jsac.2020.3018804

Journal Article · August 25, 2020 · IEEE Journal on Selected Areas in Communications

DOI: https://doi.org/10.1109/jsac.2020.3018804 · OSTI ID: 1776838

Khairy, Sami [1]; Balaprakash, Prasanna [2]; Cai, Lin X. [1]; Cheng, Yu [1]

[1] Illinois Institute of Technology, Chicago, IL (United States)
[2] Argonne National Lab. (ANL), Argonne, IL (United States)

In this paper, we apply the Non-Orthogonal Multiple Access (NOMA) technique to improve the massive channel access of a wireless IoT network where solar-powered Unmanned Aerial Vehicles (UAVs) relay data from IoT devices to remote servers. Specifically, IoT devices contend for accessing the shared wireless channel using an adaptive p-persistent slotted Aloha protocol; and the solar-powered UAVs adopt Successive Interference Cancellation (SIC) to decode multiple received data from IoT devices to improve access efficiency. To enable an energy-sustainable capacity-optimal network, we study the joint problem of dynamic multi-UAV altitude control and multi-cell wireless channel access management of IoT devices as a stochastic control problem with multiple energy constraints. We first formulate this problem as a Constrained Markov Decision Process (CMDP), and propose an online model-free Constrained Deep Reinforcement Learning (CDRL) algorithm based on Lagrangian primal-dual policy optimization to solve the CMDP. Extensive simulations demonstrate that our proposed algorithm learns a cooperative policy in which the altitude of UAVs and channel access probability of IoT devices are dynamically controlled to attain the maximal long-term network capacity while ensuring energy sustainability of UAVs, outperforming baseline schemes. The proposed CDRL agent can be trained on a small network, yet the learned policy can efficiently manage networks with a massive number of IoT devices and varying initial states, which can amortize the cost of training the CDRL agent.
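The Lagrangian primal-dual idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' CDRL algorithm: the scalar objective, the single constraint, the step sizes, and the function name `primal_dual_toy` are all assumptions chosen for illustration. The variable `x` stands in for policy parameters and `lam` for the multiplier attached to an energy constraint. The primal step ascends the Lagrangian, and the dual step raises the multiplier while the constraint is violated, projecting it back to non-negative values.

```python
def primal_dual_toy(steps=2000, lr_primal=0.05, lr_dual=0.05):
    """Toy Lagrangian primal-dual iteration (illustrative only).

    Maximizes f(x) = -(x - 2)^2 subject to g(x) = x - 1 <= 0.
    The unconstrained optimum x = 2 violates the constraint, so the
    constrained optimum sits on the boundary at x = 1 with multiplier 2.
    """
    x = 0.0    # primal variable (hypothetical stand-in for policy parameters)
    lam = 0.0  # Lagrange multiplier (hypothetical stand-in for the constraint price)
    for _ in range(steps):
        # Primal ascent on L(x, lam) = f(x) - lam * g(x)
        grad_x = -2.0 * (x - 2.0) - lam
        x += lr_primal * grad_x
        # Dual ascent on the constraint violation, projected onto lam >= 0
        lam = max(0.0, lam + lr_dual * (x - 1.0))
    return x, lam


if __name__ == "__main__":
    x, lam = primal_dual_toy()
    print(f"x = {x:.3f}, lambda = {lam:.3f}")  # approaches x = 1, lambda = 2
```

In a reinforcement learning setting the exact gradients above would be replaced by sampled policy-gradient estimates of the long-term reward and constraint costs, but the alternating primal and dual updates keep the same shape.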

Research Organization: Argonne National Laboratory (ANL), Argonne, IL (United States)

Sponsoring Organization: USDOE Office of Science (SC); National Science Foundation (NSF)

Grant/Contract Number: AC02-06CH11357

OSTI ID: 1776838

Journal Information: IEEE Journal on Selected Areas in Communications, Vol. 39, Issue 4; ISSN 0733-8716

Publisher: IEEE

Country of Publication: United States

Language: English

Cited By (1)

Data-Driven Random Access Optimization in Multi-Cell IoT Networks Using NOMA. Khairy, Sami; Balaprakash, Prasanna; Cai, Lin X. IEEE Transactions on Wireless Communications, Vol. 21, Issue 7 (journal, July 2022). https://doi.org/10.1109/twc.2021.3134949

Related Subjects

42 ENGINEERING
Constrained Deep Reinforcement Learning
Non-Orthogonal Multiple Access
Solar-Powered UAVs
Sustainable IoT Networks
UAV altitude control
p-persistent slotted Aloha
