Implications of stop-and-go traffic on training learning-based car-following control
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Georgia Institute of Technology, Atlanta, GA (United States)
- Univ. of South Florida, Tampa, FL (United States)
Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. While the flexibility and large model capacity of model-free architecture enable LCC to potentially outperform the model-based car-following (CF) model in improving traffic efficiency and mitigating congestion, the generalizability of LCC for traffic conditions different from the training environment/dataset is not well-understood. Herein, this study seeks to explore the impact of stop-and-go traffic in the training dataset on the generalizability of LCC. It uses the characteristics of lead vehicle trajectories to describe stop-and-go traffic, and links the theory of identifiability (i.e., obtaining a unique parameter estimation result using sensor measurements) to the generalizability of behavior cloning (BC) and policy-based deep reinforcement learning (DRL). Correspondingly, the study shows theoretically that: (i) stop-and-go traffic can enable the property of identifiability and enhance the control performance of BC-based LCC in different traffic conditions; (ii) stop-and-go traffic is not necessary for DRL-based LCC to generalize to different traffic conditions; (iii) DRL-based LCC trained with only constant-speed lead vehicle trajectories (not sufficient to ensure identifiability) can be generalized to different traffic conditions; and (iv) stop-and-go traffic increases variance in the training dataset, which improves the convergence of parameter estimation while negatively impacting the convergence of DRL to the optimal control policy. Numerical experiments validate the above findings, illustrating that BC-based LCC entails comprehensive training datasets for generalizing to different traffic conditions, while DRL-based LCC can achieve generalization with simple free-flow traffic training environments. 
This further suggests DRL as a more promising and cost-effective LCC approach to reduce operational costs, mitigate traffic congestion, and enhance safety and mobility, which can accelerate the deployment and acceptance of CAVs.
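The identifiability argument for BC-based LCC can be illustrated with a minimal sketch. It is not the study's model or experimental setup. It assumes a hypothetical linear car-following controller, a = k1·(s − τ·v) + k2·(v_lead − v), with illustrative gains and a sinusoidal lead-speed profile standing in for stop-and-go traffic. Behavior cloning is reduced here to ordinary least squares on the features [s, v, v_lead − v]. Under these assumptions, a constant-speed lead trajectory yields a rank-deficient regressor matrix, so the parameters cannot be estimated uniquely, whereas the oscillating lead trajectory yields full rank and the parameters are recovered.

```python
import numpy as np

# Hypothetical controller parameters (assumptions for illustration only).
K1, K2, TAU, DT = 0.5, 0.8, 1.5, 0.1


def simulate(lead_speed, k1=K1, k2=K2, tau=TAU, dt=DT):
    """Roll out a = k1*(s - tau*v) + k2*(v_lead - v) behind a lead vehicle.

    Returns the regressor matrix with rows [s, v, v_lead - v] and the
    accelerations produced by the controller (the "expert" labels).
    """
    v = lead_speed[0]      # follower starts at the lead vehicle's speed
    s = tau * v            # and at the equilibrium gap
    feats, accs = [], []
    for vl in lead_speed:
        a = k1 * (s - tau * v) + k2 * (vl - v)
        feats.append([s, v, vl - v])
        accs.append(a)
        s += (vl - v) * dt  # forward-Euler update of the gap
        v += a * dt         # forward-Euler update of the follower speed
    return np.array(feats), np.array(accs)


t = np.arange(0.0, 60.0, DT)
const_lead = np.full_like(t, 20.0)             # constant-speed lead vehicle
stop_go_lead = 10.0 + 8.0 * np.sin(0.3 * t)    # oscillating lead, a stand-in for stop-and-go

X_const, y_const = simulate(const_lead)
X_sg, y_sg = simulate(stop_go_lead)

# Rank of the regressor matrix: full column rank (3) is needed for a
# unique least-squares estimate of [k1, -k1*tau, k2].
rank_const = np.linalg.matrix_rank(X_const)
rank_sg = np.linalg.matrix_rank(X_sg)

# Least-squares "behavior cloning" on the oscillating-lead data.
theta_sg, *_ = np.linalg.lstsq(X_sg, y_sg, rcond=None)
true_theta = np.array([K1, -K1 * TAU, K2])

print("rank with constant-speed lead:", rank_const)
print("rank with oscillating lead:   ", rank_sg)
print("estimated parameters:", theta_sg, "true:", true_theta)
```

The data here are noise-free, so the sketch only shows whether a unique estimate exists. It says nothing about the DRL findings or about the convergence effects described in point (iv).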
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- AC05-00OR22725
- OSTI ID:
- 2438666
- Journal Information:
- Transportation Research Part C: Emerging Technologies, Vol. 168; ISSN 0968-090X
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English