Graded-Q Reinforcement Learning with Information-Enhanced State Encoder for Hierarchical Collaborative Multi-Vehicle Pursuit

October 24, 2022 · Entered Twilight · 🏛 International Conference on Mobile Ad-hoc and Sensor Networks

Repo contents: .idea, Q_Optimizing_Network.py, README.md, __init__.py, __pycache__, agent, data, env, main.py, mixing_net_model, output, settings.py, smooth.py, train.py

Authors Yiying Yang, Xinhang Li, Zheng Yuan, Qinwen Wang, Chen Xu, Lin Zhang arXiv ID 2210.13470 Category cs.LG: Machine Learning Cross-listed cs.AI Citations 5 Venue International Conference on Mobile Ad-hoc and Sensor Networks Repository https://github.com/ANT-ITS/GQRL-IESE ⭐ 3 Last Checked 3 months ago

Abstract

The multi-vehicle pursuit (MVP), as a problem abstracted from various real-world scenarios, is becoming a hot research topic in Intelligent Transportation System (ITS). The combination of Artificial Intelligence (AI) and connected vehicles has greatly promoted the research development of MVP. However, existing works on MVP pay little attention to the importance of information exchange and cooperation among pursuing vehicles under the complex urban traffic environment. This paper proposed a graded-Q reinforcement learning with information-enhanced state encoder (GQRL-IESE) framework to address this hierarchical collaborative multi-vehicle pursuit (HCMVP) problem. In the GQRL-IESE, a cooperative graded Q scheme is proposed to facilitate the decision-making of pursuing vehicles to improve pursuing efficiency. Each pursuing vehicle further uses a deep Q network (DQN) to make decisions based on its encoded state. A coordinated Q optimizing network adjusts the individual decisions based on the current environment traffic information to obtain the global optimal action set. In addition, an information-enhanced state encoder is designed to extract critical information from multiple perspectives and uses the attention mechanism to assist each pursuing vehicle in effectively determining the target. Extensive experimental results based on SUMO indicate that the total timestep of the proposed GQRL-IESE is less than other methods on average by 47.64%, which demonstrates the excellent pursuing efficiency of the GQRL-IESE. Codes are outsourced in https://github.com/ANT-ITS/GQRL-IESE.