A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat
December 05, 2022 Β· Declared Dead Β· π IEEE Transactions on Systems, Man, and Cybernetics: Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Jiajun Chai, Wenzhang Chen, Yuanheng Zhu, Zong-xin Yao, Dongbin Zhao
arXiv ID
2212.03830
Category
cs.AI: Artificial Intelligence
Cross-listed
eess.SY
Citations
61
Venue
IEEE Transactions on Systems, Man, and Cybernetics: Systems
Last Checked
3 months ago
Abstract
Unmanned combat air vehicle (UCAV) combat is a challenging scenario with continuous action space. In this paper, we propose a general hierarchical framework to resolve the within-vision-range (WVR) air-to-air combat problem under 6 dimensions of degree (6-DOF) dynamics. The core idea is to divide the whole decision process into two loops and use reinforcement learning (RL) to solve them separately. The outer loop takes into account the current combat situation and decides the expected macro behavior of the aircraft according to a combat strategy. Then the inner loop tracks the macro behavior with a flight controller by calculating the actual input signals for the aircraft. We design the Markov decision process for both the outer loop strategy and inner loop controller, and train them by proximal policy optimization (PPO) algorithm. For the inner loop controller, we design an effective reward function to accurately track various macro behavior. For the outer loop strategy, we further adopt a fictitious self-play mechanism to improve the combat performance by constantly combating against the historical strategies. Experiment results show that the inner loop controller can achieve better tracking performance than fine-tuned PID controller, and the outer loop strategy can perform complex maneuvers to get higher and higher winning rate, with the generation evolves.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
π
π
The Cartographer
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted
Rainbow: Combining Improvements in Deep Reinforcement Learning
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted