$H_\infty$ Model-free Reinforcement Learning with Robust Stability Guarantee

November 07, 2019 · Entered Twilight · 🏛 arXiv.org

"Last commit was 6.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .DS_Store, ENV, LAC, README.md, disturber, dreamer, envs, figures, h_inf_rl, h_inf_rl_original, logger.py, main.py, my_plottrer.py, pool, robustness_eval.py, variant.py

Authors: Minghao Han, Yuan Tian, Lixian Zhang, Jun Wang, Wei Pan
arXiv ID: 1911.02875
Category: cs.LG (Machine Learning)
Cross-listed: cs.RO, eess.SY
Citations: 28
Venue: arXiv.org
Repository: https://github.com/RobustStabilityGuaranteeRL/RobustStabilityGuaranteeRL (⭐ 11)
Last Checked: 4 months ago

Abstract

Reinforcement learning is showing great potential in robotics applications, including autonomous driving, robot manipulation, and locomotion. However, with complex uncertainties in the real-world environment, it is difficult to theoretically guarantee the successful generalization and sim-to-real transfer of learned policies. In this paper, we introduce and extend the idea of robust stability and $H_\infty$ control to design policies with both stability and robustness guarantees. Specifically, a sample-based approach for analyzing the Lyapunov stability and performance robustness of a learning-based control system is proposed. Based on the theoretical results, a maximum entropy algorithm is developed to search for a Lyapunov function and design a policy with a provable robust stability guarantee. Without any specific domain knowledge, our method can find a policy that is robust to various uncertainties and generalizes well to different test environments. In our experiments, we show that our method achieves better robustness to both large impulsive disturbances and parametric variations in the environment than state-of-the-art results in both robust and generic RL, as well as classic control. Anonymous code is available to reproduce the experimental results at https://github.com/RobustStabilityGuaranteeRL/RobustStabilityGuaranteeRL.
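
To make the idea of a sample-based robustness check concrete, here is a minimal sketch. It is not the paper's algorithm, and the abstract does not state the exact condition the authors use. The sketch assumes a textbook discrete-time dissipation inequality from $H_\infty$ control, $L(x') - L(x) \le \gamma^2 w^2 - c(x)$, and evaluates it on sampled state/disturbance pairs. The scalar dynamics, the quadratic Lyapunov candidate, the cost, and all names (`dissipation_residuals`, `sample_pairs`, `a`, `b`, `p`, `gamma`) are hypothetical and chosen only for illustration.

```python
import random


def dissipation_residuals(samples, a, b, p, gamma):
    """Return L(x') - L(x) + c(x) - gamma^2 * w^2 for each sampled (x, w).

    Toy assumptions (not from the paper): scalar dynamics x' = a*x + b*w,
    Lyapunov candidate L(x) = p*x^2, and stage cost c(x) = x^2.
    A residual <= 0 means the dissipation inequality holds on that sample.
    """
    residuals = []
    for x, w in samples:
        x_next = a * x + b * w
        lyapunov_change = p * x_next**2 - p * x**2
        residuals.append(lyapunov_change + x**2 - gamma**2 * w**2)
    return residuals


def sample_pairs(n, seed=0):
    """Draw n (state, disturbance) pairs from independent standard normals."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


samples = sample_pairs(2000)

# A generous disturbance-attenuation level: the inequality holds on every sample.
ok = dissipation_residuals(samples, a=0.5, b=1.0, p=2.0, gamma=3.0)
print("gamma=3.0 satisfied on all samples:", max(ok) <= 0.0)

# A level that is too tight for this candidate: the average residual is positive.
bad = dissipation_residuals(samples, a=0.5, b=1.0, p=2.0, gamma=0.5)
print("gamma=0.5 violated on average:", sum(bad) / len(bad) > 0.0)
```

In this toy setting the residual is a quadratic form in $(x, w)$, so the result can also be verified by hand. A learning-based method would instead have to estimate such a quantity from data collected under the policy.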