R.I.P.
👻
Ghosted
Gradient Informed Proximal Policy Optimization
December 14, 2023 · Entered Twilight · Neural Information Processing Systems
Repo contents: .gitignore, README.md, config, envs, run_func_optim.sh, setup.py, src, train.py
Authors
Sanghyun Son, Laura Yu Zheng, Ryan Sullivan, Yi-Ling Qiao, Ming C. Lin
arXiv ID
2312.08710
Category
cs.LG: Machine Learning
Cross-listed
cs.AI
Citations
16
Venue
Neural Information Processing Systems
Repository
https://github.com/SonSang/gippo
⭐ 27
Last Checked
2 months ago
Abstract
We introduce a novel policy learning method that integrates analytical gradients from differentiable environments with the Proximal Policy Optimization (PPO) algorithm. To incorporate analytical gradients into the PPO framework, we introduce the concept of an α-policy that stands as a locally superior policy. By adaptively modifying the α value, we can effectively manage the influence of analytical policy gradients during learning. To this end, we suggest metrics for assessing the variance and bias of analytical gradients, reducing dependence on these gradients when high variance or bias is detected. Our proposed approach outperforms baseline algorithms in various scenarios, such as function optimization, physics simulations, and traffic control environments. Our code can be found online: https://github.com/SonSang/gippo.
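The abstract builds on two ideas: PPO's clipped surrogate objective, and a scalar α that is reduced when analytical gradients look unreliable. The sketch below is illustrative only and is not taken from the paper or its repository. `ppo_clip_objective` is the standard PPO clipped surrogate. `adapt_alpha` is a hypothetical stand-in for the idea of backing off under high variance: its name, thresholds, and update factors are assumptions, and the paper's actual variance and bias metrics are not given in the abstract.

```python
from statistics import mean, pstdev


def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, averaged over samples (to be maximized).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # Clip the ratio to [1 - eps, 1 + eps], then take the pessimistic term.
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(r * a, clipped_r * a))
    return mean(terms)


def adapt_alpha(alpha, grad_samples, max_rel_std=1.0,
                shrink=0.5, grow=1.1, alpha_max=1.0):
    """Hypothetical rule (NOT the paper's metric): shrink alpha when the
    per-sample analytical gradients disagree, otherwise grow it slowly.

    grad_samples: scalar gradient estimates from independent rollouts
    """
    # Relative spread of the gradient samples; the small constant avoids
    # division by zero when the mean gradient vanishes.
    rel_std = pstdev(grad_samples) / (abs(mean(grad_samples)) + 1e-8)
    if rel_std > max_rel_std:
        return alpha * shrink
    return min(alpha * grow, alpha_max)


if __name__ == "__main__":
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0]))
    print(adapt_alpha(0.4, [1.0, 1.0]))   # consistent gradients: alpha grows
    print(adapt_alpha(0.4, [1.0, -1.0]))  # conflicting gradients: alpha shrinks
```

In the first call, both samples are clipped: the ratio 1.5 is capped at 1.2 for the positive advantage, and the ratio 0.5 is raised to 0.8 for the negative advantage, so the objective does not reward moving further from the old policy.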
Similar Papers
In the same crypt: Machine Learning
R.I.P.
👻
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
👻
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
👻
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
👻
Ghosted
Proximal Policy Optimization Algorithms