Boosting Trust Region Policy Optimization by Normalizing Flows Policy
September 27, 2018 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yunhao Tang, Shipra Agrawal
arXiv ID
1809.10326
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.LG,
stat.ML
Citations
31
Venue
arXiv.org
Last Checked
4 months ago
Abstract
We propose to improve trust region policy search with normalizing flows policy. We illustrate that when the trust region is constructed by KL divergence constraints, normalizing flows policy generates samples far from the 'center' of the previous policy iterate, which potentially enables better exploration and helps avoid bad local optima. Through extensive comparisons, we show that the normalizing flows policy significantly improves upon baseline architectures especially on high-dimensional tasks with complex dynamics.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
π
π
The Cartographer
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted
Rainbow: Combining Improvements in Deep Reinforcement Learning
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted