Safe Option-Critic: Learning Safety in the Option-Critic Architecture

July 21, 2018 · Declared Dead · 🏛 Knowledge engineering review (Print)

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Arushi Jain, Khimya Khetarpal, Doina Precup
arXiv ID: 1807.08060
Category: cs.AI (Artificial Intelligence)
Citations: 28
Venue: Knowledge engineering review (Print)
Last Checked: 4 months ago
Abstract
Designing hierarchical reinforcement learning algorithms that exhibit safe behaviour is not only vital for practical applications but also facilitates a better understanding of an agent's decisions. We tackle this problem in the options framework, a particular way to specify temporally abstract actions which allow an agent to use sub-policies with start and end conditions. We consider a behaviour safe if it avoids regions of state-space with high uncertainty in the outcomes of actions. We propose an optimization objective that learns safe options by encouraging the agent to visit states with higher behavioural consistency. The proposed objective results in a trade-off between maximizing the standard expected return and minimizing the effect of model uncertainty in the return. We propose a policy gradient algorithm to optimize the constrained objective function. We examine the quantitative and qualitative behaviour of the proposed approach in a tabular grid-world, continuous-state puddle-world, and three games from the Arcade Learning Environment: Ms. Pacman, Amidar, and Q*Bert. Our approach achieves a reduction in the variance of return, boosts performance in environments with intrinsic variability in the reward structure, and compares favorably with both primitive actions and risk-neutral options.
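
The trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's objective: the abstract does not give the formula, so the example below assumes a generic mean-minus-variance score, using the variance of sampled returns as a stand-in for the uncertainty term. The function name `risk_sensitive_score` and the weight `psi` are hypothetical names chosen for illustration.

```python
import statistics


def risk_sensitive_score(returns, psi):
    """Mean of sampled returns minus psi times their population variance.

    Assumption: variance of returns stands in for the uncertainty
    penalty; psi = 0 recovers the risk-neutral (plain mean) score.
    """
    mean_return = statistics.fmean(returns)
    variance = statistics.pvariance(returns)
    return mean_return - psi * variance


# Two hypothetical behaviours: one consistent, one with a higher mean
# return but much larger spread.
steady = [1.0, 1.0, 1.0, 1.0]
risky = [4.0, -2.0, 4.0, -1.0]

for psi in (0.0, 0.1):
    print(psi, risk_sensitive_score(steady, psi), risk_sensitive_score(risky, psi))
```

With `psi = 0` the higher-mean but erratic behaviour scores better; with a positive `psi` the consistent behaviour is preferred, which is the qualitative effect the abstract attributes to its safety term.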