Guided Policy Exploration for Markov Decision Processes using an Uncertainty-Based Value-of-Information Criterion

February 05, 2018 · Declared Dead · 🏛 IEEE Transactions on Neural Networks and Learning Systems

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Isaac J. Sledge, Matthew S. Emigh, Jose C. Principe
arXiv ID: 1802.01518
Category: cs.AI (Artificial Intelligence)
Citations: 20
Venue: IEEE Transactions on Neural Networks and Learning Systems
Last Checked: 4 months ago
Abstract
Reinforcement learning in environments with many action-state pairs is challenging. At issue is the number of episodes needed to thoroughly search the policy space. Most conventional heuristics address this search problem in a stochastic manner. This can leave large portions of the policy space unvisited during the early training stages. In this paper, we propose an uncertainty-based, information-theoretic approach for performing guided stochastic searches that more effectively cover the policy space. Our approach is based on the value of information, a criterion that provides the optimal trade-off between expected costs and the granularity of the search process. The value of information yields a stochastic routine for choosing actions during learning that can explore the policy space in a coarse to fine manner. We augment this criterion with a state-transition uncertainty factor, which guides the search process into previously unexplored regions of the policy space.
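The abstract gives no formulas, so the following is only a rough sketch of what a coarse-to-fine stochastic action-selection routine with an uncertainty factor might look like, not the authors' method. It assumes a soft-max style rule in which action probabilities are proportional to a prior times an exponentiated score, with a temperature controlling search granularity. The function names (`voi_style_policy`, `sample_action`), the `bonus_weight` parameter, and the count-based bonus standing in for the state-transition uncertainty factor are all hypothetical.

```python
import math
import random


def voi_style_policy(q_values, prior, visit_counts, temperature, bonus_weight=1.0):
    """Return action probabilities proportional to prior * exp(score / temperature).

    Assumption: this soft-max form is an illustrative stand-in; the paper's
    actual value-of-information update is not given in the abstract.
    """
    # Hypothetical uncertainty bonus: rarely tried actions get a larger score.
    scores = [
        q + bonus_weight / math.sqrt(1 + n)
        for q, n in zip(q_values, visit_counts)
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    weights = [
        p * math.exp((s - top) / temperature)
        for p, s in zip(prior, scores)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(probs, rng=random):
    """Draw one action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Under these assumptions, a large `temperature` keeps the policy close to the prior (a coarse search), while lowering it concentrates probability on the highest-scoring action (a finer search). Calling `sample_action(voi_style_policy(...))` at each step would give the stochastic choice.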