Vulcan: A Monte Carlo Algorithm for Large Chance Constrained MDPs with Risk Bounding Functions
September 04, 2018 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Benjamin J Ayton, Brian C Williams
arXiv ID
1809.01220
Category
cs.AI: Artificial Intelligence
Citations
11
Venue
arXiv.org
Last Checked
4 months ago
Abstract
Chance Constrained Markov Decision Processes maximize reward subject to a bounded probability of failure, and have been frequently applied for planning with potentially dangerous outcomes or unknown environments. Solution algorithms have required strong heuristics or have been limited to relatively small problems with up to millions of states, because the optimal action to take from a given state depends on the probability of failure in the rest of the policy, leading to a coupled problem that is difficult to solve. In this paper we examine a generalization of a CCMDP that trades off probability of failure against reward through a functional relationship. We derive a constraint that can be applied to each state history in a policy individually, and which guarantees that the chance constraint will be satisfied. The approach decouples states in the CCMDP, so that large problems can be solved efficiently. We then introduce Vulcan, which uses our constraint in order to apply Monte Carlo Tree Search to CCMDPs. Vulcan can be applied to problems where it is unfeasible to generate the entire state space, and policies must be returned in an anytime manner. We show that Vulcan and its variants run tens to hundreds of times faster than linear programming methods, and over ten times faster than heuristic based methods, all without the need for a heuristic, and returning solutions with a mean suboptimality on the order of a few percent. Finally, we use Vulcan to solve for a chance constrained policy in a CCMDP with over $10^{13}$ states in 3 minutes.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
π
π
The Cartographer
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted
Rainbow: Combining Improvements in Deep Reinforcement Learning
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted