Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models
February 19, 2016 ยท Declared Dead ยท ๐ Annual Conference Computational Learning Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Bernardo รvila Pires, Csaba Szepesvรกri
arXiv ID
1602.06346
Category
stat.ML: Machine Learning (Stat)
Cross-listed
cs.LG
Citations
23
Venue
Annual Conference Computational Learning Theory
Last Checked
3 months ago
Abstract
In this paper we study a model-based approach to calculating approximately optimal policies in Markovian Decision Processes. In particular, we derive novel bounds on the loss of using a policy derived from a factored linear model, a class of models which generalize numerous previous models out of those that come with strong computational guarantees. For the first time in the literature, we derive performance bounds for model-based techniques where the model inaccuracy is measured in weighted norms. Moreover, our bounds show a decreased sensitivity to the discount factor and, unlike similar bounds derived for other approaches, they are insensitive to measure mismatch. Similarly to previous works, our proofs are also based on contraction arguments, but with the main differences that we use carefully constructed norms building on Banach lattices, and the contraction property is only assumed for operators acting on "compressed" spaces, thus weakening previous assumptions, while strengthening previous results.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning (Stat)
๐ฎ
๐ฎ
The Ethereal
๐ฎ
๐ฎ
The Ethereal
Layer Normalization
๐ฎ
๐ฎ
The Ethereal
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
R.I.P.
๐ป
Ghosted
Variational Inference with Normalizing Flows
๐
๐
The Cartographer
Towards A Rigorous Science of Interpretable Machine Learning
R.I.P.
๐ป
Ghosted
Optimization Methods for Large-Scale Machine Learning
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted