A Review of Off-Policy Evaluation in Reinforcement Learning
December 13, 2022 · The Cartographer · arXiv.org
"No code URL or promise found in abstract"
"Title-pattern auto-detect: A Review of Off-Policy Evaluation in Reinforcement Learning"
Evidence collected by the PWNC Scanner
Authors
Masatoshi Uehara, Chengchun Shi, Nathan Kallus
arXiv ID
2212.06355
Category
stat.ML: Machine Learning (Stat)
Cross-listed
cs.LG, math.ST, stat.ME
Citations
107
Venue
arXiv.org
Last Checked
1 day ago
Abstract
Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems. In this paper, we primarily focus on off-policy evaluation (OPE), one of the most fundamental topics in RL. In recent years, a number of OPE methods have been developed in the statistics and computer science literature. We provide a discussion on the efficiency bound of OPE, some of the existing state-of-the-art OPE methods, their statistical properties and some other related research directions that are currently actively explored.
Similar Papers
In the same crypt: Machine Learning (Stat)
Transcended
Transcended
Layer Normalization
Transcended
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
R.I.P.
Ghosted
Variational Inference with Normalizing Flows
The Cartographer
Towards A Rigorous Science of Interpretable Machine Learning
R.I.P.
Ghosted