A Review of Off-Policy Evaluation in Reinforcement Learning

December 13, 2022 · The Cartographer · 🏛 arXiv.org

📚 THE CARTOGRAPHER: The Cartographer
Survey/review paper: maps the landscape rather than implementing a method.

"No code URL or promise found in abstract"
"Title-pattern auto-detect: A Review of Off-Policy Evaluation in Reinforcement Learning"

Evidence collected by the PWNC Scanner

Authors: Masatoshi Uehara, Chengchun Shi, Nathan Kallus
arXiv ID: 2212.06355
Category: stat.ML, Machine Learning (Stat)
Cross-listed: cs.LG, math.ST, stat.ME
Citations: 107
Venue: arXiv.org
Last Checked: 1 day ago
Abstract
Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems. In this paper, we primarily focus on off-policy evaluation (OPE), one of the most fundamental topics in RL. In recent years, a number of OPE methods have been developed in the statistics and computer science literature. We provide a discussion on the efficiency bound of OPE, some of the existing state-of-the-art OPE methods, their statistical properties and some other related research directions that are currently actively explored.

📜 Similar Papers

In the same crypt: Machine Learning (Stat)

πŸ›οΈ πŸ›οΈ Transcended

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML · 🏛 arXiv · 📚 12.0K cites · 9 years ago