Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond

November 21, 2015 · Declared Dead · 🏛 AAAI Conference on Artificial Intelligence

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Chun Kai Ling, Kian Hsiang Low, Patrick Jaillet arXiv ID 1511.06890 Category stat.ML: Machine Learning (Stat) Cross-listed cs.AI, cs.LG, cs.RO Citations 80 Venue AAAI Conference on Artificial Intelligence Last Checked 2 months ago

Abstract

This paper presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning/sensing and Bayesian optimization criteria and offer practitioners some flexibility to specify their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive epsilon-optimal GPP (epsilon-GPP) policy. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of epsilon-GPP with performance guarantee. We empirically demonstrate the effectiveness of our epsilon-GPP policy and its anytime variant in Bayesian optimization and an energy harvesting task.