Combining Bayesian Optimization and Lipschitz Optimization

October 10, 2018 · Declared Dead · 🏛 Machine-mediated learning

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Mohamed Osama Ahmed, Sharan Vaswani, Mark Schmidt
arXiv ID: 1810.04336
Category: cs.LG (Machine Learning)
Cross-listed: stat.ML
Citations: 24
Venue: Machine-mediated learning
Last Checked: 4 months ago

Abstract

Bayesian optimization (BO) and Lipschitz optimization have developed alternative techniques for optimizing black-box functions. They each exploit a different form of prior about the function. In this work, we explore strategies to combine these techniques for better global optimization. In particular, we propose ways to use the Lipschitz continuity assumption within traditional BO algorithms, which we call Lipschitz Bayesian optimization (LBO). This approach does not increase the asymptotic runtime and in some cases drastically improves the performance (while in the worst case the performance is similar). Indeed, in a particular setting, we prove that using the Lipschitz information yields the same or a better bound on the regret compared to using Bayesian optimization on its own. Moreover, we propose simple heuristics to estimate the Lipschitz constant, and prove that a growing estimate of the Lipschitz constant is in some sense "harmless". Our experiments on 15 datasets with 4 acquisition functions show that in the worst case LBO performs similarly to the underlying BO method, while in some cases it performs substantially better. Thompson sampling in particular typically saw drastic improvements (as the Lipschitz information corrected for its well-known "over-exploration" phenomenon), and its LBO variant often outperformed other acquisition functions.
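To make the idea of using Lipschitz information inside a BO loop concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`estimate_lipschitz`, `lipschitz_bounds`, `clip_to_bounds`), the choice of the Euclidean norm, and the clipping step are all illustrative assumptions. It estimates a Lipschitz constant as the largest slope seen between any two observations (an estimate that can only grow as data arrives), derives the standard bounds max_i(y_i - L·||x - x_i||) ≤ f(x) ≤ min_i(y_i + L·||x - x_i||), and clips candidate values (for example, a Thompson sample from a surrogate model) to those bounds.

```python
import numpy as np


def estimate_lipschitz(X, y):
    """Largest observed slope |y_i - y_j| / ||x_i - x_j|| over all pairs.

    Hypothetical helper: one simple heuristic, not necessarily the paper's.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    diffs = np.abs(y[:, None] - y[None, :])                          # (n, n)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # (n, n)
    mask = dists > 0  # skip self-pairs and duplicate points
    if not mask.any():
        return 0.0
    return float(np.max(diffs[mask] / dists[mask]))


def lipschitz_bounds(X, y, candidates, L):
    """Lower and upper bounds on f at each candidate implied by constant L."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    dists = np.linalg.norm(candidates[:, None, :] - X[None, :, :], axis=-1)  # (m, n)
    lower = np.max(y[None, :] - L * dists, axis=1)
    upper = np.min(y[None, :] + L * dists, axis=1)
    return lower, upper


def clip_to_bounds(values, lower, upper):
    """Force model-based values (e.g. a posterior sample) into the bounds."""
    return np.clip(np.asarray(values, dtype=float), lower, upper)


# Toy 1-D data: three observations of an unknown function.
X_obs = [[0.0], [1.0], [3.0]]
y_obs = [0.0, 2.0, 3.0]

L_hat = estimate_lipschitz(X_obs, y_obs)
lower, upper = lipschitz_bounds(X_obs, y_obs, [[2.0]], L_hat)
clipped = clip_to_bounds([5.0], lower, upper)
```

In this toy run, a surrogate value of 5.0 at x = 2 exceeds what the estimated Lipschitz constant allows, so it is pulled back to the upper bound. This is one plausible reading of how such bounds might curb over-exploration; the paper itself should be consulted for the variants the authors actually evaluate.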