Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization
December 25, 2015 Β· Declared Dead Β· π International Conference on Artificial Intelligence and Statistics
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Changyou Chen, David Carlson, Zhe Gan, Chunyuan Li, Lawrence Carin
arXiv ID
1512.07962
Category
stat.ML: Machine Learning (Stat)
Cross-listed
cs.LG
Citations
93
Venue
International Conference on Artificial Intelligence and Statistics
Last Checked
1 month ago
Abstract
Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods; however, this connection is not well studied. We explore this relationship by applying simulated annealing to an SGMCMC algorithm. Furthermore, we extend recent SG-MCMC methods with two key components: i) adaptive preconditioners (as in ADAgrad or RMSprop), and ii) adaptive element-wise momentum weights. The zero-temperature limit gives a novel stochastic optimization method with adaptive element-wise momentum weights, while conventional optimization methods only have a shared, static momentum weight. Under certain assumptions, our theoretical analysis suggests the proposed simulated annealing approach converges close to the global optima. Experiments on several deep neural network models show state-of-the-art results compared to related stochastic optimization algorithms.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Machine Learning (Stat)
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Distilling the Knowledge in a Neural Network
R.I.P.
π»
Ghosted
Layer Normalization
R.I.P.
π»
Ghosted
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
R.I.P.
π»
Ghosted
Domain-Adversarial Training of Neural Networks
R.I.P.
π»
Ghosted
Deep Learning with Differential Privacy
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted