Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent

December 07, 2020 Β· Declared Dead Β· πŸ› International Conference on Machine Learning

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Kangqiao Liu, Liu Ziyin, Masahito Ueda arXiv ID 2012.03636 Category stat.ML: Machine Learning (Stat) Cross-listed cs.LG Citations 45 Venue International Conference on Machine Learning Last Checked 2 months ago
Abstract
In the vanishing learning rate regime, stochastic gradient descent (SGD) is now relatively well understood. In this work, we propose to study the basic properties of SGD and its variants in the non-vanishing learning rate regime. The focus is on deriving exactly solvable results and discussing their implications. The main contributions of this work are to derive the stationary distribution for discrete-time SGD in a quadratic loss function with and without momentum; in particular, one implication of our result is that the fluctuation caused by discrete-time dynamics takes a distorted shape and is dramatically larger than a continuous-time theory could predict. Examples of applications of the proposed theory considered in this work include the approximation error of variants of SGD, the effect of minibatch noise, the optimal Bayesian inference, the escape rate from a sharp minimum, and the stationary covariance of a few second-order methods including damped Newton's method, natural gradient descent, and Adam.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Machine Learning (Stat)

R.I.P. πŸ‘» Ghosted

Graph Attention Networks

Petar VeličkoviΔ‡, Guillem Cucurull, ... (+4 more)

stat.ML πŸ› ICLR πŸ“š 24.7K cites 8 years ago
R.I.P. πŸ‘» Ghosted

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML πŸ› arXiv πŸ“š 12.0K cites 9 years ago

Died the same way β€” πŸ‘» Ghosted