Path Length Bounds for Gradient Descent and Flow

August 02, 2019 ยท Declared Dead ยท ๐Ÿ› Journal of machine learning research

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas arXiv ID 1908.01089 Category cs.LG: Machine Learning Cross-listed cs.AI, math.OC, stat.ML Citations 15 Venue Journal of machine learning research Last Checked 4 months ago
Abstract
We derive bounds on the path length $ฮถ$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions. Among other results, we prove that: (a) if the iterates are linearly convergent with factor $(1-c)$, then $ฮถ$ is at most $\mathcal{O}(1/c)$; (b) under the Polyak-Kurdyka-Lojasiewicz (PKL) condition, $ฮถ$ is at most $\mathcal{O}(\sqrtฮบ)$, where $ฮบ$ is the condition number, and at least $\widetildeฮฉ(\sqrt{d} \wedge ฮบ^{1/4})$; (c) for quadratics, $ฮถ$ is $ฮ˜(\min\{\sqrt{d},\sqrt{\log ฮบ}\})$ and in some cases can be independent of $ฮบ$; (d) assuming just convexity, $ฮถ$ can be at most $2^{4d\log d}$; (e) for separable quasiconvex functions, $ฮถ$ is $ฮ˜(\sqrt{d})$. Thus, we advance current understanding of the properties of GD and GF curves beyond rates of convergence. We expect our techniques to facilitate future studies for other algorithms.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning

Died the same way โ€” ๐Ÿ‘ป Ghosted