Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods
November 03, 2022 Β· Declared Dead Β· π Neural Information Processing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Kimon Antonakopoulos, Ali Kavis, Volkan Cevher
arXiv ID
2211.01832
Category
math.OC: Optimization & Control
Cross-listed
cs.LG,
stat.ML
Citations
14
Venue
Neural Information Processing Systems
Last Checked
4 months ago
Abstract
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions. Our algorithm achieves $O(Ο/ \sqrt{T})$ convergence when the oracle feedback is stochastic with variance $Ο^2$, and improves its convergence to $O( 1 / T^3)$ with deterministic oracles, where $T$ is the number of iterations. Our method also interpolates these rates without knowing the nature of the oracle apriori, which is enabled by a parameter-free adaptive step-size that is oblivious to the knowledge of smoothness modulus, variance bounds and the diameter of the constrained set. To our knowledge, this is the first universal algorithm with such global guarantees within the second-order optimization literature.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Optimization & Control
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Local SGD Converges Fast and Communicates Little
R.I.P.
π»
Ghosted
On Lazy Training in Differentiable Programming
π
π
The Cartographer
A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications
R.I.P.
π»
Ghosted
Learned Primal-dual Reconstruction
R.I.P.
π»
Ghosted
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted