R.I.P.
👻
Ghosted
Bregman meets Lévy: Stochastic mirror descent with heavy-tailed noise in continuous and discrete time
June 02, 2026 · Grace Period · the proceedings of ICML 2026
Authors
Pierre-Louis Cauvin, Panayotis Mertikopoulos
arXiv ID
2606.03769
Category
math.OC: Optimization & Control
Cross-listed
cs.LG,
math.PR
Citations
0
Venue
the proceedings of ICML 2026
Abstract
We study the robustness of stochastic mirror descent (SMD) under heavy-tailed noise, focusing on whether the method retains its convergence guarantees when run with infinite-variance stochastic gradient input. To address this question in a principled manner, we begin by introducing a continuous-time model of SMD as a stochastic differential equation (SDE) driven by a centered Lévy noise process with finite $p$-th order moments, $1 < p \leq 2$. This scheme -- which we call the Lévy mirror flow (LMF) -- arises naturally as the scaling limit of SMD in the presence of heavy-tailed noise. In particular, when $p < 2$ -- the heavy noise regime -- the trajectories of LMF generically exhibit jump discontinuities of arbitrary magnitude which, if frequent enough, lead to infinite variance. Nonetheless, despite this highly singular behavior, we show that LMF attains $\varepsilon$-optimality within $\mathcal{O}(\varepsilon^{-p/(p-1)})$ time in the convex case, and within $\tilde{\mathcal{O}}(\varepsilon^{-1/(p-1)})$ time for (relatively) strongly convex objectives. These guarantees provide a transparent characterization of the impact of frequent long jumps on the convergence of the process, and percolate to a series of matching discrete-time guarantees for several variants of SMD under heavy-tailed noise.
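As a reader's aid (not part of the authors' abstract), the two rate exponents quoted above can be evaluated at sample values of $p$. The values $p = 2$ and $p = 3/2$ are illustrative choices only; the arithmetic follows directly from the stated exponents $p/(p-1)$ and $1/(p-1)$:

$$
\begin{aligned}
p = 2:&\quad \frac{p}{p-1} = 2, \quad \frac{1}{p-1} = 1
&&\Rightarrow\quad \mathcal{O}(\varepsilon^{-2}) \ \text{(convex)}, \quad \tilde{\mathcal{O}}(\varepsilon^{-1}) \ \text{(strongly convex)}, \\
p = \tfrac{3}{2}:&\quad \frac{p}{p-1} = 3, \quad \frac{1}{p-1} = 2
&&\Rightarrow\quad \mathcal{O}(\varepsilon^{-3}) \ \text{(convex)}, \quad \tilde{\mathcal{O}}(\varepsilon^{-2}) \ \text{(strongly convex)}.
\end{aligned}
$$

Both exponents grow without bound as $p \to 1^{+}$, so heavier tails mean a longer time to reach $\varepsilon$-optimality, while at $p = 2$ they reduce to the exponents usually associated with the finite-variance setting.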
Similar Papers
In the same crypt: Optimization & Control
R.I.P.
👻
Ghosted
Local SGD Converges Fast and Communicates Little
R.I.P.
👻
Ghosted
On Lazy Training in Differentiable Programming
The Cartographer
A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications
R.I.P.
👻
Ghosted
Learned Primal-dual Reconstruction
R.I.P.
👻
Ghosted