An efficient nonconvex reformulation of stagewise convex optimization problems

October 27, 2020 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Rudy Bunel, Oliver Hinder, Srinadh Bhojanapalli, Krishnamurthy Dvijotham
arXiv ID: 2010.14322
Category: math.OC (Optimization & Control)
Cross-listed: cs.AI, cs.LG, cs.NE
Citations: 17
Venue: Neural Information Processing Systems
Last Checked: 4 months ago

Abstract

Convex optimization problems with staged structure appear in several contexts, including optimal control, verification of deep neural networks, and isotonic regression. Off-the-shelf solvers can solve these problems but may scale poorly. We develop a nonconvex reformulation designed to exploit this staged structure. Our reformulation has only simple bound constraints, enabling solution via projected gradient methods and their accelerated variants. The method automatically generates a sequence of primal and dual feasible solutions to the original convex problem, making optimality certification easy. We establish theoretical properties of the nonconvex formulation, showing that it is (almost) free of spurious local minima and has the same global optimum as the convex problem. We modify PGD to avoid spurious local minimizers so it always converges to the global minimizer. For neural network verification, our approach obtains small duality gaps in only a few gradient steps. Consequently, it can quickly solve large-scale verification problems faster than both off-the-shelf and specialized solvers.
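The abstract notes that the reformulation has only simple bound constraints, which is what makes projected gradient methods applicable. The sketch below is not the paper's reformulation or its modified PGD; it only illustrates, on a hypothetical toy objective, why bound constraints are convenient: the Euclidean projection onto a box reduces to elementwise clipping. The function name `projected_gradient_descent`, the step size, and the toy problem are all assumptions made for illustration.

```python
import numpy as np

def projected_gradient_descent(grad, x0, lower, upper, step, iters=500):
    """Minimize a smooth function over the box [lower, upper] with plain PGD."""
    x = np.clip(x0, lower, upper)
    for _ in range(iters):
        # Gradient step, then Euclidean projection onto the box.
        # For bound constraints the projection is elementwise clipping.
        x = np.clip(x - step * grad(x), lower, upper)
    return x

# Hypothetical toy problem: minimize 0.5 * ||x - target||^2 over [0, 1]^3.
target = np.array([-0.5, 0.3, 1.7])
lower, upper = np.zeros(3), np.ones(3)

x_star = projected_gradient_descent(
    grad=lambda x: x - target,   # gradient of the toy objective
    x0=np.full(3, 0.5),
    lower=lower,
    upper=upper,
    step=0.5,
)
print(x_star)
```

For this toy objective the minimizer over the box is the projection of `target` onto `[0, 1]^3`, so the iterate should settle where each coordinate is either at a bound or has zero gradient.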



📜 Similar Papers

In the same crypt — Optimization & Control

R.I.P. 👻 Ghosted

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent

Xiangru Lian, Ce Zhang, ... (+4 more)

math.OC 🏛 NeurIPS 📚 1.4K cites 9 years ago

R.I.P. 👻 Ghosted

Local SGD Converges Fast and Communicates Little

Sebastian U. Stich

math.OC 🏛 ICLR 📚 1.2K cites 8 years ago

R.I.P. 👻 Ghosted

On Lazy Training in Differentiable Programming

Lenaic Chizat, Edouard Oyallon, Francis Bach

math.OC 🏛 NeurIPS 📚 930 cites 7 years ago

📚 The Cartographer

A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications

Ankur Sinha, Pekka Malo, Kalyanmoy Deb

math.OC 🏛 IEEE TEC 📚 840 cites 9 years ago

R.I.P. 👻 Ghosted

Learned Primal-dual Reconstruction

Jonas Adler, Ozan Öktem

math.OC 🏛 IEEE TMI 📚 834 cites 8 years ago

R.I.P. 👻 Ghosted

On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

Lenaic Chizat, Francis Bach

math.OC 🏛 NeurIPS 📚 805 cites 8 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 9 years ago