Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

November 06, 2019 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy arXiv ID 1911.02151 Category stat.ML: Machine Learning (Stat) Cross-listed cs.IT, cs.LG Citations 167 Venue Neural Information Processing Systems Last Checked 3 months ago

Abstract

In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based on a subset of the training samples. Our approach is broadly applicable within the information-theoretic framework of Russo and Zou (2015) and Xu and Raginsky (2017). Our bound can be tied to a measure of flatness of the empirical risk surface. As compared with other bounds that depend on the squared norms of gradients, empirical investigations show that the terms in our bounds are orders of magnitude smaller.

📄 View on arXiv 🌐 View on ar5iv 📑 PDF 🎉 Report Code Found

Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt — Machine Learning (Stat)

🔮 🔮 The Ethereal

Distilling the Knowledge in a Neural Network

Geoffrey Hinton, Oriol Vinyals, Jeff Dean

stat.ML 🏛 arXiv 📚 22.9K cites 11 years ago

🔮 🔮 The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML 🏛 arXiv 📚 12.0K cites 9 years ago

🔮 🔮 The Ethereal

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell

stat.ML 🏛 NeurIPS 📚 7.0K cites 9 years ago

R.I.P. 👻 Ghosted

Variational Inference with Normalizing Flows

Danilo Jimenez Rezende, Shakir Mohamed

stat.ML 🏛 ICML 📚 4.7K cites 11 years ago

📚 📚 The Cartographer

Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez, Been Kim

stat.ML 🏛 arXiv 📚 4.7K cites 9 years ago

R.I.P. 👻 Ghosted

Optimization Methods for Large-Scale Machine Learning

Léon Bottou, Frank E. Curtis, Jorge Nocedal

stat.ML 🏛 SIAM Review 📚 3.6K cites 10 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago