How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?

May 21, 2018 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh
arXiv ID: 1805.07883
Category: stat.ML: Machine Learning (Stat)
Cross-listed: cs.AI, cs.CV, cs.LG
Citations: 60
Venue: Neural Information Processing Systems
Last Checked: 3 months ago

Abstract

It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters. We initiate the study of rigorously characterizing the sample-complexity of estimating CNNs and RNNs. We show that the sample-complexity to learn CNNs and RNNs scales linearly with their intrinsic dimension and this sample-complexity is much smaller than for their FNN counterparts. For both CNNs and RNNs, we also present lower bounds showing our sample complexities are tight up to logarithmic factors. Our main technical tools for deriving these results are a localized empirical process analysis and a new technical lemma characterizing the convolutional and recurrent structure. We believe that these tools may inspire further developments in understanding CNNs and RNNs.
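
The "more compact parametric representation" mentioned in the abstract can be illustrated with a toy example. The following is a minimal sketch, not taken from the paper: it assumes a single 1-D filter with no padding, and the function names and sizes (`conv1d`, `equivalent_dense_matrix`, `d = 8`, `m = 3`) are hypothetical choices made for illustration. It shows that a convolution computes the same linear map as a fully connected layer whose weight matrix has many more entries but only `m` free parameters.

```python
def conv1d(x, w, stride=1):
    """Apply filter w to x at every stride-th offset (no padding)."""
    m = len(w)
    return [sum(w[j] * x[i + j] for j in range(m))
            for i in range(0, len(x) - m + 1, stride)]


def equivalent_dense_matrix(w, d, stride=1):
    """Build the fully connected weight matrix computing the same map."""
    m = len(w)
    rows = []
    for i in range(0, d - m + 1, stride):
        row = [0.0] * d
        row[i:i + m] = w  # the same m weights are reused in every row
        rows.append(row)
    return rows


def matvec(A, x):
    """Plain matrix-vector product on lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


d, m = 8, 3                      # input dimension and filter size (toy values)
w = [1.0, -2.0, 0.5]             # illustrative filter weights
x = [float(i) for i in range(d)]

W = equivalent_dense_matrix(w, d)
conv_out = conv1d(x, w)
dense_out = matvec(W, x)

cnn_params = m                   # free parameters of the filter
fnn_params = len(W) * d          # entries of an unconstrained dense layer

print(conv_out == dense_out, cnn_params, fnn_params)
```

Here the filter has 3 free parameters while an unconstrained dense layer of the same shape has 48. The sketch only counts parameters; it does not reproduce the paper's sample-complexity bounds.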

📜 Similar Papers

In the same crypt — Machine Learning (Stat)

🔮 The Ethereal

Distilling the Knowledge in a Neural Network

Geoffrey Hinton, Oriol Vinyals, Jeff Dean

stat.ML 🏛 arXiv 📚 22.9K cites 11 years ago

🔮 The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML 🏛 arXiv 📚 12.0K cites 9 years ago

🔮 The Ethereal

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell

stat.ML 🏛 NeurIPS 📚 7.0K cites 9 years ago

R.I.P. 👻 Ghosted

Variational Inference with Normalizing Flows

Danilo Jimenez Rezende, Shakir Mohamed

stat.ML 🏛 ICML 📚 4.7K cites 11 years ago

📚 The Cartographer

Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez, Been Kim

stat.ML 🏛 arXiv 📚 4.7K cites 9 years ago

R.I.P. 👻 Ghosted

Optimization Methods for Large-Scale Machine Learning

Léon Bottou, Frank E. Curtis, Jorge Nocedal

stat.ML 🏛 SIAM Review 📚 3.6K cites 10 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago