Understanding Convolutional Neural Networks with Information Theory: An Initial Exploration
April 18, 2018 · Entered Twilight · IEEE Transactions on Neural Networks and Learning Systems
"Last commit was 7.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: VGG16.py, cifar_10_loader.py, readme.txt
Authors
Shujian Yu, Kristoffer Wickstrøm, Robert Jenssen, Jose C. Principe
arXiv ID
1804.06537
Category
cs.LG: Machine Learning
Cross-listed
cs.IT, stat.ML
Citations
83
Venue
IEEE Transactions on Neural Networks and Learning Systems
Repository
https://github.com/Wickstrom/InfExperiment
⭐ 3
Last Checked
3 months ago
Abstract
The matrix-based Renyi's α-entropy functional and its multivariate extension were recently developed in terms of the normalized eigenspectrum of a Hermitian matrix of the projected data in a reproducing kernel Hilbert space (RKHS). However, the utility and possible applications of these new estimators are rather new and mostly unknown to practitioners. In this paper, we first show that our estimators enable straightforward measurement of information flow in realistic convolutional neural networks (CNN) without any approximation. Then, we introduce the partial information decomposition (PID) framework and develop three quantities to analyze the synergy and redundancy in convolutional layer representations. Our results validate two fundamental data processing inequalities and reveal some fundamental properties concerning the training of CNN.
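As a rough illustration of the kind of estimator the abstract refers to, the following minimal sketch computes a matrix-based Renyi entropy and mutual information from unit-trace Gram matrices. It is not taken from the linked repository. The function names, the Gaussian kernel, the bandwidth `sigma`, and the fixed order α = 2 are all assumptions made for illustration; α = 2 is chosen only because tr(A²) then equals the sum of squared entries of a symmetric matrix, so no eigendecomposition is needed. The abstract does not state which order or kernel the paper uses.

```python
import math


def gram_matrix(samples, sigma=1.0):
    """Gaussian-kernel Gram matrix scaled to unit trace (hypothetical helper).

    The kernel choice and `sigma` are illustrative assumptions.
    """
    n = len(samples)
    kernel = [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * sigma ** 2))
            for y in samples
        ]
        for x in samples
    ]
    # A Gaussian kernel has ones on the diagonal, so dividing by n gives trace 1.
    return [[v / n for v in row] for row in kernel]


def renyi2_entropy(a):
    """Order-2 matrix-based entropy: S_2(A) = -log2(tr(A^2)).

    For a symmetric matrix, tr(A^2) is the sum of its squared entries.
    """
    return -math.log2(sum(v * v for row in a for v in row))


def joint_entropy(a, b):
    """Entropy of the Hadamard product of A and B, renormalized to unit trace."""
    prod = [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]
    trace = sum(prod[i][i] for i in range(len(prod)))
    return renyi2_entropy([[v / trace for v in row] for row in prod])


def mutual_information(a, b):
    """I(A; B) = S(A) + S(B) - S(A, B), all computed from Gram matrices."""
    return renyi2_entropy(a) + renyi2_entropy(b) - joint_entropy(a, b)


# Four widely separated points: the Gram matrix is close to identity / 4.
spread = gram_matrix([[0.0], [100.0], [200.0], [300.0]])
# Four identical points: every Gram entry equals 1 / 4.
collapsed = gram_matrix([[1.0], [1.0], [1.0], [1.0]])
```

In this toy setup the widely separated points give the largest entropy available for four samples, and the identical points give zero. In a CNN setting the rows of `samples` would be a mini-batch of inputs or flattened layer activations, which is again an assumption about usage rather than a description of the authors' code.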
Similar Papers
In the same crypt: Machine Learning
Continuous control with deep reinforcement learning
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
The Ethereal