Maximal Relevance and Optimal Learning Machines

September 27, 2019 · Declared Dead

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: O Duranthon, M Marsili, R Xie
arXiv ID: 1909.12792
Category: physics.data-an
Cross-listed: cond-mat.stat-mech, cs.IT, cs.LG
Citations: 0
Last Checked: 3 months ago

Abstract

We show that the mutual information between the representation of a learning machine and the hidden features that it extracts from data is bounded from below by the relevance, which is the entropy of the model's energy distribution. Models with maximal relevance -- that we call Optimal Learning Machines (OLM) -- are hence expected to extract maximally informative representations. We explore this principle in a range of models. For fully connected Ising models, we show that i) OLM are characterised by inhomogeneous distributions of couplings, and that ii) their learning performance is affected by sub-extensive features that are elusive to a thermodynamic treatment. On specific learning tasks, we find that likelihood maximisation is achieved by models with maximal relevance. Training of Restricted Boltzmann Machines on the MNIST benchmark shows that learning is associated with a broadening of the spectrum of energy levels and that the internal representation of the hidden layer approaches the maximal relevance that can be achieved in a finite dataset. Finally, we discuss a Gaussian learning machine that clarifies that learning hidden features is conceptually different from parameter estimation.
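To make the abstract's central quantity concrete, here is a minimal sketch (not the authors' code) of "the entropy of the model's energy distribution" for a very small Ising model, computed by brute-force enumeration. The function name `energy_entropy`, the energy convention E(s) = -sum_{i<j} J_ij s_i s_j with the temperature absorbed into the couplings, and the choice of nats are all assumptions made for illustration; the paper's exact definitions may differ.

```python
import itertools
import math
from collections import defaultdict


def energy_entropy(n, couplings):
    """Entropy (in nats) of the energy distribution of an n-spin Ising model.

    Assumed convention (illustrative, not taken from the paper):
    p(s) is proportional to exp(-E(s)), with
    E(s) = -sum over (i, j) of couplings[(i, j)] * s_i * s_j.
    """
    level_weight = defaultdict(float)
    # Enumerate all 2**n spin configurations; only feasible for small n.
    for s in itertools.product((-1, 1), repeat=n):
        e = -sum(j * s[a] * s[b] for (a, b), j in couplings.items())
        # Group states with (numerically) equal energy into one level.
        level_weight[round(e, 9)] += math.exp(-e)
    z = sum(level_weight.values())
    # Shannon entropy of the probability mass on each energy level.
    return -sum((w / z) * math.log(w / z) for w in level_weight.values())


# Hypothetical usage: compare homogeneous and inhomogeneous couplings.
homogeneous = {(i, j): 0.5 for i in range(4) for j in range(i + 1, 4)}
inhomogeneous = {(0, 1): 0.1, (0, 2): 0.4, (0, 3): 0.9,
                 (1, 2): 0.2, (1, 3): 0.6, (2, 3): 1.3}
print(energy_entropy(4, homogeneous), energy_entropy(4, inhomogeneous))
```

Because the energy is a function of the state, this quantity can never exceed the entropy of the state distribution itself; with no couplings at all every state shares one energy level and the result is zero.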

📜 Similar Papers

In the same crypt — physics.data-an

R.I.P. 👻 Ghosted

ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

Ilka Antcheva, Maarten Ballintijn, ... (+25 more)

physics.data-an 🏛 Computer Physics Communications 📚 716 cites 10 years ago

R.I.P. 👻 Ghosted

A deep convolutional neural network approach to single-particle recognition in cryo-electron microscopy

Yanan Zhu, Qi Ouyang, Youdong Mao

physics.data-an 🏛 BMC Bioinformatics 📚 136 cites 10 years ago

R.I.P. 👻 Ghosted

The Pandora Software Development Kit for Pattern Recognition

J. S. Marshall, M. A. Thomson

physics.data-an 🏛 The European Physical Journal C 📚 128 cites 11 years ago

R.I.P. 👻 Ghosted

Emergence of Compositional Representations in Restricted Boltzmann Machines

Jérôme Tubiana, Rémi Monasson

physics.data-an 🏛 Phys. Rev. Lett. 📚 99 cites 9 years ago

R.I.P. 👻 Ghosted

Investigating echo state networks dynamics by means of recurrence analysis

Filippo Maria Bianchi, Lorenzo Livi, Cesare Alippi

physics.data-an 🏛 IEEE TNNLS 📚 93 cites 10 years ago

R.I.P. 👻 Ghosted

Discovering state-parameter mappings in subsurface models using generative adversarial networks

Alexander Y. Sun

physics.data-an 🏛 Geophysical Research Letters 📚 85 cites 7 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago