How to calculate partition functions using convex programming hierarchies: provable bounds for variational methods
July 11, 2016 ยท Declared Dead ยท ๐ Annual Conference Computational Learning Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Andrej Risteski
arXiv ID
1607.03183
Category
cs.LG: Machine Learning
Cross-listed
cs.DS,
stat.ML
Citations
25
Venue
Annual Conference Computational Learning Theory
Last Checked
3 months ago
Abstract
We consider the problem of approximating partition functions for Ising models. We make use of recent tools in combinatorial optimization: the Sherali-Adams and Lasserre convex programming hierarchies, in combination with variational methods to get algorithms for calculating partition functions in these families. These techniques give new, non-trivial approximation guarantees for the partition function beyond the regime of correlation decay. They also generalize some classical results from statistical physics about the Curie-Weiss ferromagnetic Ising model, as well as provide a partition function counterpart of classical results about max-cut on dense graphs \cite{arora1995polynomial}. With this, we connect techniques from two apparently disparate research areas -- optimization and counting/partition function approximations. (i.e. \#-P type of problems). Furthermore, we design to the best of our knowledge the first provable, convex variational methods. Though in the literature there are a host of convex versions of variational methods \cite{wainwright2003tree, wainwright2005new, heskes2006convexity, meshi2009convexifying}, they come with no guarantees (apart from some extremely special cases, like e.g. the graph has a single cycle \cite{weiss2000correctness}). We consider dense and low threshold rank graphs, and interestingly, the reason our approach works on these types of graphs is because local correlations propagate to global correlations -- completely the opposite of algorithms based on correlation decay. In the process we design novel entropy approximations based on the low-order moments of a distribution. Our proof techniques are very simple and generic, and likely to be applicable to many other settings other than Ising models.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
๐ฎ
๐ฎ
The Ethereal
๐ฎ
๐ฎ
The Ethereal
Continuous control with deep reinforcement learning
๐
๐
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
๐
๐
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
๐
๐
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
๐ฎ
๐ฎ
The Ethereal
Asynchronous Methods for Deep Reinforcement Learning
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted