Parallel Markov Chain Monte Carlo for Bayesian Hierarchical Models with Big Data, in Two Stages

December 16, 2017 · Declared Dead · 🏛 Journal of Applied Statistics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Zheng Wei, Erin M. Conlon
arXiv ID: 1712.05907
Category: stat.ME
Cross-listed: cs.DC, stat.CO, stat.ML
Citations: 3
Venue: Journal of Applied Statistics
Last Checked: 2 months ago

Abstract

Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group-specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.
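The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code (none is linked from this page), and it makes several simplifying assumptions: a two-level normal model rather than the three-level and four-level models in the paper, known standard deviations `SIGMA` and `TAU`, flat priors on the group means in stage 1 and on the common mean `mu`, and exact posterior draws standing in for a per-group MCMC run. All function names (`stage1`, `stage2`, `run_demo`) are hypothetical. Under these assumptions, proposing from the stage 1 draws in an independence Metropolis-Hastings step cancels the likelihood, so the acceptance ratio reduces to a ratio of the full-model prior densities of the group mean.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

SIGMA = 1.0  # known within-group sd (assumption for this sketch)
TAU = 1.0    # known between-group sd (assumption for this sketch)


def stage1(args):
    """Stage 1: posterior draws for one group's mean under a flat prior."""
    y, n_draws, seed = args
    rng = np.random.default_rng(seed)
    # With a flat prior and known SIGMA the posterior is N(ybar, SIGMA^2 / n),
    # so exact draws stand in for an independent per-group MCMC run.
    return rng.normal(y.mean(), SIGMA / np.sqrt(len(y)), size=n_draws)


def log_prior(theta, mu):
    """Log density (up to a constant) of theta_g given mu in the full model."""
    return -0.5 * ((theta - mu) / TAU) ** 2


def stage2(pools, n_iter, seed):
    """Stage 2: sample the full model, proposing theta_g from stage 1 draws."""
    rng = np.random.default_rng(seed)
    n_groups = len(pools)
    theta = np.array([pool[0] for pool in pools])
    mu_draws = np.empty(n_iter)
    theta_draws = np.empty((n_iter, n_groups))
    for t in range(n_iter):
        # Gibbs update for the common mean (flat prior on mu).
        mu = rng.normal(theta.mean(), TAU / np.sqrt(n_groups))
        for g, pool in enumerate(pools):
            # Independence proposal: a random stage 1 posterior draw.
            proposal = pool[rng.integers(len(pool))]
            # The likelihood cancels, leaving only the full-model prior ratio.
            log_ratio = log_prior(proposal, mu) - log_prior(theta[g], mu)
            if np.log(rng.uniform()) < log_ratio:
                theta[g] = proposal
        mu_draws[t] = mu
        theta_draws[t] = theta
    return mu_draws, theta_draws


def run_demo(seed=0):
    """Simulate 8 groups of 50 observations and run both stages."""
    rng = np.random.default_rng(seed)
    true_theta = rng.normal(2.0, TAU, size=8)
    data = [rng.normal(m, SIGMA, size=50) for m in true_theta]
    jobs = [(y, 2000, seed + 1 + g) for g, y in enumerate(data)]
    # Groups are independent in stage 1, so they can be processed in parallel.
    with ThreadPoolExecutor() as executor:
        pools = list(executor.map(stage1, jobs))
    mu_draws, theta_draws = stage2(pools, 4000, seed + 100)
    return data, mu_draws, theta_draws
```

Calling `run_demo()` returns the simulated data together with the stage 2 draws of `mu` and the group means. A thread pool is used here only to show where the parallelism sits; a real analysis would more likely distribute the groups across processes or machines.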

📜 Similar Papers

In the same crypt — stat.ME

R.I.P. 👻 Ghosted

Causal inference using invariant prediction: identification and confidence intervals

Jonas Peters, Peter Bühlmann, Nicolai Meinshausen

stat.ME 🏛 J.RSSSB 📚 1.1K cites 11 years ago

R.I.P. 👻 Ghosted

Performance Metrics (Error Measures) in Machine Learning Regression, Forecasting and Prognostics: Properties and Typology

Alexei Botchkarev

stat.ME 🏛 Interdisciplinary Journal of Information, Knowledge, and Management 📚 671 cites 7 years ago

R.I.P. 👻 Ghosted

External Validity: From Do-Calculus to Transportability Across Populations

Judea Pearl, Elias Bareinboim

stat.ME 🏛 Probabilistic and Causal Inference 📚 366 cites 11 years ago

R.I.P. 👻 Ghosted

Least Ambiguous Set-Valued Classifiers with Bounded Error Levels

Mauricio Sadinle, Jing Lei, Larry Wasserman

stat.ME 🏛 J.ASA 📚 318 cites 9 years ago

R.I.P. 👻 Ghosted

Doubly Robust Policy Evaluation and Optimization

Miroslav Dudík, Dumitru Erhan, ... (+2 more)

stat.ME 🏛 arXiv 📚 308 cites 11 years ago

R.I.P. 👻 Ghosted

Comparison of Bayesian predictive methods for model selection

Juho Piironen, Aki Vehtari

stat.ME 🏛 Statistics and computing 📚 304 cites 11 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago