What if you like the flavours of both Bayesian inference and the implicit model selection of sparse inference? Can you cook up Bayesian-frequentist fusion cuisine with this novelty ingredient? Yes: Bayesian model selection has a sparsity flavour.
Laplace Prior
Laplace priors on linear regression coefficients include the classic LASSO as a MAP estimate.
Pro: It is easy to derive frequentist LASSO as a MAP estimate from this prior.
Con: Not actually sparse for non-MAP uses; the full posterior is absolutely continuous, so it puts zero probability on exactly-zero coefficients.
I have no need for this right now, but if I did I might start with Dan Simpson’s critique.
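The MAP claim above is easy to see in the one-dimensional normal-means model, where the MAP estimate under a Laplace prior is exactly the soft-thresholding rule that the LASSO applies. A minimal sketch (my own illustration, assuming unit noise variance; `lam` plays the role of the inverse Laplace scale):

```python
import math

def soft_threshold(z, lam):
    """MAP estimate for y = beta + noise, beta ~ Laplace(0, 1/lam):
    the argmin of 0.5*(y - beta)^2 + lam*|beta|."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def neg_log_posterior(beta, y, lam):
    # Negative log-posterior up to additive constants (sigma^2 = 1).
    return 0.5 * (y - beta) ** 2 + lam * abs(beta)

lam = 1.0
for y in (2.5, 0.7, -1.8):
    closed_form = soft_threshold(y, lam)
    # Brute-force check: grid-search the objective to confirm the argmin.
    grid = [i / 1000.0 for i in range(-4000, 4001)]
    numeric = min(grid, key=lambda b: neg_log_posterior(b, y, lam))
    print(y, closed_form, numeric)
```

Note that `y = 0.7` is thresholded to exactly zero, which is the sparsity of the MAP estimate; the posterior itself has a smooth density there, which is the "Con" above.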
Global-local shrinkage hierarchy
Bhadra et al. (2016); Polson and Scott (2012); Schmidt and Makalic (2020); Xu et al. (2017).
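As a reminder of the general form (my summary, not taken verbatim from the cited papers), global-local hierarchies give each coefficient a conditionally Gaussian prior whose scale mixes a shared global factor with a per-coefficient local one:

```latex
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}(0,\, \lambda_j^2 \tau^2),
\qquad \lambda_j \sim \pi(\lambda_j), \qquad \tau \sim \pi(\tau).
```

The global scale $\tau$ shrinks everything towards zero, while heavy-tailed local scales $\lambda_j$ let individual signals escape; the horseshoe of Carvalho, Polson, and Scott (2009) is the case $\lambda_j \sim C^{+}(0,1)$.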
References
Babacan, Luessi, Molina, et al. 2012. “Sparse Bayesian Methods for Low-Rank Matrix Estimation.” IEEE Transactions on Signal Processing.
Brodersen, Gallusser, Koehler, et al. 2015. “Inferring Causal Impact Using Bayesian Structural Time-Series Models.” The Annals of Applied Statistics.
Carvalho, Polson, and Scott. 2009. “Handling Sparsity via the Horseshoe.” In Artificial Intelligence and Statistics.
Castillo, Schmidt-Hieber, and van der Vaart. 2015. “Bayesian Linear Regression with Sparse Priors.” The Annals of Statistics.
George, and McCulloch. 1997. “Approaches for Bayesian Variable Selection.” Statistica Sinica.
Mitchell, and Beauchamp. 1988. “Bayesian Variable Selection in Linear Regression.” Journal of the American Statistical Association.
Polson, and Scott. 2012. “Local Shrinkage Rules, Lévy Processes and Regularized Regression.” Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Ročková, and George. 2018. “The Spike-and-Slab LASSO.” Journal of the American Statistical Association.
Schniter, Potter, and Ziniel. 2008. “Fast Bayesian Matching Pursuit.” In 2008 Information Theory and Applications Workshop.
Scott, and Varian. 2013. “Predicting the Present with Bayesian Structural Time Series.” SSRN Scholarly Paper ID 2304426.
Seeger, Steinke, and Tsuda. 2007. “Bayesian Inference and Optimal Design in the Sparse Linear Model.” In Artificial Intelligence and Statistics.
Titsias, and Lázaro-Gredilla. 2011. “Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning.” In Advances in Neural Information Processing Systems 24.
Zhou, Chen, Paisley, et al. 2009. “Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems. NIPS’09.