Bayesian consistency
Life is short. You want to use some tasty tool, such as a hierarchical model, without anyone getting cross at you for apostasy? Why not use whatever estimator works, and then show that it works on both frequentist and Bayesian grounds?

There is a basic result here, due to Doob, which essentially says that the Bayesian learner is consistent, except on a set of data of prior probability zero. That is, the Bayesian is subjectively certain they will converge on the truth. This is not as reassuring as one might wish, and showing Bayesian consistency under the true distribution is harder. In fact, it usually involves assumptions under which non-Bayes procedures will also converge. […]
Concentration of the posterior around the truth is only a preliminary. One would also want to know that, say, the posterior mean converges, or even better that the predictive distribution converges. For many finite-dimensional problems, what’s called the “Bernstein-von Mises theorem” basically says that the posterior mean and the maximum likelihood estimate converge, so if one works the other will too. This breaks down for infinite-dimensional problems.
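To make the finite-dimensional claim concrete, here is a minimal sketch in the simplest conjugate setting I can think of: coin flips with a Beta prior. The Beta(2, 2) prior, the true parameter 0.3, and the function names are all assumptions chosen for illustration; none of them come from the quoted passage. The sketch only tracks the two point estimates, whereas the Bernstein-von Mises theorem proper concerns the shape of the whole posterior.

```python
import random


def posterior_mean_and_mle(data, a=2.0, b=2.0):
    """Return (posterior mean, MLE) for Bernoulli data under a Beta(a, b) prior.

    With k successes in n trials the posterior is Beta(a + k, b + n - k),
    so its mean is (a + k) / (a + b + n); the MLE is simply k / n.
    """
    n = len(data)
    k = sum(data)
    post_mean = (a + k) / (a + b + n)
    mle = k / n
    return post_mean, mle


def gaps(theta_true=0.3, sizes=(10, 100, 1000, 10000), seed=0):
    """Simulate one data stream and compare the two estimates at growing n."""
    rng = random.Random(seed)
    data = [1 if rng.random() < theta_true else 0 for _ in range(max(sizes))]
    rows = []
    for n in sizes:
        pm, mle = posterior_mean_and_mle(data[:n])
        rows.append((n, pm, mle, abs(pm - mle)))
    return rows


if __name__ == "__main__":
    for n, pm, mle, gap in gaps():
        print(f"n={n:6d}  posterior mean={pm:.4f}  MLE={mle:.4f}  gap={gap:.5f}")
```

In this model the gap between the two estimates is at most max(a, b) / (a + b + n), so it vanishes at rate 1/n whatever the data look like; the prior is simply washed out. Nothing this tidy is available in the infinite-dimensional case.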
Regularisation and priors
An excellent answer by Tymoteusz Wołodźko must be in the running for punchiest summary ever, made precise by Andrew Milne.
Question: What do non-convex regularizers look like in a Bayesian context, and are they an argument for Bayesian sampling from the posterior rather than the frequentist’s NP-hard optimum search? And what does, e.g., the alternative Cauchy prior recommended in GJPS08 look like?
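As a partial answer to the last part, a rough sketch: under MAP estimation the penalty is the negative log prior, so a Gaussian prior gives a ridge-style quadratic penalty, a Laplace prior gives the lasso's absolute-value penalty, and a Cauchy prior gives log(1 + (x/s)²), which is convex only for |x| < s. The scale 2.5 below is the value I recall GJPS08 suggesting for coefficients on rescaled predictors; treat it, the function names, and the grid as illustrative assumptions. The convexity check is a crude numerical midpoint test, not a proof.

```python
import math


def gaussian_penalty(x, scale=1.0):
    """Negative log density of N(0, scale^2), up to an additive constant."""
    return x * x / (2.0 * scale * scale)


def laplace_penalty(x, scale=1.0):
    """Negative log density of Laplace(0, scale), up to an additive constant."""
    return abs(x) / scale


def cauchy_penalty(x, scale=2.5):
    """Negative log density of Cauchy(0, scale), up to an additive constant."""
    return math.log1p((x / scale) ** 2)


def is_midpoint_convex(f, points, tol=1e-12):
    """Check f((x + y) / 2) <= (f(x) + f(y)) / 2 for every pair on the grid.

    Passing is only evidence of convexity on the grid; a single failure
    is a genuine counterexample to convexity.
    """
    for x in points:
        for y in points:
            if f((x + y) / 2.0) > (f(x) + f(y)) / 2.0 + tol:
                return False
    return True


if __name__ == "__main__":
    grid = [0.5 * i for i in range(-40, 41)]  # -20.0 to 20.0 in steps of 0.5
    for name, f in [("Gaussian", gaussian_penalty),
                    ("Laplace", laplace_penalty),
                    ("Cauchy", cauchy_penalty)]:
        print(f"{name:8s} midpoint-convex on grid: {is_midpoint_convex(f, grid)}")
```

The Cauchy penalty fails the test because its second derivative, 2(s² − x²)/(s² + x²)², turns negative once |x| > s: large coefficients are penalised only logarithmically. That flattening is the heavy tail of the prior seen from the optimiser's side, and it is why the MAP objective can have more than one local optimum.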
Refs
 GJPS08: Andrew Gelman, Aleks Jakulin, Maria Grazia Pittau, Yu-Sung Su (2008) A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4), 1360–1383.
 Auma76: Robert J. Aumann (1976) Agreeing to Disagree. The Annals of Statistics, 4(6), 1236–1239.
 AdGa16: Madhu Advani, Surya Ganguli (2016) An equivalence between high dimensional Bayes optimal inference and M-estimation. In Advances In Neural Information Processing Systems.
 Doob49: J. L. Doob (1949) Application of the theory of martingales. In Le Calcul des Probabilités et ses Applications (pp. 23–27). Centre National de la Recherche Scientifique, Paris.
 Efro12: Bradley Efron (2012) Bayesian inference and the parametric bootstrap. The Annals of Applied Statistics, 6(4), 1971–1997.
 LeDL07: S. R. Lele, B. Dennis, F. Lutscher (2007) Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods. Ecology Letters, 10(7), 551.
 Nick14: Richard Nickl (2014) Discussion of: ‘Frequentist coverage of adaptive nonparametric Bayesian credible sets’. ArXiv:1410.7600 [Math, Stat].
 Shal09: Cosma Rohilla Shalizi (2009) Dynamics of Bayesian updating with dependent data and misspecified models. Electronic Journal of Statistics, 3, 1039–1074.
 LeNS10: Subhash R. Lele, Khurram Nadeem, Byron Schmuland (2010) Estimability and likelihood inference for generalized linear mixed models using data cloning. Journal of the American Statistical Association, 105(492), 1617–1625.
 Efro15: Bradley Efron (2015) Frequentist accuracy of Bayesian estimates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(3), 617–646.
 Valp11: Perry de Valpine (2011) Frequentist analysis of hierarchical models for population dynamics and demographic data. Journal of Ornithology, 152(2), 393–408.
 WaBl17: Yixin Wang, David M. Blei (2017) Frequentist Consistency of Variational Bayes. ArXiv:1705.03439 [Cs, Math, Stat].
 SzVZ13: Botond Szabó, Aad van der Vaart, Harry van Zanten (2013) Frequentist coverage of adaptive nonparametric Bayesian credible sets. ArXiv:1310.4489 [Math, Stat].
 DiFr86: Persi Diaconis, David Freedman (1986) On the Consistency of Bayes Estimates. The Annals of Statistics, 14(1), 1–26.
 Rous16: Judith Rousseau (2016) On the Frequentist Properties of Bayesian Nonparametric Methods. Annual Review of Statistics and Its Application, 3(1), 211–231.
 Tibs96: Robert Tibshirani (1996) Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
 Aaro05: Scott Aaronson (2005) The complexity of agreement. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing (p. 634). ACM Press.
 Nort84: Robert M. Norton (1984) The Double Exponential Distribution: Using Calculus to Find a Maximum Likelihood Estimator. The American Statistician, 38(2), 135–136.
 BaBe04: M. J. Bayarri, J. O. Berger (2004) The Interplay of Bayesian and Frequentist Analysis. Statistical Science, 19(1), 58–80.
 Sims10: C. Sims (2010) Understanding non-Bayesians. Unpublished Chapter, Department of Economics, Princeton University.
 Free99: David Freedman (1999) Wald Lecture: On the Bernstein-von Mises theorem with infinite-dimensional parameters. The Annals of Statistics, 27(4), 1119–1141.