Hamiltonians, energy conservation in sampling. Handy. Summary would be nice.
Michael Betancourt’s heuristic explanation of Hamiltonian Monte Carlo: regions of high density alone are no good; we need the “typical set”, the region where the product of differential volume and density is high. He motivates Markov chain Monte Carlo on this basis, as a way of exploring the typical set given points already in it, or of getting closer to the typical set if starting outside it. How to get a central limit theorem? “Geometric” ergodicity results. Hamiltonian Monte Carlo is a procedure for generating measure-preserving flows over phase space, with Hamiltonian
\[H(q,p)=-\log(\pi(p|q)\pi(q))\]
So the gradient of my log probability density influences the particle momentum. And we can use symplectic integrators to walk along trajectories (if I knew more about numerical integration I might know more about the benefits of this) in between random momentum perturbations. Some more stuff about sampling states from the numerical trajectory to correct for integration error, and about choosing trajectory lengths dynamically, which is the NUTS extension to HMC.
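To make the recipe above concrete, here is a minimal sketch of plain HMC (not NUTS) in Python with NumPy. It rests on assumptions that are mine, not Betancourt’s: the momentum distribution \(\pi(p|q)\) is a standard Gaussian, so that \(H(q,p)=U(q)+\tfrac12 p^\top p\) with \(U(q)=-\log\pi(q)\) up to a constant; the symplectic integrator is leapfrog; and a Metropolis accept/reject step corrects the integrator error. The names `leapfrog`, `hmc`, `U` and `grad_U` are hypothetical, and the target is a toy 10-dimensional standard Gaussian.

```python
import numpy as np


def leapfrog(q, p, grad_U, step_size, n_steps):
    """Leapfrog integration of H(q, p) = U(q) + p.p / 2 (assumed unit mass matrix)."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step_size * grad_U(q)      # initial half step for momentum
    for _ in range(n_steps - 1):
        q += step_size * p                # full step for position
        p -= step_size * grad_U(q)        # full step for momentum
    q += step_size * p                    # last full step for position
    p -= 0.5 * step_size * grad_U(q)      # final half step for momentum
    return q, p


def hmc(U, grad_U, q0, n_samples, step_size=0.1, n_steps=20, seed=0):
    """Plain HMC with fixed trajectory length; returns samples and acceptance rate."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = np.empty((n_samples, q.size))
    n_accept = 0
    for i in range(n_samples):
        p = rng.standard_normal(q.size)   # random momentum perturbation
        h_old = U(q) + 0.5 * p @ p
        q_new, p_new = leapfrog(q, p, grad_U, step_size, n_steps)
        h_new = U(q_new) + 0.5 * p_new @ p_new
        # Metropolis correction for the energy error of the integrator
        if np.log(rng.uniform()) < h_old - h_new:
            q = q_new
            n_accept += 1
        samples[i] = q
    return samples, n_accept / n_samples


# Toy target (an assumption for illustration): standard Gaussian in 10 dimensions.
def U(q):
    return 0.5 * q @ q


def grad_U(q):
    return q


samples, accept_rate = hmc(U, grad_U, np.zeros(10), n_samples=2000)
print("acceptance rate:", accept_rate)
print("pooled sample variance:", samples.var())
```

Because leapfrog nearly conserves \(H\), the acceptance test rarely rejects at this step size. Fixing `n_steps` by hand is exactly the tuning burden that NUTS is designed to remove.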
Langevin Monte Carlo
Manifold Monte Carlo.
Betancourt, Michael. 2017. “A Conceptual Introduction to Hamiltonian Monte Carlo,” January. http://arxiv.org/abs/1701.02434.
———. 2018. “The Convergence of Markov Chain Monte Carlo Methods: From the Metropolis Method to Hamiltonian Monte Carlo.” Annalen Der Physik, March. https://doi.org/10.1002/andp.201700214.
Betancourt, Michael, Simon Byrne, Sam Livingstone, and Mark Girolami. 2017. “The Geometric Foundations of Hamiltonian Monte Carlo.” Bernoulli 23 (4A): 2257–98. https://doi.org/10.3150/16-BEJ810.
Carpenter, Bob, Matthew D. Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, and Michael Betancourt. 2015. “The Stan Math Library: Reverse-Mode Automatic Differentiation in C++.” arXiv Preprint arXiv:1509.07164. http://arxiv.org/abs/1509.07164.
Durmus, Alain, and Eric Moulines. 2016. “High-Dimensional Bayesian Inference via the Unadjusted Langevin Algorithm,” May. http://arxiv.org/abs/1605.01559.
Girolami, Mark, and Ben Calderhead. 2011. “Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (2): 123–214. https://doi.org/10.1111/j.1467-9868.2010.00765.x.
Goodrich, Ben, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Bob Carpenter, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. “Stan: A Probabilistic Programming Language.” Journal of Statistical Software 76 (1). https://doi.org/10.18637/jss.v076.i01.
Neal, Radford M. 2011. “MCMC Using Hamiltonian Dynamics.” In Handbook for Markov Chain Monte Carlo, edited by Steve Brooks, Andrew Gelman, Galin L. Jones, and Xiao-Li Meng. Boca Raton: Taylor & Francis. http://arxiv.org/abs/1206.1901.
Norton, Richard A., and Colin Fox. 2016. “Tuning of MCMC with Langevin, Hamiltonian, and Other Stochastic Autoregressive Proposals,” October. http://arxiv.org/abs/1610.00781.
Xifara, T., C. Sherlock, S. Livingstone, S. Byrne, and M. Girolami. 2014. “Langevin Diffusions and the Metropolis-Adjusted Langevin Algorithm.” Statistics & Probability Letters 91 (Supplement C): 14–19. https://doi.org/10.1016/j.spl.2014.04.002.