A placeholder for learning on curved spaces. Not discussed: learning OF curved spaces.

Learning where the manifold is known *a priori* also seems to fall under this heading. See, e.g., the work of Nina Miolane and collaborators on the Geomstats project.
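
To make the known-manifold setting concrete, here is a minimal sketch of about the simplest estimation problem on a curved space: the Fréchet (Karcher) mean of points on the unit sphere, found by Riemannian gradient descent. It is plain NumPy and does not use the Geomstats API; the function names `sphere_log`, `sphere_exp` and `frechet_mean` are hypothetical, and the sketch assumes the points are not spread so widely that antipodal pairs matter.

```python
import numpy as np


def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_theta)      # geodesic distance from p to q
    v = q - cos_theta * p             # component of q orthogonal to p
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:                # q == p (antipodal q is not handled)
        return np.zeros_like(p)
    return theta * v / norm_v


def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p along v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v


def frechet_mean(points, n_iter=100, step=1.0, tol=1e-10):
    """Riemannian gradient descent on half the mean squared geodesic distance."""
    mean = points[0] / np.linalg.norm(points[0])
    for _ in range(n_iter):
        # The average log map is minus the Riemannian gradient of the objective.
        tangent = np.mean([sphere_log(mean, q) for q in points], axis=0)
        if np.linalg.norm(tangent) < tol:
            break
        mean = sphere_exp(mean, step * tangent)
    return mean


# Four points at the same colatitude, evenly spaced around the north pole.
colat = 0.3
points = np.array([
    [np.sin(colat) * np.cos(a), np.sin(colat) * np.sin(a), np.cos(colat)]
    for a in (0.0, 0.5 * np.pi, np.pi, 1.5 * np.pi)
])
mean = frechet_mean(points)
print(mean)  # by symmetry, close to the north pole (0, 0, 1)
```

The only manifold-specific ingredients are the exp and log maps. Swapping in those of another manifold, which is what packages like Geomstats provide, leaves the descent loop unchanged.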

Girolami et al. discuss Langevin Monte Carlo in this context.

The headings below may one day be filled in.

## Information Geometry

The unholy offspring of Fisher information and differential geometry, about which I know little except that it sounds like it should be intuitive. See also information criteria. Intuitive-sounding or not, it is not mainstream, and it has not been especially useful to me even in places where it seemed it should be, at least not beyond the basic delta method.
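
For orientation, here are the standard textbook definitions. The notation ($p_\theta$, $g$, $\phi$) is my own choice and is not taken from any particular reference below. For a parametric family $p_\theta$, the Fisher information matrix

$$
g_{ij}(\theta) = \mathbb{E}_{x \sim p_\theta}\left[ \frac{\partial \log p_\theta(x)}{\partial \theta_i} \, \frac{\partial \log p_\theta(x)}{\partial \theta_j} \right]
$$

acts as a Riemannian metric on parameter space, in the sense that to second order

$$
\mathrm{KL}\left(p_\theta \,\|\, p_{\theta + d\theta}\right) \approx \tfrac{1}{2} \, d\theta^\top g(\theta) \, d\theta .
$$

The delta method is the part that has been useful to me. Under the usual regularity conditions, if $\sqrt{n}\,(\hat\theta - \theta) \to \mathcal{N}\left(0, g(\theta)^{-1}\right)$ in distribution, then for a smooth scalar function $\phi$,

$$
\sqrt{n}\,\left(\phi(\hat\theta) - \phi(\theta)\right) \to \mathcal{N}\left(0, \; \nabla\phi(\theta)^\top g(\theta)^{-1} \nabla\phi(\theta)\right).
$$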

## Hamiltonian Monte Carlo

You can also discuss Hamiltonian Monte Carlo in this setting. I will not.

## Natural gradient

See natural gradients.
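
As a stopgap, here is a minimal sketch of the idea in Amari's sense (Amar98): precondition the ordinary gradient by the inverse Fisher information, so that the update does not depend on how the model is parameterised. The toy problem (fitting a univariate Gaussian in $(\mu, \log\sigma)$ coordinates), the function names, the step size and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np


def grad_avg_loglik(theta, x):
    """Gradient of the average Gaussian log-likelihood w.r.t. (mu, log sigma)."""
    mu, s = theta
    var = np.exp(2.0 * s)
    d_mu = np.mean(x - mu) / var
    d_s = np.mean((x - mu) ** 2) / var - 1.0
    return np.array([d_mu, d_s])


def fisher(theta):
    """Per-observation Fisher information of N(mu, sigma^2) in (mu, log sigma)."""
    _, s = theta
    return np.diag([np.exp(-2.0 * s), 2.0])


def natural_gradient_ascent(x, theta0, lr=0.5, n_iter=200):
    """Ascend the average log-likelihood along the Fisher-preconditioned gradient."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        # Natural gradient: solve F(theta) d = grad rather than inverting F.
        nat_grad = np.linalg.solve(fisher(theta), grad_avg_loglik(theta, x))
        theta = theta + lr * nat_grad
    return theta


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=1000)
mu_hat, s_hat = natural_gradient_ascent(x, theta0=[0.0, 0.0])
sigma_hat = np.exp(s_hat)
print(mu_hat, sigma_hat)  # converges to the sample mean and (biased) sample std
```

In this toy case the Fisher matrix is available in closed form and is diagonal. In realistic models it has to be estimated or approximated, which is where most of the practical difficulty lies.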

## Homogeneous probability

Albert Tarantola’s framing, from his maybe forthcoming manuscript. How does it relate to information geometry? I don’t know yet; I haven’t had time to read it. It is also not a very common phrasing, which is a danger sign.

## To Read

- Divergence in everything: Cramér-Rao from data processing
- Azimuth’s Information Geometry Series plus the overview

## Refs

- CSPW10: Minhua Chen, J. Silva, J. Paisley, Chunping Wang, D. Dunson, L. Carin (2010) Compressive Sensing on Manifolds Using a Nonparametric Mixture of Factor Analyzers: Algorithm and Performance Bounds. *IEEE Transactions on Signal Processing*, 58(12), 6140–6155.
- BSAB14: Nicolas Boumal, Amit Singer, P.-A. Absil, Vincent D. Blondel (2014) Cramér-Rao bounds for synchronization of rotations. *Information and Inference*, 3(1), 1–39.
- Barn87: O E Barndorff-Nielsen (1987) Differential and integral geometry in statistical inference. In Differential geometry in statistical inference. Sn Aarhus
- Amar87: Shunʼichi Amari (1987) Differential geometrical theory of statistics. In Differential geometry in statistical inference (pp. 19–94).
- FFPP13: J. L. Fernández-Martínez, Z. Fernández-Muñiz, J. L. G. Pallero, L. M. Pedruelo-González (2013) From Bayes to Tarantola: New insights to understand uncertainty in inverse problems. *Journal of Applied Geophysics*, 98, 62–72.
- MMDJ18: Nina Miolane, Johan Mathe, Claire Donnat, Mikael Jorda, Xavier Pennec (2018) geomstats: a Python Package for Riemannian Geometry in Machine Learning. *ArXiv:1805.08308 [Cs, Stat]*.
- Amar01: Shunʼichi Amari (2001) Information geometry on hierarchy of probability distributions. *IEEE Transactions on Information Theory*, 47, 1701–1711.
- XSLB14: T. Xifara, C. Sherlock, S. Livingstone, S. Byrne, M. Girolami (2014) Langevin diffusions and the Metropolis-adjusted Langevin algorithm. *Statistics & Probability Letters*, 91(Supplement C), 14–19.
- MuWZ10: Sayan Mukherjee, Qiang Wu, Ding-Xuan Zhou (2010) Learning gradients on manifolds. *Bernoulli*, 16(1), 181–207.
- HoSr15: Reshad Hosseini, Suvrit Sra (2015) Manifold Optimization for Gaussian Mixture Models. *ArXiv Preprint ArXiv:1506.07677*.
- BMAS14: Nicolas Boumal, Bamdev Mishra, P.-A. Absil, Rodolphe Sepulchre (2014) Manopt, a Matlab Toolbox for Optimization on Manifolds. *Journal of Machine Learning Research*, 15, 1455–1459.
- MoTa95: Klaus Mosegaard, Albert Tarantola (1995) Monte Carlo sampling of solutions to inverse problems. *Journal of Geophysical Research*, 100(B7), 12431.
- Amar98: Shun-ichi Amari (1998) Natural Gradient Works Efficiently in Learning. *Neural Computation*, 10(2), 251–276.
- StHe09: Florian Steinke, Matthias Hein (2009) Non-parametric regression between manifolds. In Advances in Neural Information Processing Systems 21 (pp. 1561–1568). Curran Associates, Inc.
- Boum13: Nicolas Boumal (2013) On Intrinsic Cramér-Rao Bounds for Riemannian Submanifolds and Quotient Manifolds. *IEEE Transactions on Signal Processing*, 61(7), 1809–1821.
- CISZ08: Gunnar Carlsson, Tigran Ishkhanov, Vin de Silva, Afra Zomorodian (2008) On the Local Behavior of Spaces of Natural Images. *International Journal of Computer Vision*, 76(1), 1–12.
- GeMa17: Rong Ge, Tengyu Ma (2017) On the Optimization Landscape of Tensor Decompositions. In Advances In Neural Information Processing Systems.
- AbMS08: P.-A Absil, R Mahony, R Sepulchre (2008) *Optimization algorithms on matrix manifolds*. Princeton, N.J.; Woodstock: Princeton University Press
- Pete10: Jan Peters (2010) Policy gradient methods. *Scholarpedia*, 5(11), 3698.
- ToKW16: James Townsend, Niklas Koep, Sebastian Weichwald (2016) Pymanopt: A Python Toolbox for Optimization on Manifolds using Automatic Differentiation. *Journal of Machine Learning Research*, 17(137), 1–5.
- AsBT11: Anil Aswani, Peter Bickel, Claire Tomlin (2011) Regression on manifolds: Estimation of the exterior derivative. *The Annals of Statistics*, 39(1), 48–81.
- GiCa11: Mark Girolami, Ben Calderhead (2011) Riemann manifold Langevin and Hamiltonian Monte Carlo methods. *Journal of the Royal Statistical Society: Series B (Statistical Methodology)*, 73(2), 123–214.
- Laur87: S L Lauritzen (1987) Statistical manifolds. In Differential geometry in statistical inference (p. 164). JSTOR
- BBLG17: Michael Betancourt, Simon Byrne, Sam Livingstone, Mark Girolami (2017) The geometric foundations of Hamiltonian Monte Carlo. *Bernoulli*, 23(4A), 2257–2298.
- WaZh16: Yu Guang Wang, Xiaosheng Zhuang (2016) Tight framelets and fast framelet transforms on manifolds. *ArXiv:1608.04026 [Math]*.