Boaz Barak has a miniature dictionary for statisticians:
I’ve always been curious about the statistical physics approach to problems from computer science. The physics-inspired algorithm survey propagation is the current champion for random 3SAT instances, statistical-physics phase transitions have been suggested as explaining computational difficulty, and statistical physics has even been invoked to explain why deep learning algorithms seem to often converge to useful local minima.
Unfortunately, I have always found the terminology of statistical physics, “spin glasses”, “quenched averages”, “annealing”, “replica symmetry breaking”, “metastable states” etc.. to be rather daunting
Phase transitions in statistical inference
Cristopher Moore says
There is a deep analogy between statistical inference and statistical physics; I will give a friendly introduction to both of these fields. I will then discuss phase transitions in two problems of interest to a broad range of data sciences: community detection in social and biological networks, and clustering of sparse high-dimensional data. In both cases, if our data becomes too sparse or too noisy, it suddenly becomes impossible to find the underlying pattern, or even tell if there is one. Physics both helps us locate these phase transitions, and design optimal algorithms that succeed all the way up to this point. Along the way, I will visit ideas from computational complexity, random graphs, random matrices, and spin glass theory.
There is an overview lecture by Thomas Orton, which cites lots of the good stuff
Last week, we saw how certain computational problems like 3SAT exhibit a thresholding behavior, similar to a phase transition in a physical system. In this post, we’ll continue to look at this phenomenon by exploring a heuristic method, belief propagation (and the cavity method), which has been used to make hardness conjectures, and also has thresholding properties. In particular, we’ll start by looking at belief propagation for approximate inference on sparse graphs as a purely computational problem. After doing this, we’ll switch perspectives and see belief propagation motivated in terms of Gibbs free energy minimization for physical systems. With these two perspectives in mind, we’ll then try to use belief propagation to do inference on the the stochastic block model. We’ll see some heuristic techniques for determining when BP succeeds and fails in inference, as well as some numerical simulation results of belief propagation for this problem. Lastly, we’ll talk about where this all fits into what is currently known about efficient algorithms and information theoretic barriers for the stochastic block model.
See Igor Carron’s “phase diagram” list, and stuff like this Weng et al “Overcoming The Limitations of Phase Transition by Higher Order Analysis of Regularization Techniques” (WeMZ16) or possibly OyTr15. Likely there are connections to Erdős-Renyi giant components and other complex network things in probabilisitic graph learning. Read Barb15, PLRS16, WeMZ16.
Replicator equations and evolutionary processes
See also evolution, game theory.
Gentle intro lecture by John Baez, Biology as Information Dynamics.
See Baez11, Harp09a, Harp09b, Shal09, SiLi96.
- AcCo08: Dimitris Achlioptas, Amin Coja-Oghlan (2008) Algorithmic barriers from phase transitions. ArXiv:0803.2122 [Math], 793–802. DOI
- MeSc14: Pankaj Mehta, David J. Schwab (2014) An exact mapping between the Variational Renormalization Group and Deep Learning. ArXiv:1410.3831 [Cond-Mat, Stat].
- Comp90: (1990) Complexity, Entropy and the Physics of Information: The Proceedings of the 1988 Workshop on Complexity, Entropy, and the Physics of Information. . Addison-Wesley Pub. Co.
- LiTe16a: Henry W. Lin, Max Tegmark (2016a) Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language. ArXiv:1606.06737 [Cond-Mat].
- RuHa18: Lars Ruthotto, Eldad Haber (2018) Deep Neural Networks motivated by Partial Differential Equations. ArXiv:1804.04272 [Cs, Math, Stat].
- Shal09: Cosma Rohilla Shalizi (2009) Dynamics of Bayesian updating with dependent data and misspecified models. Electronic Journal of Statistics, 3, 1039–1074. DOI
- PLRS16: Ben Poole, Subhaneil Lahiri, Maithreyi Raghu, Jascha Sohl-Dickstein, Surya Ganguli (2016) Exponential expressivity in deep neural networks through transient chaos. In Advances in Neural Information Processing Systems 29 (pp. 3360–3368). Curran Associates, Inc.
- ShTi17: Ravid Shwartz-Ziv, Naftali Tishby (2017) Opening the Black Box of Deep Neural Networks via Information. ArXiv:1703.00810 [Cs].
- BKMM17: Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová (2017) Phase Transitions, Optimal Errors and Optimality of Message-Passing in Generalized Linear Models. ArXiv:1708.03395 [Cond-Mat, Physics:Math-Ph].
- Wolp08: David H. Wolpert (2008) Physical limits of inference. Physica D: Nonlinear Phenomena, 237(9), 1257–1281. DOI
- Baez11: John C. Baez (2011) Renyi Entropy and Free Energy.
- CMHR18: Bo Chang, Lili Meng, Eldad Haber, Lars Ruthotto, David Begert, Elliot Holtham (2018) Reversible Architectures for Arbitrarily Deep Residual Neural Networks. In arXiv:1709.03698 [cs, stat].
- CaCa05: Tommaso Castellani, Andrea Cavagna (2005) Spin-Glass Theory for Pedestrians. Journal of Statistical Mechanics: Theory and Experiment, 2005(05), P05012. DOI
- HaRu18: Eldad Haber, Lars Ruthotto (2018) Stable architectures for deep neural networks. Inverse Problems, 34(1), 014004. DOI
- Barb15: Jean Barbier (2015) Statistical physics and approximate message-passing algorithms for sparse linear estimation problems in signal processing and coding theory. ArXiv:1511.01650 [Cs, Math].
- ZdKr16: Lenka Zdeborová, Florent Krzakala (2016) Statistical physics of inference: Thresholds and algorithms. Advances in Physics, 65(5), 453–552. DOI
- BrMZ02: A. Braunstein, M. Mezard, R. Zecchina (2002) Survey propagation: an algorithm for satisfiability. ArXiv:Cs/0212002.
- Moor17: Cristopher Moore (2017) The Computer Science and Physics of Community Detection: Landscapes, Phase Transitions, and Hardness. Bulletin of the EATCS.
- SzRi17: Gábor J. Székely, Maria L. Rizzo (2017) The Energy of Data. Annual Review of Statistics and Its Application, 4(1), 447–479. DOI
- CHMB15: Anna Choromanska, MIkael Henaff, Michael Mathieu, Gerard Ben Arous, Yann LeCun (2015) The Loss Surfaces of Multilayer Networks. In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (pp. 192–204).
- Harp09: Marc Harper (2009) The Replicator Equation as an Inference Dynamic.
- SiLi96: B. Sinervo, C. M. Lively (1996) The rock–paper–scissors game and the evolution of alternative male strategies. Nature, 380(6571), 240. DOI
- Mand62: Benoit Mandelbrot (1962) The Role of Sufficiency and of Estimation in Thermodynamics. The Annals of Mathematical Statistics, 33(3), 1021–1038. DOI
- OyTr15: Samet Oymak, Joel A. Tropp (2015) Universality laws for randomized dimension reduction, with applications. ArXiv:1511.09433 [Cs, Math, Stat].
- BBCI16: Carlo Baldassi, Christian Borgs, Jennifer T. Chayes, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina (2016) Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes. Proceedings of the National Academy of Sciences, 113(48), E7655–E7662. DOI
- LiTe16b: Henry W. Lin, Max Tegmark (2016b) Why does deep and cheap learning work so well? ArXiv:1608.08225 [Cond-Mat, Stat].