Free energy

Learning in Markov random fields

A formalism for learning and inference in Markov random fields, with connections to statistical mechanics, message passing and graphical models.

I’m mostly reading about Markov random fields at the moment, so I’ll discuss it in those terms, although by the Hammersley–Clifford theorem this is the same as any system which can be factorised into physics-style potential energy functions in a particular (broad) sense. Here (and generally?) one can construct a (negative) variational free energy as a lower bound on the log partition function (need to unpack that), which turns out to be neither so useless nor so obscure as it sounds.
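As a first attempt at unpacking that, here is a minimal sketch on a toy Ising-style Markov random field, small enough that the log partition function can be computed by brute force. It assumes spins in {-1, +1}, a symmetric coupling matrix `J` with zero diagonal, and a fully factorised (naive mean-field) approximating distribution; all function names are my own inventions for illustration and come from none of the papers below. For any distribution q, the expected negative energy plus the entropy of q is a lower bound on log Z, and mean field simply picks the best product-form q.

```python
import itertools
import math


def energy(x, J, h):
    """Ising-style energy E(x) = -sum_{i<j} J_ij x_i x_j - sum_i h_i x_i."""
    n = len(x)
    pair = sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * x[i] for i in range(n))
    return -(pair + field)


def exact_log_partition(J, h):
    """log Z by enumerating all 2^n spin configurations (toy sizes only)."""
    n = len(h)
    vals = [-energy(x, J, h) for x in itertools.product((-1, 1), repeat=n)]
    top = max(vals)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in vals))


def binary_entropy(p):
    """Entropy (in nats) of a two-state variable with P(state 1) = p."""
    return -sum(t * math.log(t) for t in (p, 1.0 - p) if t > 0)


def mean_field_bound(m, J, h):
    """Lower bound on log Z for a product distribution with spin means m_i.

    bound = E_q[-E(x)] + H(q), i.e. minus the variational free energy.
    """
    n = len(m)
    neg_energy = sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))
    neg_energy += sum(h[i] * m[i] for i in range(n))
    entropy = sum(binary_entropy((1.0 + mi) / 2.0) for mi in m)
    return neg_energy + entropy


def fit_mean_field(J, h, sweeps=200):
    """Coordinate ascent on the bound: m_i <- tanh(h_i + sum_j J_ij m_j)."""
    n = len(h)
    m = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            local_field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            m[i] = math.tanh(local_field)
    return m


if __name__ == "__main__":
    # Four spins on a cycle with uniform coupling and a weak external field.
    J = [[0.0, 0.5, 0.0, 0.5],
         [0.5, 0.0, 0.5, 0.0],
         [0.0, 0.5, 0.0, 0.5],
         [0.5, 0.0, 0.5, 0.0]]
    h = [0.1, 0.1, 0.1, 0.1]
    m = fit_mean_field(J, h)
    print("exact log Z      :", exact_log_partition(J, h))
    print("mean-field bound :", mean_field_bound(m, J, h))
```

The gap between the two printed numbers is the KL divergence from the fitted product distribution to the true one, so the bound is tight exactly when the model really does factorise (for instance when all couplings are zero).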

To Read

  • Adams, S., Collevecchio, A., & König, W. (2011). A variational formula for the free energy of an interacting many-particle system. The Annals of Probability, 39(2), 683–728.
  • Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends® in Machine Learning, 2(1), 1–127.
  • Frey, B. J., & Jojic, N. (2005). A comparison of algorithms for inference and learning in probabilistic graphical models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(9), 1392–1416.
  • Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1999). An Introduction to Variational Methods for Graphical Models. Machine Learning, 37(2), 183–233.
  • Jordan, M. I., & Weiss, Y. (2002). Probabilistic inference in graphical models. Handbook of Neural Networks and Brain Theory.
  • LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M., & Huang, F. (2006). A tutorial on energy-based learning. Predicting Structured Data.
  • Montanari, A. (2011). Lecture Notes for Stat 375 Inference in Graphical Models.
  • Wainwright, M. J., & Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Foundations and Trends® in Machine Learning, 1(1-2), 1–305.
  • Wainwright, M., & Jordan, M. (2005). A variational principle for graphical models. In New Directions in Statistical Signal Processing (Vol. 155). MIT Press.
  • Wang, C., Komodakis, N., & Paragios, N. (2013). Markov Random Field modeling, inference & learning in computer vision & image understanding: A survey. Computer Vision and Image Understanding, 117(11), 1610–1627.
  • Xing, E. P., Jordan, M. I., & Russell, S. (2012). A Generalized Mean Field Algorithm for Variational Inference in Exponential Families. arXiv:1212.2512 [cs, Stat].
  • Yedidia, J. S., Freeman, W. T., & Weiss, Y. (2003). Understanding Belief Propagation and Its Generalizations. In G. Lakemeyer & B. Nebel (Eds.), Exploring Artificial Intelligence in the New Millennium (pp. 239–269). Morgan Kaufmann Publishers.
  • Yedidia, J. S., Freeman, W. T., & Weiss, Y. (2005). Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory, 51(7), 2282–2312.

Is this also some kind of universal learning scheme?

Not “free as in speech” or “free as in beer”, nor “free energy” in the sense of perpetual motion machines, zero point energy or pills that turn your water into petroleum.

My question is: how much more credible than these latter examples is the “free energy principle” as a unifying principle for learning systems?

The chief pusher of this wheelbarrow appears to be UCL’s Karl Friston.

He starts his latest Nature Reviews Neuroscience article with this statement of the principle:

The free-energy principle says that any self-organizing system that is at equilibrium with its environment must minimize its free energy.

Is that “must” in

  1. the sense of moral obligation, or is it
  2. a testable conservation law of some kind?

If the latter, self-organising in what sense? What class of equilibrium? For which definition of the free energy? What is our chief experimental evidence for this hypothesis? Rather than a no-nonsense unpacking of these, the article goes on to meander through an ocean of other fashionable stuff (the Bayesian Brain Hypothesis) which I have not yet trawled for salient details, so I don’t really know about that at this point.

Fortunately we do get a definition of free energy itself, with a diagram, which

…shows the dependencies among the quantities that define free energy. These include the internal states of the brain \(\mu(t)\) and quantities describing its exchange with the environment: sensory signals (and their motion) \(\bar{s}(t) = [s, s', s'', \ldots]^T\) plus action \(a(t)\). The environment is described by equations of motion, which specify the trajectory of its hidden states. The causes \(\vartheta \supset \{\bar{x}, \theta, \gamma\}\) of sensory input comprise hidden states \(\bar{x}(t)\), parameters \(\theta\), and precisions \(\gamma\) controlling the amplitude of the random fluctuations \(\bar{z}(t)\) and \(\bar{w}(t)\). Internal brain states and action minimize free energy \(F(\bar{s}, \mu)\), which is a function of sensory input and a probabilistic representation \(q(\vartheta|\mu)\) of its causes. This representation is called the recognition density and is encoded by internal states \(\mu\).

The free energy depends on two probability densities: the recognition density \(q(\vartheta|\mu)\) and one that generates sensory samples and their causes, \(p(\bar{s},\vartheta|m)\). The latter represents a probabilistic generative model (denoted by m), the form of which is entailed by the agent or brain…

\begin{equation*} F = -\langle \ln p(\bar{s},\vartheta|m) \rangle_q + \langle \ln q(\vartheta|\mu) \rangle_q \end{equation*}

This, on the other hand, seems to be option 1. Any right-thinking brain, seeking to avoid the vice of slothful and decadent perception after the manner of heathens, foreigners, and compulsive masturbators, would do well to seek to minimise its free energy before partaking of a stimulating and refreshing physical recreation such as a game of cricket.

Presumably, as I drill deeper, this will be related back to other definitions of free energy, the precise types of energy employed, the predictions to be made and the experiments done.
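One such relation can at least be checked numerically. Taking F as expected energy minus the entropy of q, as in the usual variational construction, F equals the KL divergence from the recognition density to the true posterior over causes, minus the log evidence \(\ln p(\bar{s}|m)\). So F is an upper bound on “surprise” \(-\ln p(\bar{s}|m)\), tight when q is the exact posterior. The sketch below assumes a toy model with three discrete causes and a single fixed observation; the prior, the likelihood values and the function names are made up for illustration and are not taken from Friston’s paper.

```python
import math


def free_energy(q, joint):
    """F = -<ln p(s, theta)>_q + <ln q(theta)>_q for a discrete cause theta.

    `joint[k]` is p(s, theta=k | m) evaluated at the one observed s.
    """
    expected_energy = -sum(qk * math.log(pk) for qk, pk in zip(q, joint) if qk > 0)
    negative_entropy = sum(qk * math.log(qk) for qk in q if qk > 0)
    return expected_energy + negative_entropy


def kl_divergence(q, p):
    """KL(q || p) for discrete distributions on the same support."""
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p) if qk > 0)


if __name__ == "__main__":
    # Hypothetical generative model m: prior over three causes, and the
    # likelihood of the observed sensory sample under each cause.
    prior = [0.5, 0.3, 0.2]
    likelihood = [0.1, 0.6, 0.3]
    joint = [pr * lk for pr, lk in zip(prior, likelihood)]

    evidence = sum(joint)                      # p(s | m)
    posterior = [j / evidence for j in joint]  # p(theta | s, m)
    surprise = -math.log(evidence)

    q = [1 / 3, 1 / 3, 1 / 3]                  # a deliberately poor recognition density
    print("F(q)              :", free_energy(q, joint))
    print("KL(q||post) + surp:", kl_divergence(q, posterior) + surprise)
    print("F(posterior)      :", free_energy(posterior, joint))
    print("surprise          :", surprise)
```

Read this way, “minimising free energy” over q is ordinary variational Bayesian inference, which is at least a definite mathematical statement, whatever one makes of the claims about brains.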

See also: Exergy, Landauer’s Principle.

To Read

  • Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127.
  • Friston, K. (2013). Life as we know it. Journal of The Royal Society Interface, 10(86).
  • Friston, K. (2010). Is the free-energy principle neurocentric? Nature Reviews Neuroscience, 11(8), 605.
  • Yukalov, V. I., & Sornette, D. (2014). Self-organization in complex systems as decision making. Advances in Complex Systems, 17(03n04), 1450016.