Reparameterisation tricks in differentiable inference

A.k.a. Normalizing flows

A trick I see in variational inference for probabilistic deep learning, best summarised as “fancy change of variables”. Looks like learning of manifolds, sorta.


Ingmar Shuster's summary of the paper:

The paper adopts the term normalizing flow to refer to the plain old change of variables formula for integrals, with the minor change of view that one can see this as a flow, and with the correct but slightly alien reference to a flow defined by the Langevin SDE or Fokker-Planck, both attributed only to ML/stats literature in the paper.
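For reference, the change of variables formula in question can be written as follows. The symbols here are my own notation, not taken from the summary: $f$ is an invertible, differentiable map, $q$ is the density of $z$, and $q'$ is the density of $z' = f(z)$. Chaining $K$ such maps gives the "flow".

```latex
\begin{align}
  q'(z') &= q(z)\,\left|\det \frac{\partial f}{\partial z}\right|^{-1},
  \qquad z' = f(z), \\
  \log q_K(z_K) &= \log q_0(z_0)
    - \sum_{k=1}^{K} \log\left|\det \frac{\partial f_k}{\partial z_{k-1}}\right|,
  \qquad z_k = f_k(z_{k-1}).
\end{align}
```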

The theoretical contribution feels a little like a strawman: it simply states that, as Langevin and Hamiltonian dynamics can be seen as an infinitesimal normalizing flow, and both approximate the posterior when the step size goes to zero, normalizing flows can approximate the posterior arbitrarily well. This is of course nothing that was derived in the paper, nor is it news. Nor does it say anything about the practical approach suggested.
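To make the Langevin remark concrete, here is a minimal sketch of discretised (unadjusted) Langevin dynamics in NumPy. It is an illustration under my own assumptions, not code from the paper: the target is a standard normal standing in for a posterior, and the names `langevin_step` and `grad_log_std_normal` are hypothetical. Each step is a small stochastic map that pushes the sample density towards the target, and the discretisation bias shrinks as the step size goes to zero.

```python
import numpy as np


def langevin_step(z, grad_log_p, step, rng):
    """One Euler-Maruyama step of dz = grad log p(z) dt + sqrt(2) dW."""
    noise = rng.standard_normal(z.shape)
    return z + step * grad_log_p(z) + np.sqrt(2.0 * step) * noise


def grad_log_std_normal(z):
    """Score of the stand-in target N(0, 1): d/dz log p(z) = -z."""
    return -z


rng = np.random.default_rng(1)

# Start from a deliberately wrong density: uniform on [-5, 5].
z = rng.uniform(-5.0, 5.0, size=20000)

# Push the particles along the discretised Langevin flow.
for _ in range(1000):
    z = langevin_step(z, grad_log_std_normal, step=0.01, rng=rng)

sample_mean = z.mean()
sample_var = z.var()
print(sample_mean, sample_var)  # should be close to 0 and 1 respectively
```

For this particular target the stationary variance of the discretised chain is 1 / (1 - step / 2), so a finite step leaves a small bias that vanishes as the step size shrinks.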

The invertible maps suggested have practical merit, however: the planar transformation (plotted on the right of the image) allows “splitting” a mode into two, and a second map allows “attracting/repulsing” probability mass around a point. The Jacobian correction for both invertible maps is computable in time that is linear in the number of dimensions.
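The linear-time claim can be checked with a small NumPy sketch. I am assuming the planar transformation has the form f(z) = z + u tanh(wᵀz + b), which is how I remember it from the paper; the function names and parameter values below are hypothetical. The Jacobian is the identity plus a rank-one term, so by the matrix determinant lemma its determinant is 1 + h'(wᵀz + b) wᵀu, which costs O(d) rather than the O(d³) of a generic determinant.

```python
import numpy as np


def planar_flow(z, u, w, b):
    """Apply f(z) = z + u * tanh(w.z + b) to each row of z.

    Returns the transformed points and log|det J| for each row,
    computed in O(d) via the matrix determinant lemma.
    """
    a = z @ w + b                          # pre-activations, shape (n,)
    fz = z + np.outer(np.tanh(a), u)       # transformed points, shape (n, d)
    h_prime = 1.0 - np.tanh(a) ** 2        # derivative of tanh at a
    log_det = np.log(np.abs(1.0 + h_prime * (w @ u)))
    return fz, log_det


def full_jacobian_log_det(z_single, u, w, b):
    """Slow reference: build the d x d Jacobian I + u h'(a) w^T explicitly."""
    a = z_single @ w + b
    jac = np.eye(len(z_single)) + np.outer(u, (1.0 - np.tanh(a) ** 2) * w)
    return np.linalg.slogdet(jac)[1]


rng = np.random.default_rng(0)
d = 5
z = rng.standard_normal((4, d))
w = rng.standard_normal(d)
u = rng.standard_normal(d)
b = 0.3

# Assumption: with tanh, w.u >= -1 keeps the map invertible.
# Flipping the sign of u is a crude way to guarantee that here.
if w @ u < -1.0:
    u = -u

fz, log_det = planar_flow(z, u, w, b)
reference = np.array([full_jacobian_log_det(zi, u, w, b) for zi in z])
print(np.allclose(log_det, reference))
```

Given the log-density of the input points, the log-density of the outputs is then `log_q0 - log_det`, per the change of variables formula.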

Rui Shu explains change of variables in probability and shows how it induces the normalizing flow idea. There is a non-trivial worked example in PyMC3. Adam Kosiorek summarises some fancy variants. Eric Jang did a tutorial.