
Deep learning as a dynamical system

Image credit: Donny Darko

A recurring movement within deep learning research tries to render the learning of prediction functions tractable by treating them as dynamical systems, and using the theory of stability in the context of Hamiltonians, optimal control and/or ODE solvers to make it all work.

I’ve been interested in this ever since seeing the Haber and Ruthotto paper, but it got a real kick recently when the Vector Institute team’s paper won a best paper award at NeurIPS for learning the ODEs themselves.

Stability of training

This is a related, but not quite identical, notion of stability to data-stability in learning. The argument is that neural networks are, in the limit, approximants to quadrature solutions of certain ODEs, so we can gain insights into neural nets, and new tricks for them, by borrowing ODE tricks. This is mostly what Haber, Ruthotto et al. do. ([Haber, Ruthotto, Holtham, & Jun, 2017][#HRHJ17], [Haber, Lucka, & Ruthotto, 2018][#HaLR18], [Ruthotto & Haber, 2018][#RuHa18])
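To make the "networks as quadrature" claim concrete, here is a minimal sketch of my own, not taken from any of the cited papers. It assumes a toy scalar "network" whose every residual block computes x + h·f(x) with the same linear f(x) = a·x, so that stacking blocks is exactly forward Euler on dx/dt = a·x. The names `residual_net` and `exact_flow` are made up for illustration.

```python
import math

def residual_net(x0, a, depth, total_time=1.0):
    """Apply `depth` residual blocks x <- x + h * f(x), with f(x) = a * x."""
    h = total_time / depth  # the step size shrinks as the network gets deeper
    x = x0
    for _ in range(depth):
        x = x + h * (a * x)  # one residual block is one forward Euler step
    return x

def exact_flow(x0, a, total_time=1.0):
    """Exact solution of dx/dt = a * x at time `total_time`."""
    return x0 * math.exp(a * total_time)

# Gap between the depth-d network output and the ODE solution it discretises.
depths = (4, 16, 64, 256)
errors = [abs(residual_net(1.0, -2.0, d) - exact_flow(1.0, -2.0)) for d in depths]
for d, err in zip(depths, errors):
    print(f"depth={d:4d}  |network - ODE| = {err:.5f}")
```

The gap shrinks as depth grows, which is the sense in which the deep limit of the network is the ODE flow. Real architectures use nonlinear, layer-dependent f, so take this as a cartoon of the idea rather than of any particular model.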

Can it work on time series?

Good question. It looks like it should, since there is an implicit time series in the ODE solver. But so far these problems have used non-time-series data.

Neural ODE regression

By which I mean learning an ODE whose solution is the regression problem. This is what the Vector Institute paper did. There are various laypersons’ introductions to this, including the simple and practical magical take in Julia.
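As a rough illustration of "learning an ODE whose solution is the regression problem", here is a deliberately tiny sketch. It is not the method of the Vector Institute paper, which uses a neural network for the dynamics and an adjoint ODE solve for gradients. Here I assume a single scalar parameter theta in dx/dt = theta·x, a fixed-step forward Euler solver, and hand-written backpropagation through the solver steps. All function names are hypothetical.

```python
def solve(theta, x0, steps=100, total_time=1.0):
    """Forward Euler solution of dx/dt = theta * x; returns the whole trajectory."""
    h = total_time / steps
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + h * theta * xs[-1])
    return xs

def loss_and_grad(theta, x0, target, steps=100, total_time=1.0):
    """Squared error at the final time, and its gradient with respect to theta."""
    h = total_time / steps
    xs = solve(theta, x0, steps, total_time)
    residual = xs[-1] - target
    # Reverse sweep: lam holds d(loss)/d(x_k), starting from the final state.
    lam = 2.0 * residual
    grad = 0.0
    for k in range(steps - 1, -1, -1):
        grad += lam * h * xs[k]   # d(x_{k+1})/d(theta) = h * x_k
        lam *= 1.0 + h * theta    # d(x_{k+1})/d(x_k)   = 1 + h * theta
    return residual ** 2, grad

def fit(x0, target, theta=0.0, lr=0.05, iters=200):
    """Plain gradient descent on theta."""
    for _ in range(iters):
        _, grad = loss_and_grad(theta, x0, target)
        theta -= lr * grad
    return theta

# Learn dynamics that carry x(0) = 1 to x(1) = 2.
theta_hat = fit(x0=1.0, target=2.0)
print(theta_hat, solve(theta_hat, 1.0)[-1])
```

The fitted theta should land near log 2, the rate at which the continuous ODE doubles over unit time, up to Euler discretisation error. Swapping the scalar theta for network weights and the reverse sweep for an adjoint solve gets you closer to the real thing.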

There are some syntheses of these approaches that try to do everything with ODEs, all the time: see [Niu, Horesh, & Chuang, 2019][#NiHC19] and [Rackauckas et al., 2018][#RMDG18]. There are even some tutorial implementations by the inexhaustible Chris Rackauckas.

Random stuff

My question: How can this be made Bayesian? Priors on dynamics, posterior uncertainties etc.

TBC. Lyapunov analysis, Hamiltonian dynamics.