On the turbulent meeting of two disciplines with sometimes-opposing approaches.

Sample images of atmospheric rivers correctly classified (true positives) by our deep CNN model. The figure shows total column water vapor (color map) and the land-sea boundary (solid line).

In physics, typically, we are concerned with identifying True Parameters for Universal Laws, applicable without prejudice across all the cosmos. We are hunting something like the Platonic ideals of which our experiments are poor shadows. This is especially so in, say, quantum physics or cosmology.

In machine learning, typically, we want to make generic predictions for a given process, and to quantify how good those predictions can be given how much data we have and the approximate kind of process we witness. There is no notion of universal truth waiting around the corner to back up our wild fancies. On the other hand, we are less concerned about the noisy sublunary chaos of our experiments, and don’t need to worry about how far our noise drives us from truth as long as we can estimate it. But then, far from universality, we have only weak and vague notions of how to generalise our models to new circumstances and new noise. That is, in the Platonic ideal of machine learning, there are no Platonic ideals to be found.

(This explanation does no justice to either physics or machine learning, but it is meant as a mere framing rather than an essay in the history or philosophy of science.)

Can these areas have something to say to one another nevertheless? After an interesting conversation with Shane Keating about the difficulties of ocean dynamics, I am thinking about this in a new way. Generally, we might have notions from physics of what “truly” underlies a system, but many unknown parameters, noisy measurements, computational intractability and complex or chaotic dynamics interfere with our ability to predict things using only the known laws of physics. Here, we want to come up with a “best possible” stochastic model of a system given our uncertainties and constraints, which looks a lot like an ML problem.

At a basic level, it’s not controversial (I don’t think?) to use machine learning methods to analyse data in experiments, even with trendy deep neural networks (e.g. RKSB16, LRPC16). I understand that this is huge, e.g. in connectomics.

Perhaps a little more fringe is using machine learning to reduce computational burden, e.g. JKPK17, CaTr17.

The thing that is especially interesting to me is learning the whole model from ML formalism, using physical laws as input to the learning process.

To be concrete, Shane specifically was discussing problems in predicting and interpolating “tracers”, such as chemicals or heat, in oceanographic flows. Here we know lots of things about the fluids concerned, but less about the details of the ocean floor, and we have very imperfect measurements of those details. Nonetheless, we also know that there are certain invariants, conservation laws and so on, so a truly “nonparametric” approach to the dynamics is certainly throwing away information. We know that our

There is some cute work in this area, like the *SINDy* method, a compressive-sensing state filter of Brunton et al (BrPK16a, BrPK16b); but it’s hard to imagine scaling this up (at least not directly) to big things like large image sensor arrays and other such weakly structured input.
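To make the sparse-identification idea a little more concrete, here is a minimal sketch in the spirit of SINDy, not the authors' code. It assumes a toy two-dimensional damped oscillator as the "unknown" system, a small polynomial library of candidate terms, and a hard threshold of 0.05; the function names (`simulate_damped_oscillator`, `library`, `stlsq`) are made up for illustration. The regression step is a sequentially thresholded least squares: fit, zero the small coefficients, refit on what survives.

```python
import numpy as np

DT = 0.01  # sampling interval (an assumption for this toy example)


def simulate_damped_oscillator(n_steps=2000, dt=DT):
    """Integrate dx/dt = -0.1x + 2y, dy/dt = -2x - 0.1y with classical RK4."""
    A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
    X = np.empty((n_steps, 2))
    X[0] = [2.0, 0.0]
    for k in range(n_steps - 1):
        s = X[k]
        k1 = A @ s
        k2 = A @ (s + 0.5 * dt * k1)
        k3 = A @ (s + 0.5 * dt * k2)
        k4 = A @ (s + dt * k3)
        X[k + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return X, A


def library(X):
    """Candidate right-hand-side terms: 1, x, y, x^2, xy, y^2."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])


def stlsq(Theta, dXdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares for dXdt ~ Theta @ Xi."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0  # prune terms that look negligible
        for j in range(dXdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # refit each state equation on the surviving terms only
                Xi[keep, j] = np.linalg.lstsq(
                    Theta[:, keep], dXdt[:, j], rcond=None
                )[0]
    return Xi


X, A_true = simulate_damped_oscillator()
# Estimate derivatives by central differences; drop the one-sided end points.
dXdt = np.gradient(X, DT, axis=0)[1:-1]
Xi = stlsq(library(X[1:-1]), dXdt)

# Rows of Xi follow the library order; column j is the equation for state j.
print(np.round(Xi, 3))
```

On clean simulated data like this, the surviving coefficients sit in the rows for `x` and `y`, so the governing equations can be read off directly. With noisy measurements, the derivative estimate and the choice of threshold matter a great deal, which is part of why scaling the approach up is not obvious.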

Recently, though, Lotter et al (LoKC16) have shown that prediction of whole sequences of images is in fact possible using neural nets, although their approach is not physics-driven. They get bonus points for making the code, prednet, downloadable. People like Chang et al (CUTT17) claim that learning “compositional object” models should be possible. The compositional models here are learnable objects with learnable pairwise interactions, and bear a passing resemblance to the kind of physical laws that physics experiments hope to discover, although I’m not yet totally persuaded about the details of this particular framework. On the other hand, appealing to autoencoders as descriptions of the underlying dynamics of physical reality, without further motivation, doesn’t seem sufficient.

There is an O’Reilly podcast and reflist about deep learning for science in particular.

## The other direction: What does physics say about learning on graphs?

Proceed with caution, since there is a lot of messy thinking here. Here are some things I’d like to read, but whose inclusion here should not be taken as a recommendation. The common theme is using ideas from physics to understand deep learning and other directed graph learning methods.

Charles H Martin. Why Deep Learning Works II: the Renormalization Group.

Max Tegmark argues that statistical mechanics provides insight into deep learning and neuroscience (LiTR16, LiTe16).

Natalie Wolchover summarises Mehta and Schwab (MeSc14).

Wiatowski et al (WiGB17) and Shwartz-Ziv and Tishby (ShTi17) argue that looking at neural networks as random fields with energy propagation dynamics provides some insight into how they work.

There are other connections, e.g. to physics-driven annealing methods and physics-inspired Boltzmann machines. TBC

## Addendum

Why not “statistics for physics”? No reason not to do that; but physics already uses lots of statistics, so we don’t need a new notebook for that. Moreover, there are certain interesting problems in statistics that seem to be especially obvious in physics-driven dynamical problems. The one that bothers me most is estimation theory for time-series data. I find it highly unsatisfactory that, in frequentist statistics, time series estimation results provide only asymptotic guarantees and even then, only for samples from realizations of stationary processes. ML theory, by contrast, has made progress on statistical learning theory for dependent data in the non-stationary case (McSS11, AlLW13, KuMo14).

OTOH, I might be able to get more mileage out of Bayesian time series methods? I’ve never looked into estimation theory there.

TBC

### Refs

- AlLW13: Alquier, P., Li, X., & Wintenberger, O. (2013) Prediction of time series by statistical learning: general losses and fast rates. *Dependence Modeling*, 1, 65–93.
- Barb15: Barbier, J. (2015) Statistical physics and approximate message-passing algorithms for sparse linear estimation problems in signal processing and coding theory. *ArXiv:1511.01650 [Cs, Math]*.
- BrPK16a: Brunton, S. L., Proctor, J. L., & Kutz, J. N. (2016a) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. *Proceedings of the National Academy of Sciences*, 113(15), 3932–3937.
- BrPK16b: Brunton, S. L., Proctor, J. L., & Kutz, J. N. (2016b) Sparse Identification of Nonlinear Dynamics with Control (SINDYc). *ArXiv:1605.06682 [Math]*.
- CaTr17: Carleo, G., & Troyer, M. (2017) Solving the quantum many-body problem with artificial neural networks. *Science*, 355(6325), 602–606.
- CUTT17: Chang, M. B., Ullman, T., Torralba, A., & Tenenbaum, J. B. (2017) A Compositional Object-Based Approach to Learning Physical Dynamics. In Proceedings of ICLR.
- JKPK17: Tompson, J., Schlachter, K., Sprechmann, P., & Perlin, K. (2017) Accelerating Eulerian Fluid Simulation with Convolutional Networks. Presented at the ICLR.
- KuMo14: Kuznetsov, V., & Mohri, M. (2014) Forecasting Non-Stationary Time Series: From Theory to Algorithms.
- LiTe16: Lin, H. W., & Tegmark, M. (2016) Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language. *ArXiv:1606.06737 [Cond-Mat]*.
- LiTR16: Lin, H. W., Tegmark, M., & Rolnick, D. (2016) Why does deep and cheap learning work so well? *ArXiv:1608.08225 [Cond-Mat, Stat]*.
- LRPC16: Liu, Y., Racah, E., Prabhat, Correa, J., Khosrowshahi, A., Lavers, D., … Collins, W. (2016) Application of Deep Convolutional Neural Networks for Detecting Extreme Weather in Climate Datasets. *ArXiv:1605.01156 [Cs]*.
- LoKC16: Lotter, W., Kreiman, G., & Cox, D. (2016) Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning. *ArXiv:1605.08104 [Cs, q-Bio]*.
- McSS11: McDonald, D. J., Shalizi, C. R., & Schervish, M. (2011) Generalization error bounds for stationary autoregressive models. *ArXiv:1103.0942 [Cs, Stat]*.
- MGDC16: Medasani, B., Gamst, A., Ding, H., Chen, W., Persson, K. A., Asta, M., … Haranczyk, M. (2016) Predicting defect behavior in B2 intermetallics by merging ab initio modeling and machine learning. *Npj Computational Materials*, 2(1), 1.
- MeSc14: Mehta, P., & Schwab, D. J. (2014) An exact mapping between the Variational Renormalization Group and Deep Learning. *ArXiv:1410.3831 [Cond-Mat, Stat]*.
- RKSB16: Racah, E., Ko, S., Sadowski, P., Bhimji, W., Tull, C., Oh, S. Y., … Prabhat. (2016) Revealing Fundamental Physics from the Daya Bay Neutrino Experiment Using Deep Neural Networks. In 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 892–897).
- SDNM09: Sargsyan, K., Debusschere, B., Najm, H., & Marzouk, Y. (2009) Bayesian Inference of Spectral Expansions for Predictability Assessment in Stochastic Reaction Networks. *Journal of Computational and Theoretical Nanoscience*, 6(10), 2283–2297.
- SCWY15: Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W., & Woo, W. (2015) Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. *ArXiv:1506.04214 [Cs]*.
- ShTi17: Shwartz-Ziv, R., & Tishby, N. (2017) Opening the Black Box of Deep Neural Networks via Information. *ArXiv:1703.00810 [Cs]*.
- WiGB17: Wiatowski, T., Grohs, P., & Bölcskei, H. (2017) Energy Propagation in Deep Convolutional Neural Networks. *ArXiv:1704.03636 [Cs, Math, Stat]*.