# State filtering parameters

### Tracking things that don’t move

Usefulness: 🔧
Novelty: 💡
Uncertainty: 🤪 🤪 🤪
Incompleteness: 🚧 🚧 🚧

a.k.a. state space model calibration, recursive identification. Sometimes indistinguishable from online estimation.

State filters are cool for estimating time-varying hidden states given known, fixed system parameters. But what about learning the parameters of the model that generates those states? Classic approaches for dynamical systems include basic linear system identification and general system identification. Can you also identify the fixed parameters (not just the hidden states) with a state filter?

Yes.

According to (Lindström et al. 2012), here are some landmark papers:

Augmenting the unobserved state vector is a well known technique, used in the system identification community for decades, see e.g. (Ljung 1979; Lindström et al. 2008; Söderström and Stoica 1988). Similar ideas, using Sequential Monte Carlo methods, were suggested by (Kitagawa 1998; Liu and West 2001). Combined state and parameter estimation is also the standard technique for data assimilation in high-dimensional systems, see (Geir Evensen 2009; G. Evensen 2009; Moradkhani et al. 2005).

However, introducing random walk dynamics to the parameters with fixed variance leads to a new dynamical stochastic system with properties that may be different from the properties of the original system. That implies that the variance of the random walk should be decreased, when the method is used for offline parameter estimation, cf. (Hürzeler and Künsch 2001).
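A minimal sketch of the augmented-state idea, including a random walk on the parameter whose variance is decreased over time. Everything specific here is an assumption for illustration and is not taken from the cited papers: the toy AR(1) model with one unknown coefficient `a`, the bootstrap particle filter, the geometric cooling schedule, and the names `simulate_ar1`, `augmented_particle_filter`, `tau0` and `cooling`.

```python
import numpy as np


def simulate_ar1(T=500, a_true=0.8, sigma_x=0.5, sigma_y=0.5, seed=1):
    """Simulate x_t = a x_{t-1} + noise, observed as y_t = x_t + noise."""
    rng = np.random.default_rng(seed)
    x = 0.0
    y = np.empty(T)
    for t in range(T):
        x = a_true * x + sigma_x * rng.normal()
        y[t] = x + sigma_y * rng.normal()
    return y


def augmented_particle_filter(y, n_particles=2000, sigma_x=0.5, sigma_y=0.5,
                              tau0=0.05, cooling=0.99, seed=0):
    """Bootstrap particle filter on the augmented state (x, a).

    Returns the filtered mean of the parameter a after each observation.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n_particles)    # initial hidden states
    a = rng.uniform(-1.0, 1.0, n_particles)  # initial parameter guesses
    a_means = np.empty(len(y))
    for t, y_t in enumerate(y):
        # Artificial random-walk dynamics on the parameter, with shrinking sd.
        tau = tau0 * cooling ** t
        a = a + tau * rng.normal(size=n_particles)
        # Propagate each hidden state using that particle's own parameter.
        x = a * x + sigma_x * rng.normal(size=n_particles)
        # Weight by the Gaussian observation density (up to a constant).
        log_w = -0.5 * ((y_t - x) / sigma_y) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Multinomial resampling of state and parameter together.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        x, a = x[idx], a[idx]
        a_means[t] = a.mean()
    return a_means


y = simulate_ar1()
a_means = augmented_particle_filter(y)
print("filtered estimate of a after all data:", a_means[-1])
```

Setting `cooling=1.0` keeps the random-walk variance fixed, so the filter should keep tracking a perpetually jittered parameter rather than settling on a fixed one. That is the altered dynamical system the passage above warns about.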

🚧

## Iterated filtering

Related: indirect inference. Precise relation will have to wait, since I currently do not care enough about indirect inference.

## Questions

• Ionides and King dominate my citations, at least for the frequentist stuff. Surely other people use this method too? But what are the keywords? This research is suspiciously concentrated in U Michigan, but the idea is not so esoteric. I think I am caught in a citation bubble.

Update: the oceanographers, e.g. (Evensen 2003), seem to do this with Bayes a lot.

• A lot of the variational filtering literature turns out to be about attempting this with, effectively, neural nets.

• Can I estimate regularisation this way, despite the lack of probabilistic interpretation? (leveraging the relation between Bayesian priors and regularisation parameters)

• How does this work with non-Markov systems? Do we need to bother, or can we just do the Hamiltonian trick and augment the state vector? Can we talk about mixing, or correlation decay? Should I then shoot for the new-wave mixing approaches of Kuznetsov and Mohri etc?

### Basic Construction

There are a few variations. We start with the basic continuous-time state space model.

Here we have an unobserved Markov state process $$x(t)$$ on $$\mathcal{X}$$ and an observation process $$y(t)$$ on $$\mathcal{Y}$$. For now they will be assumed to be finite-dimensional vectors over $$\mathbb{R}.$$ They additionally depend upon a vector of parameters $$\theta.$$ We observe the process at discrete times $$t(1:T)=(t_1, t_2,\dots, t_T),$$ and we write the observations as $$y(1:T)=(y(t_1), y(t_2),\dots, y(t_T)).$$

We presume our processes are completely specified by the following conditional densities (which might not have closed-form expressions):

The transition density

$$f(x(t_i)|x(t_{i-1}), \theta)$$
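For intuition, here is a sketch of one transition density that does have a closed form. The choice of process is my assumption, not something specified above: a scalar Ornstein-Uhlenbeck process $$dx = -\kappa x\,dt + \sigma\,dW$$ with hypothetical parameter vector $$\theta=(\kappa,\sigma)$$. Its transition density over a gap $$\Delta = t_i - t_{i-1}$$ is Gaussian, so irregular observation times are handled by passing the actual gap. The function names are illustrative only.

```python
import numpy as np


def ou_transition_moments(x_prev, dt, theta):
    """Mean and variance of x(t_i) given x(t_{i-1}) for an OU process.

    theta = (kappa, sigma) is a hypothetical parameter vector:
    kappa > 0 is the mean-reversion rate, sigma > 0 the diffusion scale.
    """
    kappa, sigma = theta
    decay = np.exp(-kappa * dt)
    mean = x_prev * decay
    var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * kappa)
    return mean, var


def ou_transition_logpdf(x_next, x_prev, dt, theta):
    """Log of f(x(t_i) | x(t_{i-1}), theta) for the OU process."""
    mean, var = ou_transition_moments(x_prev, dt, theta)
    return -0.5 * (np.log(2.0 * np.pi * var) + (x_next - mean) ** 2 / var)


def ou_transition_sample(x_prev, dt, theta, rng=None):
    """Draw x(t_i) from the transition density, given x(t_{i-1})."""
    rng = np.random.default_rng() if rng is None else rng
    mean, var = ou_transition_moments(x_prev, dt, theta)
    return mean + np.sqrt(var) * rng.normal(size=np.shape(mean))


# Simulate the state at irregular observation times t_1 < ... < t_T.
rng = np.random.default_rng(0)
theta = (2.0, 1.0)
times = np.cumsum(rng.uniform(0.1, 1.0, size=10))
x, t_prev, path = 0.0, 0.0, []
for t in times:
    x = ou_transition_sample(x, t - t_prev, theta, rng)
    path.append(x)
    t_prev = t
print(np.round(path, 3))
```

When no such closed form exists, only something like `ou_transition_sample` is available, and it is typically an approximate simulator such as an Euler scheme. That is the situation simulation-based filters are designed for.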

The observation density…

TBC.

## Awaiting filing

Recently enjoyed: Sahani Pathiraja’s state filter does something cool: it attempts to identify process model noise as a conditional nonparametric density of process errors, which may then be used to come up with some neat process models. I’m not convinced about her use of kernel density estimation, since kernel estimators scale badly precisely when you need them most, in high dimensions; but any nonparametric density estimator would, I assume, work, and that would be awesome.

## Implementations

pomp does state filtering inference in R.

For an example of doing this in Stan, see Sinhrks’ stan-statespace.

# Refs

Andrieu, Christophe, Arnaud Doucet, and Roman Holenstein. 2010. “Particle Markov Chain Monte Carlo Methods.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72 (3): 269–342. https://doi.org/10.1111/j.1467-9868.2009.00736.x.

Archer, Evan, Il Memming Park, Lars Buesing, John Cunningham, and Liam Paninski. 2015. “Black Box Variational Inference for State Space Models,” November. http://arxiv.org/abs/1511.07367.

Babtie, Ann C., Paul Kirk, and Michael P. H. Stumpf. 2014. “Topological Sensitivity Analysis for Systems Biology.” Proceedings of the National Academy of Sciences 111 (52): 18507–12. https://doi.org/10.1073/pnas.1414026112.

Bamler, Robert, and Stephan Mandt. 2017. “Structured Black Box Variational Inference for Latent Time Series Models,” July. http://arxiv.org/abs/1707.01069.

Becker, Philipp, Harit Pandya, Gregor Gebhardt, Cheng Zhao, C. James Taylor, and Gerhard Neumann. 2019. “Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces.” In International Conference on Machine Learning, 544–52. http://proceedings.mlr.press/v97/becker19a.html.

Bretó, Carles, Daihai He, Edward L. Ionides, and Aaron A. King. 2009. “Time Series Analysis via Mechanistic Models.” The Annals of Applied Statistics 3 (1): 319–48. https://doi.org/10.1214/08-AOAS201.

Brunton, Steven L., Joshua L. Proctor, and J. Nathan Kutz. 2016. “Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences 113 (15): 3932–7. https://doi.org/10.1073/pnas.1517384113.

Chung, Junyoung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C Courville, and Yoshua Bengio. 2015. “A Recurrent Latent Variable Model for Sequential Data.” In Advances in Neural Information Processing Systems 28, edited by C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, 2980–8. Curran Associates, Inc. http://papers.nips.cc/paper/5653-a-recurrent-latent-variable-model-for-sequential-data.pdf.

Del Moral, Pierre, Arnaud Doucet, and Ajay Jasra. 2006. “Sequential Monte Carlo Samplers.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 (3): 411–36. https://doi.org/10.1111/j.1467-9868.2006.00553.x.

———. 2011. “An Adaptive Sequential Monte Carlo Method for Approximate Bayesian Computation.” Statistics and Computing 22 (5): 1009–20. https://doi.org/10.1007/s11222-011-9271-y.

Doucet, Arnaud, Nando Freitas, and Neil Gordon. 2001. Sequential Monte Carlo Methods in Practice. New York, NY: Springer New York. http://public.eblib.com/choice/publicfullrecord.aspx?p=3087052.

Doucet, Arnaud, Pierre E. Jacob, and Sylvain Rubenthaler. 2013. “Derivative-Free Estimation of the Score Vector and Observed Information Matrix with Application to State-Space Models,” April. http://arxiv.org/abs/1304.5768.

Drovandi, Christopher C., Anthony N. Pettitt, and Roy A. McCutchan. 2016. “Exact and Approximate Bayesian Inference for Low Integer-Valued Time Series Models with Intractable Likelihoods.” Bayesian Analysis 11 (2): 325–52. https://doi.org/10.1214/15-BA950.

Durbin, J., and S. J. Koopman. 2012. Time Series Analysis by State Space Methods. 2nd ed. Oxford Statistical Science Series 38. Oxford: Oxford University Press.

Evensen, G. 2009. “The Ensemble Kalman Filter for Combined State and Parameter Estimation.” IEEE Control Systems 29 (3): 83–104. https://doi.org/10.1109/MCS.2009.932223.

Evensen, Geir. 2009. Data Assimilation - the Ensemble Kalman Filter. Berlin; Heidelberg: Springer. http://link.springer.com/book/10.1007%2F978-3-642-03711-5.

———. 2003. “The Ensemble Kalman Filter: Theoretical Formulation and Practical Implementation.” Ocean Dynamics 53 (4): 343–67. https://doi.org/10.1007/s10236-003-0036-9.

Fearnhead, Paul, and Hans R. Künsch. 2018. “Particle Filters and Data Assimilation.” Annual Review of Statistics and Its Application 5 (1): 421–49. https://doi.org/10.1146/annurev-statistics-031017-100232.

He, Daihai, Edward L. Ionides, and Aaron A. King. 2010. “Plug-and-Play Inference for Disease Dynamics: Measles in Large and Small Populations as a Case Study.” Journal of the Royal Society Interface 7 (43): 271–83. https://doi.org/10.1098/rsif.2009.0151.

Heinonen, Markus, and Florence d’Alché-Buc. 2014. “Learning Nonparametric Differential Equations with Operator-Valued Kernels and Gradient Matching,” November. http://arxiv.org/abs/1411.5172.

Hürzeler, Markus, and Hans R. Künsch. 2001. “Approximating and Maximising the Likelihood for a General State-Space Model.” In Sequential Monte Carlo Methods in Practice, 159–75. Statistics for Engineering and Information Science. Springer, New York, NY. https://doi.org/10.1007/978-1-4757-3437-9_8.

Ingraham, John, and Debora Marks. 2017. “Variational Inference for Sparse and Undirected Models.” In PMLR, 1607–16. http://proceedings.mlr.press/v70/ingraham17a.html.

Ionides, Edward L., Anindya Bhadra, Yves Atchadé, and Aaron King. 2011. “Iterated Filtering.” The Annals of Statistics 39 (3): 1776–1802. https://doi.org/10.1214/11-AOS886.

Ionides, Edward L., Dao Nguyen, Yves Atchadé, Stilian Stoev, and Aaron A. King. 2015. “Inference for Dynamic and Latent Variable Models via Iterated, Perturbed Bayes Maps.” Proceedings of the National Academy of Sciences 112 (3): 719–24. https://doi.org/10.1073/pnas.1410597112.

Ionides, E. L., C. Bretó, and A. A. King. 2006. “Inference for Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences 103 (49): 18438–43. https://doi.org/10.1073/pnas.0603181103.

Kantas, N., A. Doucet, S. S. Singh, and J. M. Maciejowski. 2009. “An Overview of Sequential Monte Carlo Methods for Parameter Estimation in General State-Space Models.” IFAC Proceedings Volumes, 15th IFAC Symposium on System Identification, 42 (10): 774–85. https://doi.org/10.3182/20090706-3-FR-2004.00129.

Kantas, Nikolas, Arnaud Doucet, Sumeetpal S. Singh, Jan Maciejowski, and Nicolas Chopin. 2015. “On Particle Methods for Parameter Estimation in State-Space Models.” Statistical Science 30 (3): 328–51. https://doi.org/10.1214/14-STS511.

Kitagawa, Genshiro. 1998. “A Self-Organizing State-Space Model.” Journal of the American Statistical Association, 1203–15. http://www.jstor.org/stable/2669862.

Krishnan, Rahul G., Uri Shalit, and David Sontag. 2015. “Deep Kalman Filters.” arXiv Preprint arXiv:1511.05121. https://arxiv.org/abs/1511.05121.

———. 2017. “Structured Inference Networks for Nonlinear State Space Models.” In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2101–9. http://arxiv.org/abs/1609.09869.

Le, Tuan Anh, Maximilian Igl, Tom Jin, Tom Rainforth, and Frank Wood. 2017. “Auto-Encoding Sequential Monte Carlo.” arXiv Preprint arXiv:1705.10306. https://arxiv.org/abs/1705.10306.

Lele, S. R., B. Dennis, and F. Lutscher. 2007. “Data Cloning: Easy Maximum Likelihood Estimation for Complex Ecological Models Using Bayesian Markov Chain Monte Carlo Methods.” Ecology Letters 10 (7): 551. https://doi.org/10.1111/j.1461-0248.2007.01047.x.

Lele, Subhash R., Khurram Nadeem, and Byron Schmuland. 2010. “Estimability and Likelihood Inference for Generalized Linear Mixed Models Using Data Cloning.” Journal of the American Statistical Association 105 (492): 1617–25. https://doi.org/10.1198/jasa.2010.tm09757.

Lindström, Erik, Edward Ionides, Jan Frydendall, and Henrik Madsen. 2012. “Efficient Iterated Filtering.” In IFAC-PapersOnLine (System Identification, Volume 16), 45:1785–90. 16th IFAC Symposium on System Identification. IFAC & Elsevier Ltd. https://doi.org/10.3182/20120711-3-BE-2027.00300.

Lindström, Erik, Jonas Ströjby, Mats Brodén, Magnus Wiktorsson, and Jan Holst. 2008. “Sequential Calibration of Options.” Computational Statistics & Data Analysis 52 (6): 2877–91. https://doi.org/10.1016/j.csda.2007.08.009.

Liu, Jane, and Mike West. 2001. “Combined Parameter and State Estimation in Simulation-Based Filtering.” In Sequential Monte Carlo Methods in Practice, 197–223. Statistics for Engineering and Information Science. Springer, New York, NY. https://doi.org/10.1007/978-1-4757-3437-9_10.

Ljung, L. 1979. “Asymptotic Behavior of the Extended Kalman Filter as a Parameter Estimator for Linear Systems.” IEEE Transactions on Automatic Control 24 (1): 36–50. https://doi.org/10.1109/TAC.1979.1101943.

Ljung, Lennart, Georg Ch Pflug, and Harro Walk. 2012. Stochastic Approximation and Optimization of Random Systems. Vol. 17. Birkhäuser. https://books.google.ch/books?hl=en&lr=&id=9Fr2BwAAQBAJ&oi=fnd&pg=PA2&ots=rPS2wp3kUH&sig=UKiDTNaSjUnznmD9OUtipdRK7nY.

Ljung, Lennart, and Torsten Söderström. 1983. Theory and Practice of Recursive Identification. The MIT Press Series in Signal Processing, Optimization, and Control 4. Cambridge, Mass: MIT Press.

Maddison, Chris J., Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, and Yee Whye Teh. 2017. “Filtering Variational Objectives.” arXiv Preprint arXiv:1705.09279. https://arxiv.org/abs/1705.09279.

Moradkhani, Hamid, Soroosh Sorooshian, Hoshin V. Gupta, and Paul R. Houser. 2005. “Dual State–Parameter Estimation of Hydrological Models Using Ensemble Kalman Filter.” Advances in Water Resources 28 (2): 135–47. https://doi.org/10.1016/j.advwatres.2004.09.002.

Naesseth, Christian A., Scott W. Linderman, Rajesh Ranganath, and David M. Blei. 2017. “Variational Sequential Monte Carlo.” arXiv Preprint arXiv:1705.11140. https://arxiv.org/abs/1705.11140.

Oliva, Junier B., Barnabas Poczos, and Jeff Schneider. 2017. “The Statistical Recurrent Unit,” March. http://arxiv.org/abs/1703.00381.

Sjöberg, Jonas, Qinghua Zhang, Lennart Ljung, Albert Benveniste, Bernard Delyon, Pierre-Yves Glorennec, Håkan Hjalmarsson, and Anatoli Juditsky. 1995. “Nonlinear Black-Box Modeling in System Identification: A Unified Overview.” Automatica, Trends in System Identification, 31 (12): 1691–1724. https://doi.org/10.1016/0005-1098(95)00120-8.

Söderström, T., and P. Stoica, eds. 1988. System Identification. Upper Saddle River, NJ, USA: Prentice-Hall, Inc.

Tallec, Corentin, and Yann Ollivier. 2017. “Unbiasing Truncated Backpropagation Through Time,” May. http://arxiv.org/abs/1705.08209.

Tippett, Michael K., Jeffrey L. Anderson, Craig H. Bishop, Thomas M. Hamill, and Jeffrey S. Whitaker. 2003. “Ensemble Square Root Filters.” Monthly Weather Review 131 (7): 1485–90. http://iri.columbia.edu/~tippett/pubs/srf_revised_again_submit.pdf.

Werbos, Paul J. 1988. “Generalization of Backpropagation with Application to a Recurrent Gas Market Model.” Neural Networks 1 (4): 339–56. https://doi.org/10.1016/0893-6080(88)90007-X.