Kalman-Bucy filter and variants, recursive estimation, predictive state models, data assimilation. A particular sub-field of signal processing for models with hidden state.

In statistical terms, state filters are a kind of online-updating hierarchical model for sequential observations of a dynamical system where the random state is unobserved, but you can get an optimal estimate of it based on incoming measurements and known parameters.
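
To make "online-updating" concrete, here is the generic filtering recursion in notation that is my own assumption rather than anything fixed by the sources below: $x_t$ is the hidden state, $y_t$ the observation at time $t$, and the model is Markov in the state with observations that depend only on the current state. Each new measurement triggers a predict step and then an update step:

$$
\begin{aligned}
p(x_t \mid y_{1:t-1}) &= \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, \mathrm{d}x_{t-1}, \\
p(x_t \mid y_{1:t}) &\propto p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1}).
\end{aligned}
$$

The various filters differ mostly in how they represent these densities and approximate the integral.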

A unifying feature of all these is that, by assuming a sparse influence graph between observations and dynamics, you can estimate behaviour using efficient message passing.

This is a twin problem to optimal control.

## Linear systems

In Kalman filters *per se* you are usually concerned with multivariate real vector signals representing different axes of some telemetry data problem.
In the degenerate case, where there is no observation noise, you can just
design a linear filter.

The classic Kalman filter (Kalm60) assumes a linear model with Gaussian noise, although it might work with not-quite Gaussian, not-quite linear models if you prod it. You can extend this flavour to somewhat more general dynamics.
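
As a minimal sketch of the linear-Gaussian case, here is one predict-update cycle in NumPy. The function name `kalman_step`, the symbols `A`, `Q`, `H`, `R`, and the constant-velocity toy model with its noise levels are all my own illustrative assumptions, not taken from Kalm60.

```python
import numpy as np

def kalman_step(m, P, y, A, Q, H, R):
    """One predict-update cycle for x_t = A x_{t-1} + N(0, Q), y_t = H x_t + N(0, R)."""
    # Predict: push the previous mean and covariance through the dynamics.
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction using the innovation y - H m_pred.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T     # gain K = P_pred H^T S^{-1} (S, P_pred symmetric)
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = (np.eye(len(m)) - K @ H) @ P_pred
    return m_new, P_new

# Hypothetical toy problem: noisy position readings of a constant-velocity target.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # state is (position, velocity)
Q = 0.001 * np.eye(2)                    # assumed process noise covariance
H = np.array([[1.0, 0.0]])               # only position is observed
R = np.array([[1.0]])                    # assumed observation noise covariance

x = np.array([0.0, 1.0])                 # true initial state
m, P = np.zeros(2), 10.0 * np.eye(2)     # vague initial belief
truth, obs, est = [], [], []
for _ in range(200):
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)
    y = H @ x + rng.multivariate_normal(np.zeros(1), R)
    m, P = kalman_step(m, P, y, A, Q, H, R)
    truth.append(x[0]); obs.append(y[0]); est.append(m[0])

truth, obs, est = map(np.array, (truth, obs, est))
rmse_obs = np.sqrt(np.mean((obs - truth) ** 2))
rmse_filt = np.sqrt(np.mean((est - truth) ** 2))
print(f"raw observation RMSE {rmse_obs:.3f}, filtered RMSE {rmse_filt:.3f}")
```

The covariance update uses the short form `(I - K H) P_pred`, which is fine for a sketch; the Joseph form is the usual choice when numerical symmetry matters.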

If you are doing telemetry then you probably know *a priori* that your model is not linear, in which case extensions are advisable.

(NB I’m conflating linear observation and linear process models here, but this is fine for a link list, I think.)

## Non-linear dynamical systems

Cute exercise: you can derive an analytic Kalman-style filter for any noise and process dynamics that admit a Bayesian conjugate prior, and this leads to filters with nonlinear behaviour. Multivariate distributions are a bit of a mess for non-Gaussians, though, and a beta-Kalman filter feels contrived.

Upshot is, the non-linear extensions don’t usually rely on non-Gaussian conjugate distributions and analytic forms, but rather do some Gaussian/linear approximation, or use randomised methods such as particle filters.

For some examples of doing this in Stan see Sinhrks’ stan-statespace.
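
For the randomised route, a bootstrap particle filter is about the simplest instance. The sketch below is only that: the function name, the scalar dynamics `f`, the noise scales and the particle count are hypothetical choices of mine, and it resamples multinomially at every step, which is simple but not the most efficient scheme.

```python
import numpy as np

def bootstrap_particle_filter(ys, f, loglik, init_particles, q_std, rng):
    """Filtered posterior means for x_t = f(x_{t-1}) + N(0, q_std^2), y_t ~ p(y | x_t)."""
    particles = init_particles.copy()
    n = len(particles)
    means = []
    for y in ys:
        # Propose from the dynamics (the "bootstrap" proposal).
        particles = f(particles) + q_std * rng.standard_normal(n)
        # Weight each particle by the observation likelihood, stabilised in log space.
        logw = loglik(y, particles)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * particles))
        # Multinomial resampling to fight weight degeneracy.
        particles = particles[rng.choice(n, size=n, p=w)]
    return np.array(means)

# Hypothetical nonlinear toy model with Gaussian observation noise.
rng = np.random.default_rng(1)
f = lambda x: x + 0.5 * np.sin(x)
q_std, r_std = 0.3, 1.0

x, xs, ys = 0.5, [], []
for _ in range(200):
    x = f(x) + q_std * rng.standard_normal()
    xs.append(x)
    ys.append(x + r_std * rng.standard_normal())
xs, ys = np.array(xs), np.array(ys)

loglik = lambda y, x: -0.5 * ((y - x) / r_std) ** 2   # Gaussian, up to a constant
means = bootstrap_particle_filter(ys, f, loglik, rng.standard_normal(1000), q_std, rng)

rmse_obs = np.sqrt(np.mean((ys - xs) ** 2))
rmse_pf = np.sqrt(np.mean((means - xs) ** 2))
print(f"raw observation RMSE {rmse_obs:.3f}, particle filter RMSE {rmse_pf:.3f}")
```

Nothing here needs Gaussian noise or a linear observation model; swapping `loglik` for another log-density is the whole point of the method.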

## Discrete state Hidden Markov models

TBD. Viterbi algorithm.
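
Until this section is filled in, here is a minimal log-space sketch of the Viterbi algorithm for a discrete HMM. The argument names and the two-state, three-symbol toy model are illustrative assumptions of mine.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely state path given initial, transition and emission log-probabilities."""
    T, n_states = len(obs), log_pi.shape[0]
    delta = np.empty((T, n_states))            # best log-score of any path ending in each state
    psi = np.zeros((T, n_states), dtype=int)   # back-pointers to the best predecessor
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to state j.
        scores = delta[t - 1][:, None] + log_A
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1].max()

# Hypothetical toy HMM: 2 hidden states, 3 observation symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
obs = [0, 1, 2]

path, logp = viterbi(np.log(pi), np.log(A), np.log(B), obs)
print(path.tolist(), np.exp(logp))   # path [0, 0, 1] with joint probability 0.01512
```

Working in log space avoids underflow on long sequences, which is the main practical trap with the textbook product form.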

## Variational state filters

See Variational state filters.

## Other interesting state filters

Note that state filters can also do approximate Gaussian process regression, apparently. See Särkkä’s work.

## State filter inference

How about learning the *parameters* of the model generating your states?
Ways that you can do this in dynamical systems include
basic linear system identification and general system identification.
But can you identify the parameters (not just hidden states) with a state filter?
Yes, see recursive estimation.
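
One simple way to see the connection, sketched under my own assumptions: recursive least squares is a Kalman filter whose "state" is a static parameter vector. The function name `rls_update` and the AR(1) toy problem are hypothetical, and the series here is fully observed, so this illustrates only the parameter-tracking half of the story, not joint state and parameter estimation.

```python
import numpy as np

def rls_update(theta, P, phi, y, r=1.0):
    """Kalman update for a static parameter theta with scalar y = phi . theta + N(0, r)."""
    s = phi @ P @ phi + r                  # innovation variance
    k = P @ phi / s                        # gain
    theta = theta + k * (y - phi @ theta)
    P = P - np.outer(k, phi @ P)
    return theta, P

# Hypothetical example: recover the coefficient of an AR(1) process online.
rng = np.random.default_rng(2)
a_true = 0.8
theta, P = np.zeros(1), 100.0 * np.eye(1)  # vague prior on the coefficient
y_prev = 0.0
for _ in range(2000):
    y = a_true * y_prev + rng.standard_normal()
    theta, P = rls_update(theta, P, np.array([y_prev]), y)
    y_prev = y

print(f"estimated AR coefficient: {theta[0]:.3f}")
```

When the state is hidden as well, a common trick is to append the parameters to the state vector and run a nonlinear filter on the augmented system.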

## Refs

- Robe11: Andrew N. Robertson (2011) A Bayesian approach to drum tracking.
- BeZa76: A. J. Berkhout, P. R. Zaanen (1976) A Comparison Between Wiener Filtering, Kalman Filtering, and Deterministic Least Squares Estimation. *Geophysical Prospecting*, 24(1), 141–197.
- Bilm98: Jeff A. Bilmes (1998) A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. *International Computer Science Institute*, 4(510), 126.
- JuUD95: S.J. Julier, J.K. Uhlmann, H.F. Durrant-Whyte (1995) A new approach for filtering nonlinear systems. In American Control Conference, Proceedings of the 1995 (Vol. 3, pp. 1628–1632 vol.3).
- Kalm60: R. E. Kalman (1960) A New Approach to Linear Filtering and Prediction Problems. *Journal of Basic Engineering*, 82(1), 35.
- HeDG15: Ahmed Hefny, Carlton Downey, Geoffrey Gordon (2015) A New View of Predictive State Methods for Dynamical System Learning. *ArXiv:1505.05310 [Cs, Stat]*.
- CKDG15: Junyoung Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C Courville, Yoshua Bengio (2015) A Recurrent Latent Variable Model for Sequential Data. In Advances in Neural Information Processing Systems 28 (pp. 2980–2988). Curran Associates, Inc.
- John12: Matthew James Johnson (2012) A Simple Explanation of A Spectral Algorithm for Learning Hidden Markov Models. *ArXiv:1204.2477 [Cs, Stat]*.
- HsKZ12: Daniel Hsu, Sham M. Kakade, Tong Zhang (2012) A spectral algorithm for learning Hidden Markov Models. *Journal of Computer and System Sciences*, 78(5), 1460–1480.
- Rabi89: L.R. Rabiner (1989) A tutorial on hidden Markov models and selected applications in speech recognition. *Proceedings of the IEEE*, 77(2), 257–286.
- AMGC02: M. S. Arulampalam, S. Maskell, N. Gordon, T. Clapp (2002) A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. *IEEE Transactions on Signal Processing*, 50(2), 174–188.
- Kail74: T. Kailath (1974) A view of three decades of linear filtering theory. *IEEE Transactions on Information Theory*, 20(2), 146–181.
- KMBT11: Nicholas Kalouptsidis, Gerasimos Mileounis, Behtash Babadi, Vahid Tarokh (2011) Adaptive algorithms for sparse system identification. *Signal Processing*, 91(8), 1910–1919.
- KaGe71: T. Kailath, R. Geesey (1971) An innovations approach to least squares estimation–Part IV: Recursive estimation given lumped covariance functions. *IEEE Transactions on Automatic Control*, 16(6), 720–727.
- KaGe73: T. Kailath, R. Geesey (1973) An innovations approach to least-squares estimation–Part V: Innovations representations and recursive estimation in colored noise. *IEEE Transactions on Automatic Control*, 18(5), 435–453.
- GeKa73: M. Gevers, T. Kailath (1973) An innovations approach to least-squares estimation–Part VI: Discrete-time innovations representations and recursive estimation. *IEEE Transactions on Automatic Control*, 18(6), 588–600.
- AaKa73: H. Aasnaes, T. Kailath (1973) An innovations approach to least-squares estimation–Part VII: Some applications of vector autoregressive-moving average models. *IEEE Transactions on Automatic Control*, 18(6), 601–607.
- RaJu86: L. Rabiner, B.H. Juang (1986) An introduction to hidden Markov models. *IEEE ASSP Magazine*, 3(1), 4–16.
- CoKo07: Jacques J. F. Commandeur, Siem Jan Koopman (2007) *An Introduction to State Space Time Series Analysis*. Oxford; New York: Oxford University Press.
- KaDu72: T. Kailath, D. Duttweiler (1972) An RKHS approach to detection and estimation problems–III: Generalized innovations representations and a likelihood-ratio formula. *IEEE Transactions on Information Theory*, 18(6), 730–745.
- KaWe75: T. Kailath, H. Weinert (1975) An RKHS approach to detection and estimation problems–II: Gaussian signal detection. *IEEE Transactions on Information Theory*, 21(1), 15–23.
- Perr10: T.S. Perry (2010) Andrew Viterbi’s fabulous formula [Medal of Honor]. *IEEE Spectrum*, 47(5), 47–50.
- BaRO17: Philipp Batz, Andreas Ruttor, Manfred Opper (2017) Approximate Bayes learning of stochastic differential equations. *ArXiv:1702.05390 [Physics, Stat]*.
- KSSP15: Anna Kutschireiter, Simone C Surace, Henning Sprekeler, Jean-Pascal Pfister (2015) Approximate nonlinear filtering with a recurrent neural network. *BMC Neuroscience*, 16(Suppl 1), P196.
- FiSi16: Axel Finke, Sumeetpal S. Singh (2016) Approximate Smoothing and Parameter Estimation in High-Dimensional State-Space Models. *ArXiv:1606.08650 [Stat]*.
- KaSä16: Toni Karvonen, Simo Särkkä (2016) Approximate state-space Gaussian processes via spectral transformation.
- LjKa76: L. Ljung, T. Kailath (1976) Backwards Markovian models for second-order stochastic processes (Corresp). *IEEE Transactions on Information Theory*, 22(4), 488–491.
- Särk13: Simo Särkkä (2013) *Bayesian filtering and smoothing*. Cambridge, U.K.; New York: Cambridge University Press.
- RoPl07: Andrew Robertson, Mark Plumbley (2007) B-Keeper: A Beat-tracker for Live Performance. In Proceedings of the 7th International Conference on New Interfaces for Musical Expression (pp. 234–237). New York, NY, USA: ACM.
- RoRu09: Alejandro Rodriguez, Esther Ruiz (2009) Bootstrap prediction intervals in state–space models. *Journal of Time Series Analysis*, 30(2), 167–178.
- CaRS15: Ben Cassidy, Caroline Rae, Victor Solo (2015) Brain Activity: Connectivity, Sparsity, and Mutual Information. *IEEE Transactions on Medical Imaging*, 34(4), 846–860.
- CrHu99: Noel Cressie, Hsin-Cheng Huang (1999) Classes of Nonseparable, Spatio-Temporal Stationary Covariance Functions. *Journal of the American Statistical Association*, 94(448), 1330–1339.
- LeBS09: Jing Lei, Peter Bickel, Chris Snyder (2009) Comparison of Ensemble Kalman Filters under Non-Gaussianity. *Monthly Weather Review*, 138(4), 1293–1306.
- Carm14: Avishy Y. Carmi (2014) Compressive System Identification. In Compressed Sensing & Sparse Filtering (pp. 281–324). Springer Berlin Heidelberg.
- Carm13: Avishy Y. Carmi (2013) Compressive system identification: sequential methods and entropy bounds. *Digital Signal Processing*, 23(3), 751–770.
- BaSa13: Heather Battey, Alessio Sancetta (2013) Conditional estimation for dependent functional data. *Journal of Multivariate Analysis*, 120, 1–17.
- EdAn15: David Edwards, Smitha Ankinakatte (2015) Context-specific graphical models for discrete longitudinal data. *Statistical Modelling*, 15(4), 301–325.
- DoJR13: Arnaud Doucet, Pierre E. Jacob, Sylvain Rubenthaler (2013) Derivative-Free Estimation of the Score Vector and Observed Information Matrix with Application to State-Space Models. *ArXiv:1304.5768 [Stat]*.
- TaPa16: Shahin Tavakoli, Victor M. Panaretos (2016) Detecting and Localizing Differences in Functional Time Series Dynamics: A Case Study in Molecular Biophysics. *Journal of the American Statistical Association*, 1–31.
- Alli92: S. Alliney (1992) Digital filters as absolute norm regularizers. *IEEE Transactions on Signal Processing*, 40(6), 1548–1562.
- BrPK16: Steven L. Brunton, Joshua L. Proctor, J. Nathan Kutz (2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. *Proceedings of the National Academy of Sciences*, 113(15), 3932–3937.
- Olfa05: R. Olfati-Saber (2005) Distributed Kalman Filter with Embedded Consensus Filters. In 44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC ’05 (pp. 8179–8184). Seville, Spain: IEEE.
- EFBS04: U Eden, L Frank, R Barbieri, V Solo, E Brown (2004) Dynamic Analysis of Neural Encoding by Point Process Adaptive Filtering. *Neural Computation*, 16(5), 971–998.
- ChBR16: Adam Charles, Aurele Balavoine, Christopher Rozell (2016) Dynamic Filtering of Time-Varying Sparse Signals via l1 Minimization. *IEEE Transactions on Signal Processing*, 64(21), 5644–5656.
- AnKo85: Craig F. Ansley, Robert Kohn (1985) Estimation, Filtering, and Smoothing in State Space Models with Incompletely Specified Initial Conditions. *The Annals of Statistics*, 13(4), 1286–1316.
- PPRS15: Alexandre Papadopoulos, François Pachet, Pierre Roy, Jason Sakellariou (2015) Exact Sampling for Regular and Markov Constraints with Belief Propagation. In Principles and Practice of Constraint Programming (pp. 341–350). Switzerland: Springer, Cham.
- KoDu00: S. J. Koopman, J. Durbin (2000) Fast Filtering and Smoothing for Multivariate State Space Models. *Journal of Time Series Analysis*, 21(3), 281–296.
- GoJa15: Christian Gourieroux, Joann Jasiak (2015) Filtering, Prediction and Simulation Methods for Noncausal Processes. *Journal of Time Series Analysis*, n/a-n/a.
- HaLu14: Andrew Harvey, Alessandra Luati (2014) Filtering With Heavy Tails. *Journal of the American Statistical Association*, 109(507), 1112–1122.
- CrSK10: Noel Cressie, Tao Shi, Emily L. Kang (2010) Fixed Rank Filtering for Spatio-Temporal Data. *Journal of Computational and Graphical Statistics*, 19(3), 724–745.
- HsSc11: Roger Hsiao, Tanja Schultz (2011) Generalized Baum-Welch Algorithm and Its Implication to a New Extended Baum-Welch Algorithm. In Proceedings of INTERSPEECH.
- Eddy96: Sean R Eddy (1996) Hidden Markov models. *Current Opinion in Structural Biology*, 6(3), 361–365.
- Fras08: Andrew M. Fraser (2008) *Hidden Markov models and dynamical systems*. Philadelphia, PA: Society for Industrial and Applied Mathematics.
- WiBC98: Christopher K. Wikle, L. Mark Berliner, Noel Cressie (1998) Hierarchical Bayesian space-time models. *Environmental and Ecological Statistics*, 5(2), 117–154.
- IoBK06: E. L. Ionides, C. Bretó, A. A. King (2006) Inference for nonlinear dynamical systems. *Proceedings of the National Academy of Sciences*, 103(49), 18438–18443.
- SäHa12: Simo Särkkä, Jouni Hartikainen (2012) Infinite-Dimensional Kalman Filtering Approach to Spatio-Temporal Gaussian Process Regression. In Journal of Machine Learning Research.
- IBAK11: Edward L. Ionides, Anindya Bhadra, Yves Atchadé, Aaron King (2011) Iterated filtering. *The Annals of Statistics*, 39(3), 1776–1802.
- MaKP98: J. H. Manton, V. Krishnamurthy, H. V. Poor (1998) James-Stein state filtering algorithms. *IEEE Transactions on Signal Processing*, 46(9), 2431–2447.
- HaSä10: J. Hartikainen, S. Särkkä (2010) Kalman filtering and smoothing solutions to temporal Gaussian process regression models. In 2010 IEEE International Workshop on Machine Learning for Signal Processing (pp. 379–384).
- LeMT09: François Le Gland, Valerie Monbet, Vu-Duc Tran (2009) Large sample asymptotics for the ensemble Kalman filter. , 25.
- Sore70: H.W. Sorenson (1970) Least-squares estimation: from Gauss to Kalman. *IEEE Spectrum*, 7(7), 63–68.
- CaFe08: Simon Cauchemez, Neil M. Ferguson (2008) Likelihood-based estimation of continuous-time epidemic models from time-series data: application to measles transmission in London. *Journal of The Royal Society Interface*, 5(25), 885–897.
- HMCH08: X. Hong, R. J. Mitchell, S. Chen, C. J. Harris, K. Li, G. W. Irwin (2008) Model selection approaches for non-linear system identification: a review. *International Journal of Systems Science*, 39(10), 925–946.
- Kita96: Genshiro Kitagawa (1996) Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models. *Journal of Computational and Graphical Statistics*, 5(1), 1–25.
- ThLa98: Sebastian Thrun, John Langford (1998) Monte carlo hidden markov models. DTIC Document.
- ThLF99: Sebastian Thrun, John Langford, Dieter Fox (1999) Monte carlo hidden markov models: Learning non-parametric models of partially observable stochastic processes. In Proceedings of the International Conference on Machine Learning. Bled, Slovenia.
- DuKo97: J. Durbin, S. J. Koopman (1997) Monte Carlo maximum likelihood estimation for non-Gaussian state space models. *Biometrika*, 84(3), 669–684.
- HaLR18: Eldad Haber, Felix Lucka, Lars Ruthotto (2018) Never look back - A modified EnKF method and its application to the training of neural networks without back propagation. *ArXiv:1805.08034 [Cs, Math]*.
- Kita87: Genshiro Kitagawa (1987) Non-Gaussian State-Space Modeling of Nonstationary Time Series. *Journal of the American Statistical Association*, 82(400), 1032–1041.
- SeDK75: A. Segall, M. Davis, T. Kailath (1975) Nonlinear filtering with counting observations. *IEEE Transactions on Information Theory*, 21(2), 143–149.
- SäHa13: S. Särkkä, J. Hartikainen (2013) Non-linear noise adaptive Kalman filtering via variational Bayes. In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP) (pp. 1–6).
- Kalm59: R. Kalman (1959) On the general theory of control systems. *IRE Transactions on Automatic Control*, 4(3), 110–110.
- DeKT17: P. Del Moral, A. Kurtzmann, J. Tugaut (2017) On the Stability and the Uniform Propagation of Chaos of a Class of Extended Ensemble Kalman-Bucy Filters. *SIAM Journal on Control and Optimization*, 55(1), 119–155.
- BiDe16: Adrian N. Bishop, Pierre Del Moral (2016) On the stability of Kalman-Bucy diffusion processes. *ArXiv:1610.04686 [Math]*.
- Särk07: Simo Särkkä (2007) On Unscented Kalman Filtering for State Estimation of Continuous-Time Nonlinear Systems. *IEEE Transactions on Automatic Control*, 52(9), 1631–1641.
- SuPf16: Simone Carlo Surace, Jean-Pascal Pfister (2016) Online Maximum Likelihood Estimation of the Parameters of Partially Observed Diffusion Processes.
- Olli17: Yann Ollivier (2017) Online Natural Gradient as a Kalman Filter. *ArXiv:1703.00209 [Math, Stat]*.
- FeKü18: Paul Fearnhead, Hans R. Künsch (2018) Particle Filters and Data Assimilation. *Annual Review of Statistics and Its Application*, 5(1), 421–449.
- HoLH16: Elizabeth Hou, Earl Lawrence, Alfred O. Hero (2016) Penalized Ensemble Kalman Filters for High Dimensional Non-linear Systems. *ArXiv:1610.00195 [Physics, Stat]*.
- StMu13: Nicolas Städler, Sach Mukherjee (2013) Penalized estimation in high-dimensional hidden Markov models with state-specific graphical models. *The Annals of Applied Statistics*, 7(4), 2157–2179.
- MiOl00: Charles A. Micchelli, Peder Olsen (2000) Penalized maximum-likelihood estimation, the Baum–Welch algorithm, diagonal balancing of symmetric matrices and applications to training acoustic data. *Journal of Computational and Applied Mathematics*, 119(1–2), 301–331.
- RoSD13: Andrew Robertson, Adam Stark, Matthew EP Davies (2013) Percussive beat tracking using real-time median filtering. In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.
- BiDP17: Adrian N. Bishop, Pierre Del Moral, Sahani D. Pathiraja (2017) Perturbations and Projections of Kalman-Bucy Semigroups Motivated by Methods in Data Assimilation. *ArXiv:1701.05978 [Math]*.
- HeIK10: Daihai He, Edward L. Ionides, Aaron A. King (2010) Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. *Journal of The Royal Society Interface*, 7(43), 271–283.
- ScWZ16: Aaron Schein, Hanna Wallach, Mingyuan Zhou (2016) Poisson-Gamma dynamical systems. In Advances In Neural Information Processing Systems (pp. 5006–5014).
- ClBj04: James S. Clark, Ottar N. Bjørnstad (2004) Population time series: process variability, observation errors, missing values, lags, and hidden states. *Ecology*, 85(11), 3140–3150.
- KoMT11: Hisashi Kobayashi, Brian L. Mark, William Turin (2011) *Probability, Random Processes, and Statistical Analysis: Applications to Communications, Signal Processing, Queueing Theory and Mathematical Finance*. Cambridge University Press.
- RoSP11: Andrew Robertson, Adam M. Stark, Mark D. Plumbley (2011) Real-time visual beat tracking using a comb filter matrix. In Proceedings of the International Computer Music Conference 2011.
- ChHe12: Y. Chen, A. O. Hero (2012) Recursive ℓ1,∞ Group Lasso. *IEEE Transactions on Signal Processing*, 60(8), 3978–3987.
- SäNu09: Simo Särkkä, A. Nummenmaa (2009) Recursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations. *IEEE Transactions on Automatic Control*, 54(3), 596–600.
- Kail71: T. Kailath (1971) RKHS approach to detection and estimation problems–I: Deterministic signals in Gaussian noise. *IEEE Transactions on Information Theory*, 17(5), 530–549.
- DuKa73a: D. Duttweiler, T. Kailath (1973a) RKHS approach to detection and estimation problems–IV: Non-Gaussian detection. *IEEE Transactions on Information Theory*, 19(1), 19–28.
- DuKa73b: D. Duttweiler, T. Kailath (1973b) RKHS approach to detection and estimation problems–V: Parameter estimation. *IEEE Transactions on Information Theory*, 19(1), 29–37.
- LjKF75: L. Ljung, T. Kailath, B. Friedlander (1975) Scattering theory and linear least squares estimation: Part I: Continuous-time problems. In 1975 IEEE Conference on Decision and Control including the 14th Symposium on Adaptive Processes (pp. 55–56).
- FrKL75: B. Friedlander, T. Kailath, L. Ljung (1975) Scattering theory and linear least squares estimation: Part II: Discrete-time problems. In 1975 IEEE Conference on Decision and Control including the 14th Symposium on Adaptive Processes (pp. 57–58).
- FSPW16: Marco Fraccaro, Søren Kaae Sønderby, Ulrich Paquet, Ole Winther (2016) Sequential Neural Models with Stochastic Layers. In Advances in Neural Information Processing Systems 29 (pp. 2199–2207). Curran Associates, Inc.
- KiGe96: Genshiro Kitagawa, Will Gersch (1996) *Smoothness Priors Analysis of Time Series*. New York, NY: Springer.
- CrWi06: Noel Cressie, Christopher K. Wikle (2006) Space-Time Kalman Filter. In Encyclopedia of Environmetrics. John Wiley & Sons, Ltd.
- SäSH13: Simo Särkkä, A. Solin, J. Hartikainen (2013) Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering. *IEEE Signal Processing Magazine*, 30(4), 51–61.
- KaCr11: Matthias Katzfuss, Noel Cressie (2011) Spatio-temporal smoothing and EM estimation for massive remote-sensing data sets. *Journal of Time Series Analysis*, 32(4), 430–446.
- CrWi15: Noel Cressie, Christopher K. Wikle (2015) *Statistics for Spatio-Temporal Data*. John Wiley & Sons.
- HaKo05: A. Harvey, S. J. Koopman (2005) Structural Time Series Models. In Encyclopedia of Biostatistics. John Wiley & Sons, Ltd.
- KrSS17: Rahul G. Krishnan, Uri Shalit, David Sontag (2017) Structured Inference Networks for Nonlinear State Space Models. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (pp. 2101–2109).
- ChHo12: Bin Chen, Yongmiao Hong (2012) Testing for the Markov Property in Time Series. *Econometric Theory*, 28(01), 130–178.
- Psia13: M. Psiaki (2013) The blind tricyclist problem and a comparative study of nonlinear filters: A challenging benchmark for evaluating nonlinear estimation methods. *IEEE Control Systems*, 33(3), 40–54.
- Levi17: David N. Levin (2017) The Inner Structure of Time-Dependent Signals. *ArXiv:1703.08596 [Cs, Math, Stat]*.
- DuKo12: J. Durbin, S. J. Koopman (2012) *Time series analysis by state space methods*. Oxford: Oxford University Press.
- BHIK09: Carles Bretó, Daihai He, Edward L. Ionides, Aaron A. King (2009) Time series analysis via mechanistic models. *The Annals of Applied Statistics*, 3(1), 319–348.
- Fris08: K. J. Friston (2008) Variational filtering. *NeuroImage*, 41(3), 747–766.
- KeLS14: D. T. B. Kelly, K. J. H. Law, A. M. Stuart (2014) Well-posedness and accuracy of the ensemble Kalman filter in discrete and continuous time. *Nonlinearity*, 27(10), 2579.