State filtering for hidden Markov models

Kalman and friends

June 22, 2015 — May 24, 2023

Bayes

dynamical systems

linear algebra

probability

signal processing

state space models

statistics

time series

Kalman-Bucy filter and variants, recursive estimation, predictive state models, data assimilation. A particular sub-field of signal processing for models with hidden state.

In statistics terms, the state filters are a kind of online-updating hierarchical model for sequential observations of a dynamical system where the random state is unobserved, but you can get an optimal estimate of it based on incoming measurements and known parameters.

A unifying feature of all these is that, by assuming a sparse influence graph between observations and dynamics, you can estimate behaviour using efficient message passing.

This is a twin problem to optimal control. If I wish to tackle this problem from the perspective of observations rather than true state, perhaps I could do it from the perspective of Koopman operators.

1 Linear dynamical systems

In Kalman filters per se, the default problem usually concerns multivariate real vector signals representing different axes of some telemetry data. In the degenerate case, where there is no observation noise, we can just design a linear filter which solves the target problem.

The classic Kalman filter (R. E. Kalman 1960) assumes a linear model with Gaussian noise, although it might work with not-quite Gaussian, not-quite linear models if you prod it. You can extend this flavour to somewhat more general dynamics. For that, see later.

NB I’m conflating linear observation and linear process models, for now. We can relax that when there are some concrete examples in play.

There are a large number of equivalent formulations of the Kalman filter. The notation of Fearnhead and Künsch (2018) is representative. They start from the usual state filter setting: The state process \(\left(\mathbf{X}_{t}\right)\) is assumed to be Markovian and the \(i\)-th observation, \(\mathbf{Y}_{i}\), depends only on the state at time \(i, \mathbf{X}_{i}\), so that the evolution and observation variates are defined by \[ \begin{aligned} \mathbf{X}_{t} \mid\left(\mathbf{x}_{0: t-1}, \mathbf{y}_{1: t-1}\right) & \sim P\left(d \mathbf{x}_{t} \mid \mathbf{x}_{t-1}\right), \quad \mathbf{X}_{0} \sim \pi_{0}\left(d \mathbf{x}_{0}\right) \\ \mathbf{Y}_{t} \mid\left(\mathbf{x}_{0: t}, \mathbf{y}_{1: t-1}\right) & \sim g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right) d \nu\left(\mathbf{y}_{t}\right) \end{aligned} \] with joint distribution \[ \left(\mathbf{X}_{0: s}, \mathbf{Y}_{1: t}\right) \sim \pi_{0}\left(d \mathbf{x}_{0}\right) \prod_{i=1}^{s} P\left(d \mathbf{x}_{i} \mid \mathbf{x}_{i-1}\right) \prod_{j=1}^{t} g\left(\mathbf{y}_{j} \mid \mathbf{x}_{j}\right) \nu\left(d \mathbf{y}_{j}\right), \quad s \geq t. \]

Integrating out the path of the state process, we obtain that \[\begin{aligned} \mathbf{Y}_{1: t} &\sim p\left(\mathbf{y}_{1: t}\right) \prod_{j} \nu\left(d \mathbf{y}_{j}\right)\text{, where}\\ p\left(\mathbf{y}_{1: t}\right) &=\int \pi_{0}\left(d \mathbf{x}_{0}\right) \prod_{i=1}^{s} P\left(d \mathbf{x}_{i} \mid \mathbf{x}_{i-1}\right) \prod_{j=1}^{t} g\left(\mathbf{y}_{j} \mid \mathbf{x}_{j}\right). \end{aligned} \] We wish to find the conditional distribution of the states given the observations, \(\pi_{0: s \mid t}=\frac{p(\mathbf{y}_{1: t},\mathbf{x}_{0:s})}{p(\mathbf{y}_{1: t})}\) (by Bayes’ rule). We deduce the recursion \[ \begin{aligned} \pi_{0: t \mid t-1}\left(d \mathbf{x}_{0: t} \mid \mathbf{y}_{1: t-1}\right) &=\pi_{0: t-1 \mid t-1}\left(d \mathbf{x}_{0: t-1} \mid \mathbf{y}_{1: t-1}\right) P\left(d \mathbf{x}_{t} \mid \mathbf{x}_{t-1}\right) &\text{ prediction}\\ \pi_{0: t \mid t}\left(d \mathbf{x}_{0: t} \mid \mathbf{y}_{1: t}\right) &=\pi_{0: t \mid t-1}\left(d \mathbf{x}_{0: t} \mid \mathbf{y}_{1: t-1}\right) \frac{g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right)}{p\left(\mathbf{y}_{t} \mid \mathbf{y}_{1: t-1}\right)} &\text{ correction} \end{aligned} \] where \[ p\left(\mathbf{y}_{t} \mid \mathbf{y}_{1: t-1}\right)=\frac{p\left(\mathbf{y}_{1: t}\right)}{p\left(\mathbf{y}_{1: t-1}\right)}=\int \pi_{t \mid t-1}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t-1}\right) g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right). \]

Integrating out the earlier states \(\mathbf{x}_{0: t-1}\), so that only the latest state \(\mathbf{x}_{t}\) remains, gives us the one-step recursion \[ \begin{aligned} \pi_{t \mid t-1}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t-1}\right) &=\int \pi_{t-1}\left(d \mathbf{x}_{t-1} \mid \mathbf{y}_{1: t-1}\right) P\left(d \mathbf{x}_{t} \mid \mathbf{x}_{t-1}\right) &\text{ prediction}\\ \pi_{t}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t}\right) &=\pi_{t \mid t-1}\left(d \mathbf{x}_{t} \mid \mathbf{y}_{1: t-1}\right) \frac{g\left(\mathbf{y}_{t} \mid \mathbf{x}_{t}\right)}{p\left(\mathbf{y}_{t} \mid \mathbf{y}_{1: t-1}\right)}&\text{ correction} \end{aligned} \] where \(\pi_{t}\) is shorthand for the filter distribution \(\pi_{t \mid t}\).
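In the linear-Gaussian special case, both steps of the one-step recursion stay inside the Gaussian family, so the filter only needs to carry a mean and a covariance. Here is a minimal NumPy sketch of one predict/correct cycle, assuming the model \(\mathbf{x}_t = A\mathbf{x}_{t-1} + \text{noise}\) with noise covariance \(Q\), and \(\mathbf{y}_t = H\mathbf{x}_t + \text{noise}\) with noise covariance \(R\). The names `kalman_step`, `A`, `Q`, `H`, `R`, `m` and `P` are notation introduced here for illustration; they are not the notation of Fearnhead and Künsch (2018).

```python
import numpy as np


def kalman_step(m, P, y, A, Q, H, R):
    """One predict/correct cycle of a linear-Gaussian state filter.

    m, P : mean and covariance of the filter distribution at time t-1
    y    : observation at time t
    Assumed model: x_t = A x_{t-1} + N(0, Q),  y_t = H x_t + N(0, R).
    """
    # Prediction: push the Gaussian through the linear dynamics.
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q

    # Innovation (what the observation tells us that we did not predict)
    # and its covariance.
    v = y - H @ m_pred
    S = H @ P_pred @ H.T + R

    # Kalman gain K = P_pred H^T S^{-1}, computed with a linear solve
    # rather than an explicit inverse (S and P_pred are symmetric).
    K = np.linalg.solve(S, H @ P_pred).T

    # Correction: condition on the observation.
    m_new = m_pred + K @ v
    P_new = P_pred - K @ S @ K.T
    return m_new, P_new
```

Running the filter over a series is then a loop that feeds each returned `(m_new, P_new)` back in as the next `(m, P)`.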

If we approximate the filter distribution \(\pi_t\) with a Monte Carlo sample, we are doing particle filtering, which Fearnhead and Künsch (2018) refer to as bootstrap filtering.
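A minimal sketch of that Monte Carlo approximation, for a scalar state: propagate particles through the transition kernel (prediction), weight them by the observation density (correction), then resample. The function name `bootstrap_filter` and the three callables `sample_x0`, `sample_transition` and `log_g` are hypothetical interfaces chosen for this illustration, and multinomial resampling at every step is an assumption made for simplicity, not a recommendation.

```python
import numpy as np


def bootstrap_filter(ys, sample_x0, sample_transition, log_g,
                     n_particles=1000, seed=0):
    """Particle approximation of the filter means E[X_t | y_{1:t}].

    sample_x0(rng, n)         -> n draws from pi_0
    sample_transition(rng, x) -> one draw from P(dx_t | x_{t-1}) per particle
    log_g(y, x)               -> log g(y | x) for each particle (up to a constant)
    """
    rng = np.random.default_rng(seed)
    x = sample_x0(rng, n_particles)
    means = []
    for y in ys:
        # Prediction: move every particle through the state dynamics.
        x = sample_transition(rng, x)

        # Correction: importance weights proportional to g(y_t | x_t),
        # normalised in a numerically stable way.
        logw = log_g(y, x)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))

        # Resample so that the particles are again equally weighted.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        x = x[idx]
    return np.array(means)
```

For a Gaussian random walk observed in Gaussian noise, for example, `sample_transition` would add normal noise to `x` and `log_g` would return `-0.5 * ((y - x) / sigma) ** 2` for some assumed observation scale `sigma`.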

TODO: implied Kalman gain etc.

2 Non-linear dynamical systems

Cute exercise: you can derive an analytic Kalman-style filter for any noise and process dynamics that form a Bayesian conjugate pair, and this leads to filters with nonlinear behaviour. Multivariate distributions are a bit of a mess for non-Gaussians, though, and a beta-Kalman filter feels contrived.

Upshot is, the non-linear extensions don’t usually rely on non-Gaussian conjugate distributions and analytic forms, but rather do some Gaussian/linear approximation, or use randomised methods such as particle filters.
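One standard instance of the Gaussian/linear approximation route is the extended Kalman filter, which linearises the dynamics and the observation map around the current mean and then applies the usual linear-Gaussian update. The sketch below assumes additive Gaussian noise with covariances `Q` and `R`, and uses a finite-difference Jacobian purely for convenience; `jacobian`, `ekf_step`, `f` and `h` are names made up for this example.

```python
import numpy as np


def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x (illustrative, not efficient)."""
    x = np.asarray(x, dtype=float)
    n_out = np.asarray(f(x)).size
    J = np.zeros((n_out, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x + dx)) - np.asarray(f(x - dx))) / (2 * eps)
    return J


def ekf_step(m, P, y, f, h, Q, R):
    """One extended-Kalman predict/correct cycle.

    Assumed model: x_t = f(x_{t-1}) + N(0, Q),  y_t = h(x_t) + N(0, R).
    """
    # Prediction: mean goes through f, covariance through its linearisation.
    F = jacobian(f, m)
    m_pred = f(m)
    P_pred = F @ P @ F.T + Q

    # Correction: linearise h about the predicted mean.
    Hj = jacobian(h, m_pred)
    S = Hj @ P_pred @ Hj.T + R
    K = np.linalg.solve(S, Hj @ P_pred).T  # gain P_pred Hj^T S^{-1}
    m_new = m_pred + K @ (y - h(m_pred))
    P_new = P_pred - K @ S @ K.T
    return m_new, P_new
```

When `f` and `h` are linear this reduces to the ordinary Kalman update, which makes a handy sanity check.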

For some examples in Stan see Sinhrks’ stan-statespace.

3 As errors-in-variables models

See, e.g., Bagge Carlson (2018).

4 Discrete state Hidden Markov models

🏗 Viterbi algorithm.
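Until this section is filled in, here is a minimal log-domain sketch of the Viterbi recursion, which finds the single most probable hidden state path of a discrete HMM by dynamic programming. The argument layout (`log_pi`, `log_A`, `log_B`, integer-coded observations) is an assumption made for this example.

```python
import numpy as np


def viterbi(log_pi, log_A, log_B, obs):
    """Most probable state path of a discrete-state, discrete-observation HMM.

    log_pi[k]   : log P(z_0 = k)
    log_A[i, j] : log P(z_t = j | z_{t-1} = i)
    log_B[k, o] : log P(y_t = o | z_t = k)
    obs         : sequence of integer observation symbols
    """
    T, K = len(obs), len(log_pi)
    delta = np.empty((T, K))            # best log-probability of a path ending in each state
    back = np.zeros((T, K), dtype=int)  # argmax predecessor for each state

    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to j.
        scores = delta[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]

    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path
```

With "sticky" transitions, an isolated surprising observation tends to be explained as emission noise rather than as a state switch, which is the usual reason to prefer the joint path over per-step argmaxes.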

5 Unscented Kalman filter

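That is, a Kalman-style filter which uses the unscented transform to push the mean and covariance of the state through the nonlinear dynamics.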
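As a rough sketch of the unscented transform itself: choose a small deterministic set of sigma points that match the mean and covariance of the input, push them through the nonlinearity, and read off a weighted mean and covariance. The version below uses the basic \(2n+1\)-point scheme with a single tuning parameter `kappa`; the default `kappa=1.0` and the function name are assumptions of this example, and more elaborate weightings exist.

```python
import numpy as np


def unscented_transform(f, m, P, kappa=1.0):
    """Approximate the mean and covariance of f(X) for X ~ N(m, P)."""
    n = m.size
    # Columns of L are the scaled square-root directions of P.
    L = np.linalg.cholesky((n + kappa) * P)

    # 2n + 1 sigma points: the mean, then the mean plus/minus each column of L.
    sigma = np.vstack([m, m + L.T, m - L.T])

    # Weights: kappa / (n + kappa) for the centre, 1 / (2 (n + kappa)) otherwise.
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)

    # Propagate the points and form weighted moments.
    Y = np.array([f(s) for s in sigma])
    mean = w @ Y
    D = Y - mean
    cov = (w[:, None] * D).T @ D
    return mean, cov
```

For a linear map the transform is exact, which gives an easy check; an unscented Kalman filter would apply this once to the dynamics in the prediction step and once to the observation map in the correction step.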

6 Variational state filters

See variational state filters.

7 Kalman filtering Gaussian processes

See filtering Gaussian processes.

8 Ensemble Kalman filters

See Ensemble Kalman filters.

9 State filter inference

How about learning the parameters of the model generating your states? Ways that you can do this in dynamical systems include basic linear system identification and general system identification.

10 References

Aasnaes, and Kailath. 1973. “An Innovations Approach to Least-Squares Estimation–Part VII: Some Applications of Vector Autoregressive-Moving Average Models.” IEEE Transactions on Automatic Control.

Alliney. 1992. “Digital Filters as Absolute Norm Regularizers.” IEEE Transactions on Signal Processing.

Alzraiee, White, Knowling, et al. 2022. “A Scalable Model-Independent Iterative Data Assimilation Tool for Sequential and Batch Estimation of High Dimensional Model Parameters and States.” Environmental Modelling & Software.

Ansley, and Kohn. 1985. “Estimation, Filtering, and Smoothing in State Space Models with Incompletely Specified Initial Conditions.” The Annals of Statistics.

Arulampalam, Maskell, Gordon, et al. 2002. “A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking.” IEEE Transactions on Signal Processing.

Bagge Carlson. 2018. “Machine Learning and System Identification for Estimation in Physical Systems.”

Battey, and Sancetta. 2013. “Conditional Estimation for Dependent Functional Data.” Journal of Multivariate Analysis.

Batz, Ruttor, and Opper. 2017. “Approximate Bayes Learning of Stochastic Differential Equations.” arXiv:1702.05390 [Physics, Stat].

Becker, Pandya, Gebhardt, et al. 2019. “Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces.” In International Conference on Machine Learning.

Berkhout, and Zaanen. 1976. “A Comparison Between Wiener Filtering, Kalman Filtering, and Deterministic Least Squares Estimation.” Geophysical Prospecting.

Bilmes. 1998. “A Gentle Tutorial of the EM Algorithm and Its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models.” International Computer Science Institute.

Bishop, and Del Moral. 2016. “On the Stability of Kalman-Bucy Diffusion Processes.” SIAM Journal on Control and Optimization.

———. 2023. “On the Mathematical Theory of Ensemble (Linear-Gaussian) Kalman-Bucy Filtering.” Mathematics of Control, Signals, and Systems.

Bishop, Del Moral, and Pathiraja. 2017. “Perturbations and Projections of Kalman-Bucy Semigroups Motivated by Methods in Data Assimilation.” arXiv:1701.05978 [Math].

Bretó, He, Ionides, et al. 2009. “Time Series Analysis via Mechanistic Models.” The Annals of Applied Statistics.

Brunton, Proctor, and Kutz. 2016. “Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences.

Campbell, Shi, Rainforth, et al. 2021. “Online Variational Filtering and Parameter Learning.”

Carmi. 2013. “Compressive System Identification: Sequential Methods and Entropy Bounds.” Digital Signal Processing.

———. 2014. “Compressive System Identification.” In Compressed Sensing & Sparse Filtering. Signals and Communication Technology.

Cassidy, Rae, and Solo. 2015. “Brain Activity: Connectivity, Sparsity, and Mutual Information.” IEEE Transactions on Medical Imaging.

Cauchemez, and Ferguson. 2008. “Likelihood-Based Estimation of Continuous-Time Epidemic Models from Time-Series Data: Application to Measles Transmission in London.” Journal of The Royal Society Interface.

Charles, Balavoine, and Rozell. 2016. “Dynamic Filtering of Time-Varying Sparse Signals via L1 Minimization.” IEEE Transactions on Signal Processing.

Chen, Y., and Hero. 2012. “Recursive ℓ1,∞ Group Lasso.” IEEE Transactions on Signal Processing.

Chen, Bin, and Hong. 2012. “Testing for the Markov Property in Time Series.” Econometric Theory.

Chung, Kastner, Dinh, et al. 2015. “A Recurrent Latent Variable Model for Sequential Data.” In Advances in Neural Information Processing Systems 28.

Clark, and Bjørnstad. 2004. “Population Time Series: Process Variability, Observation Errors, Missing Values, Lags, and Hidden States.” Ecology.

Commandeur, and Koopman. 2007. An Introduction to State Space Time Series Analysis.

Cox, van de Laar, and de Vries. 2019. “A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms.” International Journal of Approximate Reasoning.

Cressie, and Huang. 1999. “Classes of Nonseparable, Spatio-Temporal Stationary Covariance Functions.” Journal of the American Statistical Association.

Cressie, Shi, and Kang. 2010. “Fixed Rank Filtering for Spatio-Temporal Data.” Journal of Computational and Graphical Statistics.

Cressie, and Wikle. 2011. Statistics for Spatio-Temporal Data. Wiley Series in Probability and Statistics 2.0.

Freitas, João FG de, Doucet, Niranjan, et al. 1998. “Global Optimisation of Neural Network Models via Sequential Sampling.” In Proceedings of the 11th International Conference on Neural Information Processing Systems. NIPS’98.

Freitas, J. F. G. de, Niranjan, Gee, et al. 1998. “Sequential Monte Carlo Methods for Optimisation of Neural Network Models.” Cambridge University Engineering Department, Cambridge, England, Technical Report TR-328.

Deisenroth, and Mohamed. 2012. “Expectation Propagation in Gaussian Process Dynamical Systems.” In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 2. NIPS’12.

Del Moral, Kurtzmann, and Tugaut. 2017. “On the Stability and the Uniform Propagation of Chaos of a Class of Extended Ensemble Kalman-Bucy Filters.” SIAM Journal on Control and Optimization.

Doucet, Jacob, and Rubenthaler. 2013. “Derivative-Free Estimation of the Score Vector and Observed Information Matrix with Application to State-Space Models.” arXiv:1304.5768 [Stat].

Durbin, and Koopman. 1997. “Monte Carlo Maximum Likelihood Estimation for Non-Gaussian State Space Models.” Biometrika.

———. 2012. Time Series Analysis by State Space Methods. Oxford Statistical Science Series 38.

Duttweiler, and Kailath. 1973a. “RKHS Approach to Detection and Estimation Problems–IV: Non-Gaussian Detection.” IEEE Transactions on Information Theory.

———. 1973b. “RKHS Approach to Detection and Estimation Problems–V: Parameter Estimation.” IEEE Transactions on Information Theory.

Easley, and Berry. 2020. “A Higher Order Unscented Transform.” arXiv:2006.13429 [Cs, Math].

Eddy. 1996. “Hidden Markov Models.” Current Opinion in Structural Biology.

Eden, Frank, Barbieri, et al. 2004. “Dynamic Analysis of Neural Encoding by Point Process Adaptive Filtering.” Neural Computation.

Edwards, and Ankinakatte. 2015. “Context-Specific Graphical Models for Discrete Longitudinal Data.” Statistical Modelling.

Eleftheriadis, Nicholson, Deisenroth, et al. 2017. “Identification of Gaussian Process State Space Models.” In Advances in Neural Information Processing Systems 30.

Fearnhead, and Künsch. 2018. “Particle Filters and Data Assimilation.” Annual Review of Statistics and Its Application.

Finke, and Singh. 2016. “Approximate Smoothing and Parameter Estimation in High-Dimensional State-Space Models.” arXiv:1606.08650 [Stat].

Föll, Haasdonk, Hanselmann, et al. 2017. “Deep Recurrent Gaussian Process with Variational Sparse Spectrum Approximation.” arXiv:1711.00799 [Stat].

Fraccaro, Sønderby, Paquet, et al. 2016. “Sequential Neural Models with Stochastic Layers.” In Advances in Neural Information Processing Systems 29.

Fraser. 2008. Hidden Markov Models and Dynamical Systems.

Friedlander, Kailath, and Ljung. 1975. “Scattering Theory and Linear Least Squares Estimation: Part II: Discrete-Time Problems.” In 1975 IEEE Conference on Decision and Control Including the 14th Symposium on Adaptive Processes.

Frigola, Chen, and Rasmussen. 2014. “Variational Gaussian Process State-Space Models.” In Advances in Neural Information Processing Systems 27.

Frigola, Lindsten, Schön, et al. 2013. “Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC.” In Advances in Neural Information Processing Systems 26.

Friston. 2008. “Variational Filtering.” NeuroImage.

Gevers, and Kailath. 1973. “An Innovations Approach to Least-Squares Estimation–Part VI: Discrete-Time Innovations Representations and Recursive Estimation.” IEEE Transactions on Automatic Control.

Gorad, Zhao, and Särkkä. 2020. “Parameter Estimation in Non-Linear State-Space Models by Automatic Differentiation of Non-Linear Kalman Filters.”

Gottwald, and Reich. 2020. “Supervised Learning from Noisy Observations: Combining Machine-Learning Techniques with Data Assimilation.” arXiv:2007.07383 [Physics, Stat].

Gourieroux, and Jasiak. 2015. “Filtering, Prediction and Simulation Methods for Noncausal Processes.” Journal of Time Series Analysis.

Gu, Johnson, Goel, et al. 2021. “Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers.” In Advances in Neural Information Processing Systems.

Haber, Lucka, and Ruthotto. 2018. “Never Look Back - A Modified EnKF Method and Its Application to the Training of Neural Networks Without Back Propagation.” arXiv:1805.08034 [Cs, Math].

Hamilton, Berry, and Sauer. 2016. “Kalman-Takens Filtering in the Presence of Dynamical Noise.” arXiv:1611.05414 [Physics, Stat].

Hartikainen, and Särkkä. 2010. “Kalman Filtering and Smoothing Solutions to Temporal Gaussian Process Regression Models.” In 2010 IEEE International Workshop on Machine Learning for Signal Processing.

Harvey, A., and Koopman. 2005. “Structural Time Series Models.” In Encyclopedia of Biostatistics.

Harvey, Andrew, and Luati. 2014. “Filtering With Heavy Tails.” Journal of the American Statistical Association.

Hefny, Downey, and Gordon. 2015. “A New View of Predictive State Methods for Dynamical System Learning.” arXiv:1505.05310 [Cs, Stat].

He, Ionides, and King. 2010. “Plug-and-Play Inference for Disease Dynamics: Measles in Large and Small Populations as a Case Study.” Journal of The Royal Society Interface.

Hong, Mitchell, Chen, et al. 2008. “Model Selection Approaches for Non-Linear System Identification: A Review.” International Journal of Systems Science.

Hou, Lawrence, and Hero. 2016. “Penalized Ensemble Kalman Filters for High Dimensional Non-Linear Systems.” arXiv:1610.00195 [Physics, Stat].

Hsiao, and Schultz. 2011. “Generalized Baum-Welch Algorithm and Its Implication to a New Extended Baum-Welch Algorithm.” In Proceedings of INTERSPEECH.

Hsu, Kakade, and Zhang. 2012. “A Spectral Algorithm for Learning Hidden Markov Models.” Journal of Computer and System Sciences, JCSS Special Issue: Cloud Computing 2011.

Huber. 2014. “Recursive Gaussian Process: On-Line Regression and Learning.” Pattern Recognition Letters.

Ionides, Edward L., Bhadra, Atchadé, et al. 2011. “Iterated Filtering.” The Annals of Statistics.

Ionides, E. L., Bretó, and King. 2006. “Inference for Nonlinear Dynamical Systems.” Proceedings of the National Academy of Sciences.

Johansen, Doucet, and Davy. 2006. “Sequential Monte Carlo for Marginal Optimisation Problems.” Scis & Isis.

Johnson. 2012. “A Simple Explanation of A Spectral Algorithm for Learning Hidden Markov Models.” arXiv:1204.2477 [Cs, Stat].

Julier, Uhlmann, and Durrant-Whyte. 1995. “A New Approach for Filtering Nonlinear Systems.” In American Control Conference, Proceedings of the 1995.

Kailath. 1971. “RKHS Approach to Detection and Estimation Problems–I: Deterministic Signals in Gaussian Noise.” IEEE Transactions on Information Theory.

———. 1974. “A View of Three Decades of Linear Filtering Theory.” IEEE Transactions on Information Theory.

Kailath, and Duttweiler. 1972. “An RKHS Approach to Detection and Estimation Problems– III: Generalized Innovations Representations and a Likelihood-Ratio Formula.” IEEE Transactions on Information Theory.

Kailath, and Geesey. 1971. “An Innovations Approach to Least Squares Estimation–Part IV: Recursive Estimation Given Lumped Covariance Functions.” IEEE Transactions on Automatic Control.

———. 1973. “An Innovations Approach to Least-Squares Estimation–Part V: Innovations Representations and Recursive Estimation in Colored Noise.” IEEE Transactions on Automatic Control.

Kailath, and Weinert. 1975. “An RKHS Approach to Detection and Estimation Problems–II: Gaussian Signal Detection.” IEEE Transactions on Information Theory.

Kalman, R. 1959. “On the General Theory of Control Systems.” IRE Transactions on Automatic Control.

Kalman, R. E. 1960. “A New Approach to Linear Filtering and Prediction Problems.” Journal of Basic Engineering.

Kalouptsidis, Mileounis, Babadi, et al. 2011. “Adaptive Algorithms for Sparse System Identification.” Signal Processing.

Karvonen, and Särkkä. 2016. “Approximate State-Space Gaussian Processes via Spectral Transformation.” In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP).

Kelly, Law, and Stuart. 2014. “Well-Posedness and Accuracy of the Ensemble Kalman Filter in Discrete and Continuous Time.” Nonlinearity.

Kirch, Edwards, Meier, et al. 2019. “Beyond Whittle: Nonparametric Correction of a Parametric Likelihood with a Focus on Bayesian Time Series Analysis.” Bayesian Analysis.

Kitagawa. 1987. “Non-Gaussian State-Space Modeling of Nonstationary Time Series.” Journal of the American Statistical Association.

———. 1996. “Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models.” Journal of Computational and Graphical Statistics.

Kitagawa, and Gersch. 1996. Smoothness Priors Analysis of Time Series. Lecture notes in statistics 116.

Kobayashi, Mark, and Turin. 2011. Probability, Random Processes, and Statistical Analysis: Applications to Communications, Signal Processing, Queueing Theory and Mathematical Finance.

Koopman, and Durbin. 2000. “Fast Filtering and Smoothing for Multivariate State Space Models.” Journal of Time Series Analysis.

Krishnan, Shalit, and Sontag. 2017. “Structured Inference Networks for Nonlinear State Space Models.” In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.

Kulhavý. 1990. “Recursive Nonlinear Estimation: A Geometric Approach.” Automatica.

———. 1996. Recursive Nonlinear Estimation. Lecture Notes in Control and Information Sciences.

Kutschireiter, Surace, Sprekeler, et al. 2015. “Approximate Nonlinear Filtering with a Recurrent Neural Network.” BMC Neuroscience.

Lázaro-Gredilla, Quiñonero-Candela, Rasmussen, et al. 2010. “Sparse Spectrum Gaussian Process Regression.” Journal of Machine Learning Research.

Le Gland, Monbet, and Tran. 2009. “Large Sample Asymptotics for the Ensemble Kalman Filter.”

Lei, Bickel, and Snyder. 2009. “Comparison of Ensemble Kalman Filters Under Non-Gaussianity.” Monthly Weather Review.

Levin. 2017. “The Inner Structure of Time-Dependent Signals.” arXiv:1703.08596 [Cs, Math, Stat].

Lindgren, Rue, and Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” Journal of the Royal Statistical Society: Series B (Statistical Methodology).

Ljung, and Kailath. 1976. “Backwards Markovian Models for Second-Order Stochastic Processes (Corresp.).” IEEE Transactions on Information Theory.

Ljung, Kailath, and Friedlander. 1975. “Scattering Theory and Linear Least Squares Estimation: Part I: Continuous-Time Problems.” In 1975 IEEE Conference on Decision and Control Including the 14th Symposium on Adaptive Processes.

Loeliger, Dauwels, Hu, et al. 2007. “The Factor Graph Approach to Model-Based Signal Processing.” Proceedings of the IEEE.

Manton, Krishnamurthy, and Poor. 1998. “James-Stein State Filtering Algorithms.” IEEE Transactions on Signal Processing.

Mattos, Dai, Damianou, et al. 2016. “Recurrent Gaussian Processes.” In Proceedings of ICLR.

Mattos, Dai, Damianou, et al. 2017. “Deep Recurrent Gaussian Processes for Outlier-Robust System Identification.” Journal of Process Control, DYCOPS-CAB 2016.

Meyer, Edwards, Maturana-Russel, et al. 2020. “Computational Techniques for Parameter Estimation of Gravitational Wave Signals.” WIREs Computational Statistics.

Micchelli, and Olsen. 2000. “Penalized Maximum-Likelihood Estimation, the Baum–Welch Algorithm, Diagonal Balancing of Symmetric Matrices and Applications to Training Acoustic Data.” Journal of Computational and Applied Mathematics.

Miller, Glennie, and Seaton. 2020. “Understanding the Stochastic Partial Differential Equation Approach to Smoothing.” Journal of Agricultural, Biological and Environmental Statistics.

Nickisch, Solin, and Grigorevskiy. 2018. “State Space Gaussian Processes with Non-Gaussian Likelihood.” In International Conference on Machine Learning.

Olfati-Saber. 2005. “Distributed Kalman Filter with Embedded Consensus Filters.” In 44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC ’05.

Ollivier. 2017. “Online Natural Gradient as a Kalman Filter.” arXiv:1703.00209 [Math, Stat].

Papadopoulos, Pachet, Roy, et al. 2015. “Exact Sampling for Regular and Markov Constraints with Belief Propagation.” In Principles and Practice of Constraint Programming. Lecture Notes in Computer Science.

Perry. 2010. “Andrew Viterbi’s Fabulous Formula [Medal of Honor].” IEEE Spectrum.

Picci. 1991. “Stochastic Realization Theory.” In Mathematical System Theory: The Influence of R. E. Kalman.

Psiaki. 2013. “The Blind Tricyclist Problem and a Comparative Study of Nonlinear Filters: A Challenging Benchmark for Evaluating Nonlinear Estimation Methods.” IEEE Control Systems.

Pugachev, V.S. 1982. “Conditionally Optimal Estimation in Stochastic Differential Systems.” Automatica.

Pugachev, V. S., and Sinit︠s︡yn. 2001. Stochastic systems: theory and applications.

Quiñonero-Candela, and Rasmussen. 2005. “A Unifying View of Sparse Approximate Gaussian Process Regression.” Journal of Machine Learning Research.

Rabiner, L.R. 1989. “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.” Proceedings of the IEEE.

Rabiner, L., and Juang. 1986. “An Introduction to Hidden Markov Models.” IEEE ASSP Magazine.

Raol, and Sinha. 1987. “On Pugachev’s Filtering Theory for Stochastic Nonlinear Systems.” In Stochastic Control. IFAC Symposia Series.

Reece, and Roberts. 2010. “An Introduction to Gaussian Processes for the Kalman Filter Expert.” In 2010 13th International Conference on Information Fusion.

Reller. 2013. “State-Space Methods in Statistical Signal Processing: New Ideas and Applications.”

Revach, Shlezinger, van Sloun, et al. 2021. “KalmanNet: Data-Driven Kalman Filtering.” In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Robertson, Andrew N. 2011. “A Bayesian Approach to Drum Tracking.”

Robertson, Andrew, and Plumbley. 2007. “B-Keeper: A Beat-Tracker for Live Performance.” In Proceedings of the 7th International Conference on New Interfaces for Musical Expression. NIME ’07.

Robertson, Andrew, Stark, and Davies. 2013. “Percussive Beat Tracking Using Real-Time Median Filtering.” In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.

Robertson, Andrew, Stark, and Plumbley. 2011. “Real-Time Visual Beat Tracking Using a Comb Filter Matrix.” In Proceedings of the International Computer Music Conference 2011.

Rodriguez, and Ruiz. 2009. “Bootstrap Prediction Intervals in State-Space Models.” Journal of Time Series Analysis.

Roth, Hendeby, Fritsche, et al. 2017. “The Ensemble Kalman Filter: A Signal Processing Perspective.” EURASIP Journal on Advances in Signal Processing.

Rozet, and Louppe. 2023. “Score-Based Data Assimilation.”

Rudenko. 2013. “Optimal Structure of Continuous Nonlinear Reduced-Order Pugachev Filter.” Journal of Computer and Systems Sciences International.

Särkkä, Simo. 2007. “On Unscented Kalman Filtering for State Estimation of Continuous-Time Nonlinear Systems.” IEEE Transactions on Automatic Control.

———. 2013. Bayesian Filtering and Smoothing. Institute of Mathematical Statistics Textbooks 3.

Särkkä, Simo, and Hartikainen. 2012. “Infinite-Dimensional Kalman Filtering Approach to Spatio-Temporal Gaussian Process Regression.” In Artificial Intelligence and Statistics.

Särkkä, S., and Hartikainen. 2013. “Non-Linear Noise Adaptive Kalman Filtering via Variational Bayes.” In 2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP).

Särkkä, Simo, and Nummenmaa. 2009. “Recursive Noise Adaptive Kalman Filtering by Variational Bayesian Approximations.” IEEE Transactions on Automatic Control.

Särkkä, Simo, Solin, and Hartikainen. 2013. “Spatiotemporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing: A Look at Gaussian Process Regression Through Kalman Filtering.” IEEE Signal Processing Magazine.

Schein, Wallach, and Zhou. 2016. “Poisson-Gamma Dynamical Systems.” In Advances In Neural Information Processing Systems.

Schmidt, Krämer, and Hennig. 2021. “A Probabilistic State Space Model for Joint Inference from Differential Equations and Data.” arXiv:2103.10153 [Cs, Stat].

Segall, Davis, and Kailath. 1975. “Nonlinear Filtering with Counting Observations.” IEEE Transactions on Information Theory.

Šindelář, Vajda, and Kárnỳ. 2008. “Stochastic Control Optimal in the Kullback Sense.” Kybernetika.

Sorenson. 1970. “Least-Squares Estimation: From Gauss to Kalman.” IEEE Spectrum.

Städler, and Mukherjee. 2013. “Penalized Estimation in High-Dimensional Hidden Markov Models with State-Specific Graphical Models.” The Annals of Applied Statistics.

Surace, and Pfister. 2016. “Online Maximum Likelihood Estimation of the Parameters of Partially Observed Diffusion Processes.”

Tavakoli, and Panaretos. 2016. “Detecting and Localizing Differences in Functional Time Series Dynamics: A Case Study in Molecular Biophysics.” Journal of the American Statistical Association.

Thrun, and Langford. 1998. “Monte Carlo Hidden Markov Models.”

Thrun, Langford, and Fox. 1999. “Monte Carlo Hidden Markov Models: Learning Non-Parametric Models of Partially Observable Stochastic Processes.” In Proceedings of the International Conference on Machine Learning.

Turner, Deisenroth, and Rasmussen. 2010. “State-Space Inference and Learning with Gaussian Processes.” In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics.

Wikle, and Berliner. 2007. “A Bayesian Tutorial for Data Assimilation.” Physica D: Nonlinear Phenomena, Data Assimilation.

Wikle, Berliner, and Cressie. 1998. “Hierarchical Bayesian Space-Time Models.” Environmental and Ecological Statistics.

Zhao, and Cui. 2023. “Tensor-Based Methods for Sequential State and Parameter Estimation in State Space Models.”

Zoeter. 2007. “Bayesian Generalized Linear Models in a Terabyte World.” In 2007 5th International Symposium on Image and Signal Processing and Analysis.