Estimating the thing that is always given to you by oracles in homework assignments. The meat of Gaussian process regression

Estimating the covariance, precision, or concentration matrices of things. This turns out to be a lot more involved than estimating means, in various ways and at various times. Long story.

Now, why did I want to know this again? I think it may have been something about minimalist GRF inference for the Synestizer project. Right.

Connections also to random matrix theory (Ben Arous et al.) and to \(\mathcal{H}\)-matrix methods.

## Parametric covariance models

I don’t know anything about this, but for spatial statistics I am told I should look up the Matérn family of covariance functions as a parametric covariance model.
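As a note to self, the Matérn form is easy enough to evaluate directly. A minimal sketch in the common \((\sigma^2, \ell, \nu)\) parameterisation; the helper name `matern` and the example grid are mine:

```python
import numpy as np
from scipy.special import gamma, kv  # kv: modified Bessel function of the 2nd kind

def matern(r, length_scale=1.0, nu=1.5, sigma2=1.0):
    """Matérn covariance as a function of distance r >= 0."""
    r = np.asarray(r, dtype=float)
    scaled = np.sqrt(2.0 * nu) * r / length_scale
    # kv(nu, 0) diverges, but the whole expression tends to sigma2 as r -> 0,
    # so evaluate at a tiny argument and patch the r == 0 entries afterwards
    safe = np.where(scaled == 0.0, np.finfo(float).eps, scaled)
    k = sigma2 * (2.0 ** (1.0 - nu) / gamma(nu)) * safe ** nu * kv(nu, safe)
    return np.where(r == 0.0, sigma2, k)

# Gram matrix for points on a line; nu = 1.5 gives once-differentiable paths
x = np.linspace(0.0, 5.0, 50)
K = matern(np.abs(x[:, None] - x[None, :]), length_scale=1.0, nu=1.5)
```

At \(\nu = 1/2\) this collapses to the exponential covariance \(\sigma^2 e^{-r/\ell}\), and as \(\nu \to \infty\) it approaches the squared exponential, which is why it makes a convenient one-parameter slider for sample-path roughness.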

## Non-stationary covariance models

Particular reference to dynamically updating covariance estimates for a possibly-evolving system. This is not quite the Kalman filter problem, since there it is the (co)variance of our *estimates*, which is to say the precision, that gets updated, while the (co)variance of the underlying process is presumed known and stationary. I just learned, thanks to the retirement lecture of Hans-Ruedi Künsch, that one solution to this problem might in fact be the Ensemble Kalman Filter.
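The EnKF connection is concrete enough to sketch: the forecast covariance is recomputed from the ensemble at every step, so a drifting process covariance gets tracked for free. A minimal stochastic (perturbed-observation) analysis step, assuming a linear observation operator; the function name and toy numbers are mine, and real implementations add inflation and localisation on top:

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(ensemble, y, H, R):
    """One stochastic-EnKF analysis step.

    ensemble: (N, d) state ensemble, y: (p,) observation,
    H: (p, d) linear observation operator, R: (p, p) observation noise cov.
    """
    N, _ = ensemble.shape
    X = ensemble - ensemble.mean(axis=0)          # ensemble anomalies
    P = X.T @ X / (N - 1)                         # empirical forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain from the ensemble
    # perturb the observation for each member so the analysis spread comes out right
    Y = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    return ensemble + (Y - ensemble @ H.T) @ K.T

# toy example: 2-d state with prior spread ~1, first coordinate observed near 2
ens = rng.normal(size=(100, 2))
H = np.array([[1.0, 0.0]])
post = enkf_analysis(ens, np.array([2.0]), H, np.array([[0.1]]))
```

The posterior ensemble mean moves toward the observation and the observed coordinate's spread contracts; alternating this analysis step with a forecast step that pushes the ensemble through the (possibly nonlinear) dynamics is what lets the implied covariance evolve.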

## An inverse problem

Given a target time-series autocorrelation function, devise an algorithm to produce a sequence with this structure:

- exactly, for a deterministic autocorrelation
- in expectation, for a probabilistic autocorrelation

Surely the ARIMA jockeys do this, yes? Granger et al? But that presumes short memory. Should I check the fractal/multifractal literature here?
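For the Gaussian, matched-in-expectation version, at least, there is a standard spectral answer: embed the target autocovariance in a circulant matrix, FFT it to get nonnegative eigenvalues, and colour white noise by their square roots (circulant embedding, in the Dietrich–Newsam style). A sketch under the assumption that the target is a valid autocovariance whose minimal embedding is PSD; `gaussian_with_autocov` is my own name:

```python
import numpy as np

def gaussian_with_autocov(acov, rng=None):
    """Sample a stationary Gaussian sequence whose autocovariance matches
    `acov` (lags 0..L-1) in expectation, via circulant embedding."""
    rng = np.random.default_rng() if rng is None else rng
    # first column of the circulant: acov followed by its reflected interior
    c = np.concatenate([acov, acov[-2:0:-1]])
    lam = np.fft.fft(c).real            # eigenvalues of the circulant
    if lam.min() < -1e-8 * lam.max():
        raise ValueError("embedding not PSD; try a longer autocovariance tail")
    lam = np.clip(lam, 0.0, None)
    m = len(c)
    z = rng.normal(size=m) + 1j * rng.normal(size=m)
    x = np.fft.fft(np.sqrt(lam / m) * z)
    # real and imaginary parts are two independent valid draws; keep one
    return x.real[: len(acov)]

# AR(1)-flavoured target autocovariance
acov = 0.9 ** np.arange(200)
x = gaussian_with_autocov(acov, np.random.default_rng(1))
```

This only solves the "in expectation" bullet; hitting a given empirical autocorrelation exactly for a single realisation is a different, harder constraint problem (KrSh09 do something related for spike trains).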

## To read

Basic inference using the Inverse Wishart distribution: by adopting a very basic “process model” that increases the uncertainty of the covariance estimate as some convenient monotonic function of time, I should be able to get this one.
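What I have in mind, concretely: keep the conjugate inverse-Wishart update for zero-mean Gaussian data, but exponentially discount the sufficient statistics so old evidence decays and posterior uncertainty inflates between observations. A sketch under those assumptions; the class name, the forgetting scheme, and the constants are all mine, not from a reference:

```python
import numpy as np

class ForgetfulWishartTracker:
    """Inverse-Wishart conjugate updates for zero-mean Gaussian data,
    with exponential forgetting so the estimate can track a drifting
    covariance; gamma < 1 sets the effective memory 1 / (1 - gamma)."""

    def __init__(self, dim, gamma=0.99):
        self.dim = dim
        self.gamma = gamma
        self.nu = dim + 2.0      # prior degrees of freedom
        self.psi = np.eye(dim)   # prior scale matrix

    def update(self, x):
        # discount the old sufficient statistics, then absorb the new point
        self.nu = self.gamma * self.nu + 1.0
        self.psi = self.gamma * self.psi + np.outer(x, x)

    def mean_cov(self):
        # posterior mean of an inverse-Wishart(nu, psi) distribution
        return self.psi / (self.nu - self.dim - 1.0)

# feed correlated samples; the tracker should settle near the true covariance
rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
L = np.linalg.cholesky(true_cov)
tracker = ForgetfulWishartTracker(dim=2, gamma=0.995)
for _ in range(3000):
    tracker.update(L @ rng.normal(size=2))
est = tracker.mean_cov()
```

With `gamma = 0.995` the effective sample size is about 200, so `est` wobbles around `true_cov` rather than converging; that wobble is exactly the price of being able to follow a drift.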

AzKS15 have a neat snark:

> Our work deviates from the majority of work on compressive covariance estimation in that we do not make structural assumptions on the estimand, in this case the target covariance. A number of papers assume that the target covariance is low rank, sparse, or that the inverse covariance is sparse. The broad theme of this line of work is that when the target covariance has some low-dimensional structure, far fewer total measurements (via random projection) are necessary to achieve the same error as direct observation in the unstructured case. However, when the target covariance does not have low-dimensional structure, these methods can fail dramatically, as we show with our lower bounds.
>
> In contrast, our work instead examines the statistical price one pays for compressing the data vectors when the covariance matrix does not exhibit any low-dimensional structure. Instead of using fewer measurements than direct observation, in this setting, compressing the data requires that one use significantly more measurements to achieve the same level of accuracy as direct observation. We precisely quantify this increase in measurement, showing that the effective sample size shifts from \(n\) to \(nm^2/d^2\), where the projection dimension is \(m\) and the ambient dimension is \(d\). Since we must have \(m \le d\), this means that one needs more samples to achieve a specified accuracy under our measurement model, when compared with direct observation. This effective sample size is present in all of our upper and lower bounds, showing that indeed, there is a price to pay for compression without structural assumptions. Note that this quadratic growth in effective sample size also matches recent results on covariance estimation from missing data.

## Refs

- Abra97: Abrahamsen, P. (1997) A review of Gaussian random fields and correlation functions.
- AzKS15: Azizyan, M., Krishnamurthy, A., & Singh, A. (2015) Extreme Compressive Sampling for Covariance Estimation. *arXiv:1506.00898 [Cs, Math, Stat]*.
- BaAP05: Baik, J., Arous, G. B., & Péché, S. (2005) Phase Transition of the Largest Eigenvalue for Nonnull Complex Sample Covariance Matrices. *The Annals of Probability*, 33(5), 1643–1697.
- BaGA08: Banerjee, O., Ghaoui, L. E., & d’Aspremont, A. (2008) Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. *Journal of Machine Learning Research*, 9(Mar), 485–516.
- BaMM00: Barnard, J., McCulloch, R., & Meng, X.-L. (2000) Modeling Covariance Matrices in Terms of Standard Deviations and Correlations, with Application to Shrinkage. *Statistica Sinica*, 10(4), 1281–1311.
- BePé05: Ben Arous, G., & Péché, S. (2005) Universality of local eigenvalue statistics for some sample covariance matrices. *Communications on Pure and Applied Mathematics*, 58(10), 1316–1357. DOI.
- CaZZ10: Cai, T. T., Zhang, C.-H., & Zhou, H. H. (2010) Optimal rates of convergence for covariance matrix estimation. *The Annals of Statistics*, 38(4), 2118–2144. DOI.
- DaPo09: Daniels, M. J., & Pourahmadi, M. (2009) Modeling covariance matrices via partial autocorrelations. *Journal of Multivariate Analysis*, 100(10), 2352–2363. DOI.
- Efro10: Efron, B. (2010) Correlated z-values and the accuracy of large-scale statistical estimates. *Journal of the American Statistical Association*, 105(491), 1042–1055. DOI.
- FrHT08: Friedman, J., Hastie, T., & Tibshirani, R. (2008) Sparse inverse covariance estimation with the graphical lasso. *Biostatistics*, 9(3), 432–441. DOI.
- Fuen06: Fuentes, M. (2006) Testing for separability of spatial–temporal covariance functions. *Journal of Statistical Planning and Inference*, 136(2), 447–466. DOI.
- Hack15: Hackbusch, W. (2015) Hierarchical Matrices: Algorithms and Analysis (1st ed.). Heidelberg: Springer.
- Hans07: Hansen, C. B. (2007) Generalized least squares inference in panel and multilevel models with serial correlation and fixed effects. *Journal of Econometrics*, 140(2), 670–694. DOI.
- HePo14: Heinrich, C., & Podolskij, M. (2014) On spectral distribution of high dimensional covariation matrices. *arXiv:1410.6764 [Math]*.
- HSDR14: Hsieh, C.-J., Sustik, M. A., Dhillon, I. S., & Ravikumar, P. D. (2014) QUIC: quadratic approximation for sparse inverse covariance estimation. *Journal of Machine Learning Research*, 15(1), 2911–2947.
- HLPL06: Huang, J. Z., Liu, N., Pourahmadi, M., & Liu, L. (2006) Covariance matrix selection and estimation via penalised normal likelihood. *Biometrika*, 93(1), 85–98. DOI.
- JaSt61: James, W., & Stein, C. (1961) Estimation with quadratic loss. In *Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability* (Vol. 1, pp. 361–379).
- JaGe15: Janková, J., & van de Geer, S. (2015) Honest confidence regions and optimality in high-dimensional precision matrix estimation. *arXiv:1507.02061 [Math, Stat]*.
- KhLM09: Khoromskij, B. N., Litvinenko, A., & Matthies, H. G. (2009) Application of hierarchical matrices for computing the Karhunen–Loève expansion. *Computing*, 84(1–2), 49–67. DOI.
- Khos12: Khoshgnauz, E. (2012) Learning Markov Network Structure using Brownian Distance Covariance. *arXiv:1206.6361 [Cs, Stat]*.
- KrSh09: Krumin, M., & Shoham, S. (2009) Generation of Spike Trains with Controlled Auto- and Cross-Correlation Functions. *Neural Computation*, 21(6), 1642–1664. DOI.
- LaFa09: Lam, C., & Fan, J. (2009) Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation. *Annals of Statistics*, 37(6B), 4254–4278. DOI.
- LeWo04: Ledoit, O., & Wolf, M. (2004) A well-conditioned estimator for large-dimensional covariance matrices. *Journal of Multivariate Analysis*, 88(2), 365–411. DOI.
- Loh91: Loh, W.-L. (1991) Estimating covariance matrices II. *Journal of Multivariate Analysis*, 36(2), 163–174. DOI.
- MaMa84: Mardia, K. V., & Marshall, R. J. (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. *Biometrika*, 71(1), 135–146. DOI.
- MeBü06: Meinshausen, N., & Bühlmann, P. (2006) High-dimensional graphs and variable selection with the lasso. *The Annals of Statistics*, 34(3), 1436–1462. DOI.
- MiMc05: Minasny, B., & McBratney, A. B. (2005) The Matérn function as a general model for soil variograms. *Geoderma*, 128(3–4), 192–207. DOI.
- NoLi13: Nowak, W., & Litvinenko, A. (2013) Kriging and Spatial Design Accelerated by Orders of Magnitude: Combining Low-Rank Covariance Approximations with FFT-Techniques. *Mathematical Geosciences*, 45(4), 411–435. DOI.
- Péba08: Pébay, P. (2008) Formulas for robust, one-pass parallel computation of covariances and arbitrary-order statistical moments. *Sandia Report SAND2008-6212, Sandia National Laboratories*.
- RaWe14: Ramdas, A., & Wehbe, L. (2014) Stein Shrinkage for Cross-Covariance Operators and Kernel Independence Testing. *arXiv:1406.1922 [Stat]*.
- RaWi06: Rasmussen, C. E., & Williams, C. K. I. (2006) Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press.
- RWRY11: Ravikumar, P., Wainwright, M. J., Raskutti, G., & Yu, B. (2011) High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence. *Electronic Journal of Statistics*, 5, 935–980. DOI.
- Rose84: Rosenblatt, M. (1984) Asymptotic Normality, Strong Mixing and Spectral Density Estimates. *The Annals of Probability*, 12(4), 1167–1180. DOI.
- SaGu92: Sampson, P. D., & Guttorp, P. (1992) Nonparametric estimation of nonstationary spatial covariance structure. *Journal of the American Statistical Association*, 87(417), 108–119.
- ScSt05: Schäfer, J., & Strimmer, K. (2005) A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. *Statistical Applications in Genetics and Molecular Biology*, 4, Article 32. DOI.
- ShWu07: Shao, X., & Wu, W. B. (2007) Asymptotic spectral theory for nonlinear time series. *The Annals of Statistics*, 35(4), 1773–1801. DOI.
- Stei05: Stein, M. L. (2005) Space-time covariance functions. *Journal of the American Statistical Association*, 100(469), 310–321. DOI.
- SuSt16: Sun, Y., & Stein, M. L. (2016) Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets. *Journal of Computational and Graphical Statistics*, 25(1), 187–208. DOI.
- Take84: Takemura, A. (1984) An Orthogonally Invariant Minimax Estimator of the Covariance Matrix of a Multivariate Normal Population. *Tsukuba Journal of Mathematics*, 8(2), 367–376.
- YuLi07: Yuan, M., & Lin, Y. (2007) Model selection and estimation in the Gaussian graphical model. *Biometrika*, 94(1), 19–35. DOI.
- ZhZo14: Zhang, T., & Zou, H. (2014) Sparse precision matrix estimation via lasso penalized D-trace loss. *Biometrika*, 101(1), 103–120. DOI.