The Living Thing / Notebooks : Independence, conditional, statistical

Testing whether two variables are independent, or conditionally independent given some others, in a general setting, as needed in e.g. graphical models.

There is a connection with model selection: once a model has accepted enough of the true hypotheses, the residual it leaves is independent of the predictors. (TODO: clarify.)

I’m also interested in how you do this for long-memory processes: effectively, finding out how long the memory is by seeing how much recent history you have to condition on before the future is conditionally independent of the rest of the past. With ARIMA or Markov models, this is precisely the model order, which you can select using, e.g., an information criterion.
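As a minimal sketch of the order-selection idea, here is BIC applied to a discrete Markov chain, using only the standard library. The helper names (`simulate_order2_chain`, `markov_bic`, `select_order`) and the toy chain are my own illustrative assumptions, not from any particular package. The simulated chain has memory exactly 2: each symbol copies the one two steps back with probability 0.9.

```python
import math
import random
from collections import Counter


def simulate_order2_chain(n, stay_prob=0.9, seed=0):
    """Binary chain where x[t] copies x[t-2] with probability stay_prob."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1), rng.randint(0, 1)]
    for t in range(2, n):
        x.append(x[t - 2] if rng.random() < stay_prob else 1 - x[t - 2])
    return x


def markov_bic(x, order, max_order, n_states=2):
    """BIC of a Markov chain of the given order, fitted by maximum likelihood.

    Every candidate order is scored on the same targets x[max_order:],
    so the BIC values are comparable across orders.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for t in range(max_order, len(x)):
        context = tuple(x[t - order:t])  # the last `order` symbols
        context_counts[context] += 1
        transition_counts[context, x[t]] += 1
    log_lik = sum(
        count * math.log(count / context_counts[context])
        for (context, _), count in transition_counts.items()
    )
    n_params = (n_states - 1) * n_states ** order
    n_obs = len(x) - max_order
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_order(x, max_order=4):
    """Return (order with the lowest BIC, dict of BIC per order)."""
    scores = {k: markov_bic(x, k, max_order) for k in range(max_order + 1)}
    return min(scores, key=scores.get), scores


x = simulate_order2_chain(5000)
best_order, scores = select_order(x)
print("selected order:", best_order)
for k, bic in scores.items():
    print(k, round(bic, 1))
```

The selected order is the estimated memory length: conditioning on that many past symbols, longer histories no longer improve the penalised fit.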

Independence test from data

Traditional tests

There are special cases where this is easy: for binary (or, more generally, categorical) data we have χ² tests; for jointly Gaussian variables independence is equivalent to zero correlation, so the problem is simply one of covariance estimation. More generally, likelihood ratio tests easily give us what is effectively a test of independence in estimation problems in exponential families. (c&c Basu’s lemma.)
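A rough sketch of the two easy cases, standard library only. The function names are hypothetical. For the Gaussian case I assume a large-sample normal approximation via the Fisher z-transform rather than the exact t-test; for the 2x2 table the χ² statistic has one degree of freedom, so its tail probability can be written with `math.erfc`.

```python
import math
import random


def chi2_independence_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With 1 degree of freedom the chi-squared
    survival function is erfc(sqrt(stat / 2)). Assumes no empty row or column.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row[i] * col[j] / n  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2.0))


def correlation_test(x, y):
    """Two-sided test of zero correlation via the Fisher z-transform.

    Returns (sample correlation, approximate p_value).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    return r, math.erfc(abs(z) / math.sqrt(2.0))


print(chi2_independence_2x2([[30, 10], [10, 30]]))  # strongly associated counts
print(chi2_independence_2x2([[20, 20], [20, 20]]))  # perfectly balanced counts

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.8 * xi + 0.6 * rng.gauss(0.0, 1.0) for xi in x]
print(correlation_test(x, y))
```

Note that the correlation test only characterises independence under joint Gaussianity; for other distributions zero correlation does not rule out dependence.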

Copula tests

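If we know the copula, we already know the dependence structure: the copula is what is left of the joint distribution once the marginals are stripped out, and it is unchanged by strictly increasing transformations of the individual variables. Independence corresponds to the product copula.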
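A small illustration of the invariance, assuming continuous data without ties. The rescaled ranks ("pseudo-observations") put each variable on the copula scale, so a rank-based dependence measure such as Spearman's rho does not move when the variables are pushed through strictly increasing maps, whereas the ordinary correlation does. All names here are illustrative.

```python
import math
import random


def pseudo_observations(values):
    """Map a sample to rescaled ranks in (0, 1), i.e. the empirical copula scale."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def pearson(x, y):
    """Ordinary sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the pseudo-observations."""
    return pearson(pseudo_observations(x), pseudo_observations(y))


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.7 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Strictly increasing transforms change the margins but not the copula.
x_t = [math.exp(xi) for xi in x]
y_t = [yi ** 3 for yi in y]

print("Pearson: ", pearson(x, y), pearson(x_t, y_t))            # these differ
print("Spearman:", spearman_rho(x, y), spearman_rho(x_t, y_t))  # these agree
```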

Information criteria

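Information criteria effectively do this: choosing between a model with a dependence term and one without it is an implicit test of independence. (TODO: clarify.)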

Kernel distribution embedding tests

I’m interested in the nonparametric independence tests of GFTS08, which use kernel tricks, although I don’t quite get how you conditionalise them.
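For the unconditional case, a minimal sketch of a kernel independence test built on the Hilbert-Schmidt Independence Criterion (HSIC), the statistic used in GFTS08. Two things here are my assumptions rather than the paper's recipe: I calibrate with a permutation null instead of their analytic approximation, and I pick the Gaussian kernel bandwidth by the median heuristic. It is written with nested lists to stay dependency-free; a real implementation would use an array library. Function names are hypothetical.

```python
import math
import random
import statistics


def gaussian_gram(values):
    """Gaussian-kernel Gram matrix with the median-heuristic bandwidth."""
    n = len(values)
    dists = [abs(values[i] - values[j]) for i in range(n) for j in range(i + 1, n)]
    bandwidth = statistics.median(dists)
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
        for a in values
    ]


def center(gram):
    """Double-centre a symmetric Gram matrix: H K H with H = I - 11'/n."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic_permutation_test(x, y, n_permutations=200, seed=0):
    """Biased HSIC estimate trace(K H L H) / n^2 and a permutation p-value."""
    n = len(x)
    k_centered = center(gaussian_gram(x))
    l_gram = gaussian_gram(y)

    def statistic(perm):
        # Permuting y is the same as permuting rows and columns of its Gram matrix.
        return sum(
            k_centered[i][j] * l_gram[perm[i]][perm[j]]
            for i in range(n) for j in range(n)
        ) / n ** 2

    identity = list(range(n))
    observed = statistic(identity)
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if statistic(perm) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_permutations)


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y_dep = [xi ** 2 + 0.1 * rng.gauss(0.0, 1.0) for xi in x]  # uncorrelated but dependent
y_ind = [rng.gauss(0.0, 1.0) for _ in x]                   # independent of x

stat_dep, p_dep = hsic_permutation_test(x, y_dep)
stat_ind, p_ind = hsic_permutation_test(x, y_ind)
print("dependent:   HSIC =", stat_dep, "p =", p_dep)
print("independent: HSIC =", stat_ind, "p =", p_ind)
```

The quadratic example is the interesting one: the correlation between `x` and `y_dep` is close to zero, so a correlation test would miss the dependence that the kernel statistic is designed to pick up.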

RCIT (StZV17) implements an approximate kernel distribution embedding conditional independence test via kernel approximation:

Constraint-based causal discovery (CCD) algorithms require fast and accurate conditional independence (CI) testing. The Kernel Conditional Independence Test (KCIT) is currently one of the most popular CI tests in the non-parametric setting, but many investigators cannot use KCIT with large datasets because the test scales cubicly with sample size. We therefore devise two relaxations called the Randomized Conditional Independence Test (RCIT) and the Randomized conditional Correlation Test (RCoT) which both approximate KCIT by utilizing random Fourier features. In practice, both of the proposed tests scale linearly with sample size and return accurate p-values much faster than KCIT in the large sample size context. CCD algorithms run with RCIT or RCoT also return graphs at least as accurate as the same algorithms run with KCIT but with large reductions in run time.
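The speed-up in that abstract comes from random Fourier features. The sketch below shows only that ingredient, not RCIT or RCoT themselves (which also handle the conditioning set and the null distribution): a Gaussian kernel evaluation is replaced by a dot product of D random cosine features, so kernel statistics can be computed from an n x D feature matrix instead of an n x n Gram matrix. The function names, the 1-D setting and the choice D = 2000 are illustrative assumptions.

```python
import math
import random


def make_rff(n_features, bandwidth, seed=0):
    """Random Fourier feature map for the 1-D Gaussian kernel
    k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, 1/bandwidth^2).
    freqs = [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(w * x + b) for w, b in zip(freqs, phases)]

    return features


def gaussian_kernel(x, y, bandwidth):
    """Exact Gaussian kernel, for comparison."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


features = make_rff(n_features=2000, bandwidth=1.0)
pairs = [(0.0, 0.0), (0.0, 0.5), (-1.0, 1.0), (0.3, 2.0)]
for x, y in pairs:
    approx = sum(a * b for a, b in zip(features(x), features(y)))
    exact = gaussian_kernel(x, y, 1.0)
    print(f"k({x}, {y}): exact={exact:.3f}, approx={approx:.3f}")
```

The approximation error shrinks like 1/sqrt(D), so the number of features trades accuracy against the cost of working with the feature matrix.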

Refs

BaSS04
Baba, K., Shibata, R., & Sibuya, M. (2004) Partial Correlation and Conditional Correlation as Measures of Conditional Independence. Australian & New Zealand Journal of Statistics, 46(4), 657–664.
CaRS15
Cassidy, B., Rae, C., & Solo, V. (2015) Brain Activity: Connectivity, Sparsity, and Mutual Information. IEEE Transactions on Medical Imaging, 34(4), 846–860.
Camp06
de Campos, L. M. (2006) A Scoring Function for Learning Bayesian Networks based on Mutual Information and Conditional Independence Tests. Journal of Machine Learning Research, 7, 2149–2187.
EmLM03
Embrechts, P., Lindskog, F., & McNeil, A. J. (2003) Modelling dependence with copulas and applications to risk management. In Handbook of Heavy Tailed Distributions in Finance (Ch. 8, pp. 329–384).
GFTS08
Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., & Smola, A. J. (2008) A Kernel Statistical Test of Independence. In Advances in Neural Information Processing Systems 20: Proceedings of the 2007 Conference. Cambridge, MA: MIT Press.
JeKH04
Jebara, T., Kondor, R., & Howard, A. (2004) Probability Product Kernels. Journal of Machine Learning Research, 5, 819–844.
Kac59
Kac, M. (1959) Statistical independence in probability, analysis and number theory (reprint). Washington, DC: Math. Assoc. of America.
MFSS16
Muandet, K., Fukumizu, K., Sriperumbudur, B., & Schölkopf, B. (2016) Kernel Mean Embedding of Distributions: A Review and Beyond. arXiv:1605.09522 [Cs, Stat].
SSGF12
Sejdinovic, D., Sriperumbudur, B., Gretton, A., & Fukumizu, K. (2012) Equivalence of distance-based and RKHS-based statistics in hypothesis testing. The Annals of Statistics, 41(5), 2263–2291.
SHSF09
Song, L., Huang, J., Smola, A., & Fukumizu, K. (2009) Hilbert Space Embeddings of Conditional Distributions with Applications to Dynamical Systems. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 961–968). New York, NY, USA: ACM.
SpMe95
Spirtes, P., & Meek, C. (1995) Learning Bayesian networks with discrete variables from data. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining.
SFGS12
Sriperumbudur, B. K., Fukumizu, K., Gretton, A., Schölkopf, B., & Lanckriet, G. R. G. (2012) On the empirical estimation of integral probability metrics. Electronic Journal of Statistics, 6, 1550–1599.
StZV17
Strobl, E. V., Zhang, K., & Visweswaran, S. (2017) Approximate Kernel-based Conditional Independence Tests for Fast Non-Parametric Causal Discovery. arXiv:1702.03877 [Stat].
Stud05
Studený, M. (2005) Probabilistic conditional independence structures. London: Springer.
Stud16
Studený, M. (2016) Basic facts concerning supermodular functions. arXiv:1612.06599 [Math, Stat].
SuWh07
Su, L., & White, H. (2007) A consistent characteristic function-based test for conditional independence. Journal of Econometrics, 141(2), 807–834.
SzRi09
Székely, G. J., & Rizzo, M. L. (2009) Brownian distance covariance. The Annals of Applied Statistics, 3(4), 1236–1265.
SzRB07
Székely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007) Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6), 2769–2794.
Tala96
Talagrand, M. (1996) A new look at independence. The Annals of Probability, 24(1), 1–34.
ThSS16
Thanei, G.-A., Meinshausen, N., & Shah, R. D. (2016) The xyz algorithm for fast interaction search in high-dimensional data. Arxiv, 20(9), 846–851.
YaZS16
Yao, S., Zhang, X., & Shao, X. (2016) Testing mutual independence in high dimension via distance covariance. arXiv:1609.09380 [Stat].
ZPJS12
Zhang, K., Peters, J., Janzing, D., & Schölkopf, B. (2012) Kernel-based Conditional Independence Test and Application in Causal Discovery. arXiv:1202.3775 [Cs, Stat].
ZFGS16
Zhang, Q., Filippi, S., Gretton, A., & Sejdinovic, D. (2016) Large-Scale Kernel Methods for Independence Testing. arXiv:1606.07892 [Stat].