# Independence, conditional, statistical

Testing whether two random variables are independent, possibly conditional on a third, in a general setting. Conditional independence is the basic ingredient of directed graphical models.

There is a connection with model selection, in the sense that accepting enough true hypotheses leaves you with a residual that is independent of the predictors. (TODO: clarify.)

## Tests

If you want to know not merely whether two variables are dependent but how strongly, you may want to estimate a probability metric from data, e.g. a distance between the joint distribution and the product of the marginals.

There are special cases where this is easy. For binary data we have χ² tests. For jointly Gaussian variables, independence is equivalent to zero correlation, so the problem reduces to covariance estimation. More generally, likelihood ratio tests give us what is effectively an independence test in estimation problems in exponential families. (cf. Basu’s lemma.)
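
The two easy cases above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a substitute for a statistics package: the function names and the example counts are made up for this note, the χ² test applies no continuity correction, and the correlation test relies on Fisher's z-transform, which is only approximately normal.

```python
import math
import random


def chi2_independence_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, obs_row in enumerate(table):
        for j, obs in enumerate(obs_row):
            # Count expected in cell (i, j) if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def correlation_test(xs, ys):
    """Approximate two-sided test of zero correlation via Fisher's z-transform.

    Returns (sample correlation, p_value). Assumes |r| < 1 and n > 3.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)  # approximately N(0, 1) under H0
    return r, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical counts: rows are X in {0, 1}, columns are Y in {0, 1}.
stat_dep, p_dep = chi2_independence_2x2([[30, 10], [10, 30]])
stat_ind, p_ind = chi2_independence_2x2([[20, 20], [20, 20]])

# Simulated Gaussian pair with true correlation 1 / sqrt(2).
rng = random.Random(0)
gx = [rng.gauss(0, 1) for _ in range(200)]
gy = [v + rng.gauss(0, 1) for v in gx]
r_hat, p_corr = correlation_test(gx, gy)

print(stat_dep, p_dep, stat_ind, p_ind, r_hat, p_corr)
```

In the Gaussian case the test of zero correlation is a test of independence only because the pair is jointly Gaussian. For other distributions, zero correlation does not rule out dependence.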

### Copula tests

If we know the copula and the variables are monotonically related, we already know the dependence structure.
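
One way to see why the copula carries the dependence structure: rank-based measures such as Spearman's rho depend on the data only through the ranks, so they are unchanged by strictly increasing transforms of each margin. The sketch below, with helper names invented for this note and simulated data, checks this numerically. It illustrates the invariance only and is not a copula-based test.

```python
import math
import random


def ranks(values):
    """Return ranks 1..n of the values (assumes no ties, as for continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def spearman_rho(xs, ys):
    """Spearman's rho via the no-ties formula 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [xi + rng.gauss(0, 1) for xi in x]

rho_raw = spearman_rho(x, y)
# exp and cubing are strictly increasing, so the ranks do not change.
rho_transformed = spearman_rho([math.exp(v) for v in x], [v ** 3 for v in y])

print(rho_raw, rho_transformed)
```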

### Information criteria

Information criteria effectively do this. (TODO: clarify.)

### Kernel distribution embedding tests

I’m interested in the nonparametric conditional independence tests of GFTS08, which use kernel tricks, although I don’t quite get how you conditionalise them.
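
For orientation, here is a minimal sketch of the simpler unconditional case: a kernel independence test based on the biased HSIC estimate, trace(KHLH)/n², calibrated by permutation. It does not address the conditioning question, and it is not claimed to be the exact procedure of GFTS08. The fixed bandwidth, the permutation calibration, the function names, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def gaussian_gram(values, bandwidth):
    """Gram matrix K[i][j] = exp(-(v_i - v_j)^2 / (2 * bandwidth^2))."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(k_centered, l_gram, perm=None):
    """Biased HSIC estimate trace(K H L H) / n^2, optionally permuting the y sample."""
    n = len(l_gram)
    if perm is None:
        perm = range(n)
    total = 0.0
    for i in range(n):
        k_row = k_centered[i]
        l_row = l_gram[perm[i]]
        for j in range(n):
            total += k_row[j] * l_row[perm[j]]
    return total / (n * n)


def hsic_permutation_test(xs, ys, bandwidth=1.0, n_perm=200, seed=0):
    """Return (HSIC statistic, permutation p-value) for H0: X independent of Y."""
    rng = random.Random(seed)
    k_c = center(gaussian_gram(xs, bandwidth))
    l_gram = gaussian_gram(ys, bandwidth)
    observed = hsic(k_c, l_gram)
    idx = list(range(len(xs)))
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # shuffling y breaks any dependence on x
        if hsic(k_c, l_gram, idx) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Simulated data: y depends on x, but the linear correlation is near zero.
rng = random.Random(1)
xs = [rng.uniform(-2, 2) for _ in range(60)]
ys = [v * v + 0.1 * rng.gauss(0, 1) for v in xs]
stat, p_value = hsic_permutation_test(xs, ys)
print(stat, p_value)
```

The Gram matrices make this quadratic in sample size per permutation, which is the cost that motivates approximate versions.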

RCIT (StZV17) implements an approximate kernel distribution embedding conditional independence test via kernel approximation:

> Constraint-based causal discovery (CCD) algorithms require fast and accurate conditional independence (CI) testing. The Kernel Conditional Independence Test (KCIT) is currently one of the most popular CI tests in the non-parametric setting, but many investigators cannot use KCIT with large datasets because the test scales cubicly with sample size. We therefore devise two relaxations called the Randomized Conditional Independence Test (RCIT) and the Randomized conditional Correlation Test (RCoT) which both approximate KCIT by utilizing random Fourier features. In practice, both of the proposed tests scale linearly with sample size and return accurate p-values much faster than KCIT in the large sample size context. CCD algorithms run with RCIT or RCoT also return graphs at least as accurate as the same algorithms run with KCIT but with large reductions in run time.
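
The random Fourier feature idea that the abstract mentions can be sketched as follows. This shows only the kernel approximation step, for a Gaussian kernel on scalars. It is not RCIT or RCoT, and the function names and parameter values are assumptions made for illustration.

```python
import math
import random


def make_rff(n_features, bandwidth, seed=0):
    """Return a feature map z with z(x).z(y) ~= exp(-(x - y)^2 / (2 * bandwidth^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from the Fourier transform of the Gaussian kernel.
    freqs = [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def z(x):
        return [scale * math.cos(w * x + b) for w, b in zip(freqs, phases)]

    return z


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


z = make_rff(n_features=5000, bandwidth=1.0)
x, y = 0.3, 1.1
approx = dot(z(x), z(y))                  # inner product of random features
exact = math.exp(-((x - y) ** 2) / 2.0)   # Gaussian kernel with bandwidth 1
print(approx, exact)
```

Because each observation is mapped to a fixed-length feature vector, downstream statistics can be computed from feature covariances rather than from n × n Gram matrices, which is what makes linear scaling in sample size possible.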

See also the ITE toolbox (information theoretical estimators).