
Estimation of fiddly information functionals of densities

Informing yourself from your data how informative your data was

Say I would like to know the mutual information of the processes generating two streams of observations, with weak assumptions on the form of the generation process. This is a normal sort of empirical probability metric estimation problem.

Information is harder than normal, because observations with low frequency have high influence on the estimate. It is easy to get a uselessly biased, or even inconsistent, estimator, especially in the nonparametric case.
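For reference, here is the quantity in question, written out under the assumption that the two streams have a joint density p(x, y) with marginals p(x) and p(y). The symbols and the hatted plug-in notation are mine, not taken from any particular paper.

```latex
% Mutual information of X and Y with joint density p(x, y)
I(X;Y) = \iint p(x,y) \, \log \frac{p(x,y)}{p(x)\,p(y)} \, \mathrm{d}x \, \mathrm{d}y

% Plug-in estimate over a finite alphabet, with empirical cell
% frequencies \hat{p}_{ij} and empirical marginals \hat{p}_{i\cdot}, \hat{p}_{\cdot j}
\hat{I} = \sum_{i,j} \hat{p}_{ij} \, \log \frac{\hat{p}_{ij}}{\hat{p}_{i\cdot}\,\hat{p}_{\cdot j}}
```

The trouble with rare observations shows up in the summand: the derivative of p log p is log p + 1, which is unbounded as p goes to 0, so small errors in small probabilities can move the estimate a lot.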

A typical technique is to construct a joint histogram from your samples, treat the bins as a finite alphabet, and then do the usual calculation. That throws out a lot of information, and it feels clunky and stupid, especially if you suspect your distributions might have some other kind of smoothness that you’d like to exploit.
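A minimal sketch of that histogram approach, assuming one-dimensional streams and NumPy. The function name `plugin_mutual_information`, the choice of 16 equal-width bins, and the synthetic Gaussian data are all my own illustrative assumptions, not a recommended recipe.

```python
import numpy as np


def plugin_mutual_information(x, y, bins=16):
    """Plug-in mutual information estimate (in nats) from a joint histogram."""
    # Joint histogram: the bins play the role of a finite alphabet.
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = counts / counts.sum()
    # Marginals of the empirical joint distribution.
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    # Empty cells contribute nothing (0 log 0 is taken as 0).
    mask = p_xy > 0
    ratio = p_xy[mask] / (p_x * p_y)[mask]
    return float(np.sum(p_xy[mask] * np.log(ratio)))


# Synthetic example: one dependent pair and one independent pair.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y_dep = x + 0.5 * rng.normal(size=n)  # depends on x
y_ind = rng.normal(size=n)            # independent of x

mi_dep = plugin_mutual_information(x, y_dep)
mi_ind = plugin_mutual_information(x, y_ind)
print(f"dependent pair:   {mi_dep:.3f} nats")
print(f"independent pair: {mi_ind:.3f} nats")
```

For this Gaussian pair the true mutual information is -0.5 log(1 - ρ²) with ρ² = 0.8, about 0.80 nats, and it is exactly zero for the independent pair. The plug-in estimate for the independent pair is nonetheless positive, and both numbers shift as you change `bins`, which is the sensitivity complained about here.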

You could also estimate the densities. Either way, the method is highly sensitive and can be arbitrarily wrong if you don’t do it right (see Paninski, 2003).

So, better alternatives?

One obvious one is asking yourself: Do I really want to know the information? Or do I merely wish to know that something is uninformative, i.e. to estimate some degree of independence? Independence is related, but has much more general strategies.
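As one illustration of a generic independence strategy, here is a sketch of a permutation test. It is my choice of example, not something this notebook prescribes: shuffle one stream to destroy any dependence, recompute a dependence statistic, and see how extreme the observed value is. The helper names `histogram_mi` and `permutation_p_value`, the 8 bins, and the 200 permutations are all assumptions, and any other dependence statistic could be passed in instead.

```python
import numpy as np


def histogram_mi(x, y, bins=8):
    """Crude histogram mutual information, used here only as a test statistic."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])))


def permutation_p_value(x, y, statistic=histogram_mi, n_permutations=200, seed=0):
    """P-value for the null hypothesis that x and y are independent."""
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    # Permuting y keeps both marginals but breaks any pairing with x.
    null = np.array(
        [statistic(x, rng.permutation(y)) for _ in range(n_permutations)]
    )
    # Add-one correction so the p-value is never exactly zero.
    return (1 + np.sum(null >= observed)) / (1 + n_permutations)


rng = np.random.default_rng(1)
x = rng.normal(size=1000)
p_dependent = permutation_p_value(x, x + rng.normal(size=1000))
p_independent = permutation_p_value(x, rng.normal(size=1000))
print("dependent pair p-value:  ", p_dependent)
print("independent pair p-value:", p_independent)
```

The point of the design is that the bias of the statistic matters much less here: the same biased statistic is computed on the observed and the shuffled data, so only its relative size is used.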

ITE toolbox (estimators)

To consider: