The Living Thing / Notebooks :

Densities and intensities

The other weird normalization problem

The interpretation of densities as intensities and vice versa


Consider the problem of estimating the common density \(f(x)=dF(x)/dx\) of i.i.d. random variables \(\{X_i\}_{i\leq n}\in \mathbb{R}^d\) with common distribution function \(F:\mathbb{R}^d\rightarrow[0,1],\) from \(n\) realisations of those variables, \(\{x_i\}_{i\leq n}.\) We assume the law of the \(X_i\) is absolutely continuous with respect to the Lebesgue measure \(\mu\), i.e. \(\mu(A)=0\Rightarrow P(X_i\in A)=0\). Amongst other things, this implies that \(P(X_i=X_j)=0\text{ for }i\neq j\) and that the density exists as a standard function (i.e. we do not need to consider generalised functions such as distributions to handle atoms in \(F\) etc.)

Here we will give the density a finite parameter vector \(\theta\), i.e. \(f(x;\theta)=dF(x;\theta)\), whose value completely characterises the density; the problem of estimating the density is then the same as the one of estimating \(\theta.\)

In the method of maximum likelihood estimation we seek to maximise the likelihood of the observed data. That is, we choose a parameter estimate \(\hat{\theta}\) to satisfy

\[ \begin{align*} \hat{\theta} &:=\operatorname{argmax}_\theta\prod_i f(x_i;\theta)\\\\ &=\operatorname{argmax}_\theta\sum_i \log f(x_i;\theta) \end{align*} \]
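As a concrete sketch of this recipe (my own illustration, not from the text): take a toy parametric family, a Gaussian with \(\theta=(\mu,\log\sigma)\), and fit it by numerically minimising the negative log-likelihood. The data-generating parameters and the choice of `scipy.optimize.minimize` are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)  # n realisations x_i

def neg_log_lik(theta, x):
    """Negative Gaussian log-likelihood with theta = (mu, log_sigma)."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)  # parameterise by log sigma to keep sigma > 0
    return -np.sum(
        -0.5 * np.log(2 * np.pi) - log_sigma - 0.5 * ((x - mu) / sigma) ** 2
    )

res = minimize(neg_log_lik, x0=np.array([0.0, 0.0]), args=(x,))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
```

For this family the maximiser is available in closed form (\(\hat{\mu}\) is the sample mean, \(\hat{\sigma}\) the uncorrected sample standard deviation), which makes the numerical answer easy to check.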

Let’s consider the case where we try to estimate this function by constructing it from some basis of \(p\) functions \(\phi_j:\mathbb{R}^d\rightarrow\mathbb{R}\). Two questions:

1. Does a basis expansion of the (log-)density get me here? (See Ch. 21 of Ramsay and Silverman.)
2. Can I use point process estimation theory to improve density estimation? After all, normal point-process estimation claims to be an un-normalised version of density estimation. Lies11 draws some parallels there, esp. with mixture models.
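To make the normalisation problem concrete, here is a minimal sketch (my own illustration, under assumed choices): model \(\log f(x;\theta) = \sum_j \theta_j \phi_j(x) - \log Z(\theta)\) with a hypothetical polynomial basis \(\phi_j(x)=x^j\) in one dimension, and pay for the normaliser \(Z(\theta)\) with numerical quadrature at every likelihood evaluation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.normal(size=500)  # 1-d data; basis and grid below are illustrative choices

p = 2                                # basis functions phi_j(x) = x**j, j = 1..p
grid = np.linspace(-5.0, 5.0, 2001)  # quadrature grid for the normaliser Z(theta)
dx = grid[1] - grid[0]

def features(z):
    """Evaluate the basis phi_1..phi_p at the points z."""
    return np.stack([z ** j for j in range(1, p + 1)], axis=-1)

Phi_x, Phi_g = features(x), features(grid)

def neg_log_lik(theta):
    # log f(x; theta) = theta . phi(x) - log Z(theta); Z(theta) by a Riemann sum
    log_z = np.log(np.exp(Phi_g @ theta).sum() * dx)
    return -(Phi_x @ theta).sum() + len(x) * log_z

theta_hat = minimize(neg_log_lik, np.zeros(p)).x
```

With \(p=2\) this is just the Gaussian family in exponential-family coordinates, so \(\hat{\theta}_2\approx -1/(2\hat{\sigma}^2)\approx -0.5\) here. For richer bases, recomputing \(Z(\theta)\) inside the optimisation loop is exactly the normalisation headache; the Poisson point-process likelihood replaces the log-normaliser with an integral of the intensity itself, which is one of the parallels gestured at above.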