# The interpretation of densities as intensities and vice versa

## Densities

Consider the problem of estimating the common density $$f(x)=dF(x)$$ of i.i.d. random variables $$\{X_i\}_{i\leq n}\in \mathbb{R}^d$$ with distribution function $$F:\mathbb{R}^d\rightarrow[0,1]$$, from $$n$$ realisations of those variables, $$\{x_i\}_{i\leq n}$$. We assume the law is absolutely continuous with respect to the Lebesgue measure $$\mu$$, i.e. $$\mu(A)=0\Rightarrow P(X_i\in A)=0$$. Amongst other things, this implies that $$P(X_i=X_j)=0\text{ for }i\neq j$$, and that the density exists as a standard function (i.e. we do not need to consider generalised functions such as distributions to handle atoms in $$F$$ etc.)

Here we will give the density a finite parameter vector $$\theta$$, i.e. $$f(x;\theta)=dF(x;\theta)$$, whose value completely characterises the density; the problem of estimating the density is then the same as the one of estimating $$\theta.$$

In the method of maximum likelihood estimation we seek to maximise the likelihood of the observed data. That is, we choose a parameter estimate $$\hat{\theta}$$ to satisfy

\begin{align*} \hat{\theta} &:=\operatorname{argmax}_\theta\prod_i f(x_i;\theta)\\\\ &=\operatorname{argmax}_\theta\sum_i \log f(x_i;\theta) \end{align*}
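As a minimal numerical sketch of this maximisation (assuming, purely for illustration, an exponential density $$f(x;\theta)=e^{-x/\theta}/\theta$$ with scale parameter $$\theta$$, and a crude grid search in place of a proper optimiser):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5000)  # draws with true scale theta = 2

# Negative log-likelihood of f(x; theta) = exp(-x/theta) / theta,
# i.e. the negated sum of log f(x_i; theta) from the display above
def neg_log_lik(theta):
    return x.size * np.log(theta) + x.sum() / theta

# Maximise the likelihood by a crude grid search over theta
grid = np.linspace(0.5, 5.0, 2000)
theta_hat = grid[np.argmin([neg_log_lik(t) for t in grid])]

# For this family the MLE has the closed form theta_hat = mean(x),
# which the grid search should recover up to grid resolution
print(theta_hat, x.mean())
```

For richer parameterisations there is generally no closed form and one maximises the log-likelihood numerically instead.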

Let’s consider the case where we try to estimate this function by constructing it from some basis of $$p$$ functions $$\phi_j:\mathbb{R}^d\rightarrow\mathbb{R}$$.

Questions:

1. Can [functional data analysis](functional_data.md) get me here? (see Ch. 21 of Ramsay and Silverman)
2. Can I use point process estimation theory to improve density estimation? After all, normal point-process estimation claims to be an un-normalised version of density estimation. Lies11 draws some parallels there, esp. with mixture models.
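One concrete way to realise the basis-expansion idea is to expand the *log*-density, $$\log f(x;\theta)=\sum_j\theta_j\phi_j(x)-\log Z(\theta)$$, which keeps $$f$$ positive and yields an exponential family. The sketch below assumes a 1-d problem, a polynomial basis $$\phi_j(x)=x^j$$, quadrature on a grid for the normaliser $$Z(\theta)$$, and plain gradient ascent on the mean log-likelihood; all of these are illustrative choices, not a recommendation:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=4000)  # data whose density we pretend not to know

# Hypothetical polynomial basis phi_j(v) = v^j, j = 1..p
p = 2
def phi(v):
    return np.stack([v**j for j in range(1, p + 1)], axis=-1)

# Grid for numerical normalisation of f(x; theta) = exp(theta . phi(x)) / Z(theta)
grid = np.linspace(-6.0, 6.0, 1201)
dx = grid[1] - grid[0]
phig = phi(grid)

theta = np.zeros(p)
for _ in range(500):
    w = np.exp(phig @ theta)
    w /= (w * dx).sum()                      # normalised density on the grid
    e_phi = (w[:, None] * phig * dx).sum(0)  # E_theta[phi(X)] by quadrature
    # Gradient of the mean log-likelihood: mean(phi(x_i)) - E_theta[phi(X)]
    theta += 0.1 * (phi(x).mean(axis=0) - e_phi)

# For N(0, 1) data the population optimum is theta = (0, -1/2),
# since log f = -x^2/2 - log Z in this basis
print(theta)
```

Note the gradient is exactly the moment-matching condition of exponential families: at the optimum, the empirical means of the basis functions equal their means under the fitted density.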