# Kernel density estimators

Usefulness: 🔧

Novelty: 💡

Uncertainty: 🤪 🤪 🤪

Incompleteness: 🚧 🚧 🚧

A nonparametric method of approximating a target function from data by assuming that it is close to the data distribution convolved with some kernel.

This is especially popular when the target is a probability density function; then you are working with a kernel density estimator.
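To fix notation, here is a minimal sketch of a Gaussian kernel density estimator in Python with NumPy. The function name and parameters are illustrative, not from any particular library:

```python
import numpy as np

def gaussian_kde(data, grid, bandwidth):
    """Evaluate f_hat(x) = (1 / (n h)) * sum_i K((x - x_i) / h) on `grid`,
    with K the standard normal density and h the bandwidth."""
    data = np.asarray(data, dtype=float)
    u = (grid[:, None] - data[None, :]) / bandwidth        # shape (m, n)
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)    # standard normal pdf
    return kernel.sum(axis=1) / (len(data) * bandwidth)

rng = np.random.default_rng(0)
x = rng.normal(size=500)
grid = np.linspace(-4.0, 4.0, 81)
f_hat = gaussian_kde(x, grid, bandwidth=0.4)
# f_hat is nonnegative and integrates to approximately 1 over a wide grid.
```

Each data point contributes one bump; the bandwidth controls the smoothing, which is where all the interesting selection problems below come from.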

To learn about:

• “Effective local sample size”

• Understand the Fréchet derivative + Wiener filtering construction used to derive the optimal kernel shape in BePi11 and OKCC16.

### Bandwidth/kernel selection in density estimation

Bernacchia and Pigolotti (BePi11) have a neat hack: “self-consistency” for simultaneous kernel and distribution inference, i.e. simultaneous deconvolution and bandwidth selection. The idea is to remove bias using simple spectral methods, thereby estimating a kernel which, in a certain sense, would generate the data that you just observed. The results look similar to finite-sample corrections for Gaussian scale parameter estimates, but are not quite Gaussian.
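The spectral machinery is easy to sketch. A Gaussian KDE can be evaluated entirely in the Fourier domain: the estimate’s characteristic function is the empirical characteristic function of the data, damped by the kernel’s transform. BePi11 go further and solve a fixed-point equation for the optimal damping itself; the sketch below (function name my own, plain Gaussian damping) shows only the spectral evaluation step they build on:

```python
import numpy as np

def spectral_kde(data, grid, h, t_max=20.0, n_t=400):
    """Evaluate a Gaussian KDE via its Fourier transform: the empirical
    characteristic function damped by the kernel's transform exp(-(h t)^2 / 2),
    inverted by trapezoid quadrature."""
    t = np.linspace(0.0, t_max, n_t)
    w = np.full(n_t, t[1] - t[0])
    w[0] *= 0.5
    w[-1] *= 0.5                                         # trapezoid weights
    ecf = np.exp(1j * np.outer(t, data)).mean(axis=1)    # empirical char. fn
    phi = ecf * np.exp(-0.5 * (h * t) ** 2)              # damp by kernel's FT
    # Inverse transform; conjugate symmetry folds t < 0 onto t > 0.
    return (np.cos(np.outer(grid, t)) @ (w * phi.real)
            + np.sin(np.outer(grid, t)) @ (w * phi.imag)) / np.pi

rng = np.random.default_rng(1)
x = rng.normal(size=300)
grid = np.linspace(-3.0, 3.0, 25)
f_spec = spectral_kde(x, grid, h=0.4)
```

With the damping truncated at `t_max` this agrees with the direct space-domain sum to quadrature accuracy; the self-consistent trick amounts to replacing the fixed Gaussian damping with one estimated from the data.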

Question: could it work with mixture models too?

### Mixture models

Where the number of kernels does not grow as fast as the number of data points, this becomes a mixture model; or, if you’d like, kernel density estimates are a limiting case of mixture model estimates.

They are so clearly similar that I think it best we not make them both feel awkward by dithering about where the free parameters are. Anyway, they are filed separately. BaLi13, ZeMe97 and Geer96 discuss some useful things common to various convex combination estimators.

### Does this work with uncertain point locations?

The fact that we can write the kernel density estimate as the kernel convolved with a sum of Dirac deltas immediately suggests replacing the deltas with something else, such as Gaussians. Can we recover well-behaved estimates in that case? This would be a kind of hierarchical model, possibly a typical Bayesian one.
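One concrete version of this, as a hypothetical sketch: if each observation carries its own Gaussian location uncertainty, convolving a Gaussian kernel of bandwidth h with the point’s Gaussian gives another Gaussian whose variance is the sum h² + σᵢ², so the estimate stays a simple mixture with per-point widened bandwidths (names below are my own):

```python
import numpy as np

def kde_noisy_points(centers, sigmas, grid, h):
    """KDE when each data point is a Gaussian N(center, sigma^2) rather than
    a Dirac delta: each mixture component gets variance h^2 + sigma^2."""
    centers = np.asarray(centers, dtype=float)
    s2 = h**2 + np.asarray(sigmas, dtype=float) ** 2     # widened variances
    u = grid[:, None] - centers[None, :]
    comp = np.exp(-0.5 * u**2 / s2) / np.sqrt(2.0 * np.pi * s2)
    return comp.mean(axis=1)

grid = np.linspace(-8.0, 8.0, 401)
sigmas = [0.2, 0.5, 0.1]                                 # per-point location noise
f = kde_noisy_points([0.0, 1.0, -1.5], sigmas, grid, h=0.5)
```

With all sigmas zero this reduces to the ordinary Gaussian KDE, which is a reassuring sanity check, though it says nothing yet about how bandwidth selection should react to the extra smearing.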

### Does this work with asymmetric kernels?

Almost all the kernel estimates I’ve seen require KDEs to be symmetric, because of Cline’s argument that asymmetric kernels are inadmissible in the class of all (possibly multivariate) densities. Presumably this implies $$\mathcal{C}^1$$ densities, i.e. once-differentiable ones. In particular, admissible kernels are those which have “nonnegative Fourier transforms bounded by 1”, which implies symmetry about the axis. If we have an a priori constrained class of densities, this might not apply.
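The Fourier condition is easy to poke at numerically: a transform that is real (a fortiori, nonnegative) forces the kernel to be an even function. A small sketch comparing the characteristic functions of a symmetric Gaussian kernel and an asymmetric exponential one, both standard closed forms:

```python
import numpy as np

t = np.linspace(-5.0, 5.0, 201)

# Characteristic functions of two candidate kernels:
phi_gauss = np.exp(-0.5 * t**2)      # N(0, 1): real-valued, in [0, 1]
phi_exp = 1.0 / (1.0 - 1j * t)       # Exp(1): complex-valued

# The "nonnegative FT bounded by 1" condition holds for the Gaussian but
# fails for the asymmetric exponential, whose transform has a nonzero
# imaginary part (equivalently, the kernel is not even).
```

The imaginary part of the exponential’s transform peaks at magnitude 1/2 (at t = ±1), so the asymmetry is not a borderline violation.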

### Fast Gauss Transform and Fast multipole methods

How to make these methods computationally feasible at scale: see the Fast Gauss Transform and other related fast multipole methods.
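Before reaching for a full fast multipole scheme, the basic trick is worth sketching: bin the data onto the evaluation grid and convolve once with a sampled kernel, replacing the O(NM) direct sum with a single discrete convolution. This is plain binning, not a true fast Gauss transform (which uses truncated Hermite expansions with controlled error); the function name is my own:

```python
import numpy as np

def binned_kde(data, grid, h):
    """Approximate a Gaussian KDE by histogramming the data onto the
    evaluation grid and convolving with a sampled, truncated kernel."""
    dx = grid[1] - grid[0]
    edges = np.concatenate([grid - dx / 2, [grid[-1] + dx / 2]])
    counts, _ = np.histogram(data, bins=edges)     # one bin per grid point
    half = int(np.ceil(4.0 * h / dx))              # truncate kernel at 4 sigma
    k = np.arange(-half, half + 1) * dx
    kernel = np.exp(-0.5 * (k / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    return np.convolve(counts, kernel, mode="same") / len(data)

rng = np.random.default_rng(2)
x = np.clip(rng.normal(size=2000), -3.5, 3.5)
grid = np.linspace(-4.0, 4.0, 401)
f_fast = binned_kde(x, grid, h=0.3)
```

The binning error is of order the grid spacing, so this degrades for very fine structure; the fast Gauss transform earns its complexity by giving the same speedup with a tunable error bound and no grid.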

# Refs

Aalen, Odd. 1978. “Nonparametric Inference for a Family of Counting Processes.” The Annals of Statistics 6 (4): 701–26. https://doi.org/10.1214/aos/1176344247.

Adelfio, Giada, and Frederic Paik Schoenberg. 2009. “Point Process Diagnostics Based on Weighted Second-Order Statistics and Their Asymptotic Properties.” Annals of the Institute of Statistical Mathematics 61 (4): 929–48. https://doi.org/10.1007/s10463-008-0177-1.

Baddeley, Adrian, and Rolf Turner. 2006. “Modelling Spatial Point Patterns in R.” In Case Studies in Spatial Point Process Modeling, edited by Adrian Baddeley, Pablo Gregori, Jorge Mateu, Radu Stoica, and Dietrich Stoyan, 23–74. Lecture Notes in Statistics 185. Springer New York. http://link.springer.com/chapter/10.1007/0-387-31144-0_2.

Baddeley, A., R. Turner, J. Møller, and M. Hazelton. 2005. “Residual Analysis for Spatial Point Processes (with Discussion).” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (5): 617–66. https://doi.org/10.1111/j.1467-9868.2005.00519.x.

Barnes, Josh, and Piet Hut. 1986. “A Hierarchical O(N Log N) Force-Calculation Algorithm.” Nature 324 (6096): 446–49. https://doi.org/10.1038/324446a0.

Bashtannyk, David M., and Rob J. Hyndman. 2001. “Bandwidth Selection for Kernel Conditional Density Estimation.” Computational Statistics & Data Analysis 36 (3): 279–98. https://doi.org/10.1016/S0167-9473(00)00046-3.

Battey, Heather, and Han Liu. 2013. “Smooth Projected Density Estimation,” August. http://arxiv.org/abs/1308.3968.

Berman, Mark, and Peter Diggle. 1989. “Estimating Weighted Integrals of the Second-Order Intensity of a Spatial Point Process.” Journal of the Royal Statistical Society. Series B (Methodological) 51 (1): 81–92. https://publications.csiro.au/rpr/pub?list=BRO&pid=procite:d5b7ecd7-435c-4dab-9063-f1cf2fbdf4cb.

Bernacchia, Alberto, and Simone Pigolotti. 2011. “Self-Consistent Method for Density Estimation.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (3): 407–22. https://doi.org/10.1111/j.1467-9868.2011.00772.x.

Botev, Z. I., J. F. Grotowski, and D. P. Kroese. 2010. “Kernel Density Estimation via Diffusion.” The Annals of Statistics 38 (5): 2916–57. https://doi.org/10.1214/10-AOS799.

Crisan, Dan, and Joaquín Míguez. 2014. “Particle-Kernel Estimation of the Filter Density in State-Space Models.” Bernoulli 20 (4): 1879–1929. https://doi.org/10.3150/13-BEJ545.

Cucala, Lionel. 2008. “Intensity Estimation for Spatial Point Processes Observed with Noise.” Scandinavian Journal of Statistics 35 (2): 322–34. https://doi.org/10.1111/j.1467-9469.2007.00583.x.

Diggle, Peter. 1985. “A Kernel Method for Smoothing Point Process Data.” Journal of the Royal Statistical Society. Series C (Applied Statistics) 34 (2): 138–47. https://doi.org/10.2307/2347366.

Diggle, Peter J. 1979. “On Parameter Estimation and Goodness-of-Fit Testing for Spatial Point Patterns.” Biometrics 35 (1): 87–101. https://doi.org/10.2307/2529938.

Díaz-Avalos, Carlos, P. Juan, and J. Mateu. 2012. “Similarity Measures of Conditional Intensity Functions to Test Separability in Multidimensional Point Processes.” Stochastic Environmental Research and Risk Assessment 27 (5): 1193–1205. https://doi.org/10.1007/s00477-012-0654-1.

Doosti, Hassan, and Peter Hall. 2015. “Making a Non-Parametric Density Estimator More Attractive, and More Accurate, by Data Perturbation.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 78 (2): 445–62. https://doi.org/10.1111/rssb.12120.

Ellis, Steven P. 1991. “Density Estimation for Point Processes.” Stochastic Processes and Their Applications 39 (2): 345–58. https://doi.org/10.1016/0304-4149(91)90087-S.

Geenens, Gery. 2014. “Probit Transformation for Kernel Density Estimation on the Unit Interval.” Journal of the American Statistical Association 109 (505): 346–58. https://doi.org/10.1080/01621459.2013.842173.

Geer, Sara van de. 1996. “Rates of Convergence for the Maximum Likelihood Estimator in Mixture Models.” Journal of Nonparametric Statistics 6 (4): 293–310. https://doi.org/10.1080/10485259608832677.

Gisbert, Francisco J. Goerlich. 2003. “Weighted Samples, Kernel Density Estimators and Convergence.” Empirical Economics 28 (2): 335–51. https://doi.org/10.1007/s001810200134.

Greengard, L., and J. Strain. 1991. “The Fast Gauss Transform.” SIAM Journal on Scientific and Statistical Computing 12 (1): 79–94. https://doi.org/10.1137/0912004.

Hall, Peter. 1987. “On Kullback-Leibler Loss and Density Estimation.” The Annals of Statistics 15 (4): 1491–1519. https://doi.org/10.1214/aos/1176350606.

Hall, Peter, and Byeong U. Park. 2002. “New Methods for Bias Correction at Endpoints and Boundaries.” The Annals of Statistics 30 (5): 1460–79. https://doi.org/10.1214/aos/1035844983.

Helmers, Roelof, I. Wayan Mangku, and Ričardas Zitikis. 2003. “Consistent Estimation of the Intensity Function of a Cyclic Poisson Process.” Journal of Multivariate Analysis 84 (1): 19–39. https://doi.org/10.1016/S0047-259X(02)00008-8.

Koenker, Roger, and Ivan Mizera. 2006. “Density Estimation by Total Variation Regularization.” Advances in Statistical Modeling and Inference, 613–34. http://ysidro.econ.uiuc.edu/~roger/research/densiles/Doksum.pdf.

Lieshout, Marie-Colette N. M. van. 2011. “On Estimation of the Intensity Function of a Point Process.” Methodology and Computing in Applied Probability 14 (3): 567–78. https://doi.org/10.1007/s11009-011-9244-9.

Liu, Guangcan, Shiyu Chang, and Yi Ma. 2012. “Blind Image Deblurring by Spectral Properties of Convolution Operators,” September. http://arxiv.org/abs/1209.2082.

Malec, Peter, and Melanie Schienle. 2014. “Nonparametric Kernel Density Estimation Near the Boundary.” Computational Statistics & Data Analysis 72 (April): 57–76. https://doi.org/10.1016/j.csda.2013.10.023.

Marshall, Jonathan C., and Martin L. Hazelton. 2010. “Boundary Kernels for Adaptive Density Estimators on Regions with Irregular Boundaries.” Journal of Multivariate Analysis 101 (4): 949–63. https://doi.org/10.1016/j.jmva.2009.09.003.

O’Brien, Travis A., Karthik Kashinath, Nicholas R. Cavanaugh, William D. Collins, and John P. O’Brien. 2016. “A Fast and Objective Multidimensional Kernel Density Estimation Method: fastKDE.” Computational Statistics & Data Analysis 101 (September): 148–60. https://doi.org/10.1016/j.csda.2016.02.014.

Panaretos, Victor M., and Kjell Konis. 2012. “Nonparametric Construction of Multivariate Kernels.” Journal of the American Statistical Association 107 (499): 1085–95. https://doi.org/10.1080/01621459.2012.695657.

Park, B. U., Seok-Oh Jeong, M. C. Jones, and Kee-Hoon Kang. 2003. “Adaptive Variable Location Kernel Density Estimators with Good Performance at Boundaries.” Journal of Nonparametric Statistics 15 (1): 61–75. https://doi.org/10.1080/10485250306041.

Rathbun, Stephen L. 1996. “Estimation of Poisson Intensity Using Partially Observed Concomitant Variables.” Biometrics, 226–42. http://www.jstor.org/stable/2533158.

Raykar, Vikas C., and Ramani Duraiswami. 2005. “The Improved Fast Gauss Transform with Applications to Machine Learning.” Presented at NIPS. http://www.umiacs.umd.edu/users/vikas/publications/IFGT_slides.pdf.

Silverman, B. W. 1982. “On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method.” The Annals of Statistics 10 (3): 795–810. https://doi.org/10.1214/aos/1176345872.

Smith, Evan, and Michael S. Lewicki. 2005. “Efficient Coding of Time-Relative Structure Using Spikes.” Neural Computation 17 (1): 19–45. https://doi.org/10.1162/0899766052530839.

Stein, Michael L. 2005. “Space-Time Covariance Functions.” Journal of the American Statistical Association 100 (469): 310–21. https://doi.org/10.1198/016214504000000854.

Wang, Bin, and Xiaofeng Wang. 2007. “Bandwidth Selection for Weighted Kernel Density Estimation.” arXiv Preprint arXiv:0709.1616. http://www.planchet.net/EXT/ISFA/1226.nsf/769998e0a65ea348c1257052003eb94f/ede860a850cc6634c12573bb004a2413/\$FILE/Kernel.pdf.

Wen, Kuangyu, and Ximing Wu. 2015. “An Improved Transformation-Based Kernel Estimator of Densities on the Unit Interval.” Journal of the American Statistical Association 110 (510): 773–83. https://doi.org/10.1080/01621459.2014.969426.

Yang, Changjiang, Ramani Duraiswami, and Larry S. Davis. 2004. “Efficient Kernel Machines Using the Improved Fast Gauss Transform.” In Advances in Neural Information Processing Systems, 1561–8. http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2005_439.pdf.

Yang, Changjiang, Ramani Duraiswami, Nail A. Gumerov, and Larry Davis. 2003. “Improved Fast Gauss Transform and Efficient Kernel Density Estimation.” In Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2, 464. ICCV ’03. Washington, DC, USA: IEEE Computer Society. https://doi.org/10.1109/ICCV.2003.1238383.

Zeevi, Assaf J., and Ronny Meir. 1997. “Density Estimation Through Convex Combinations of Densities: Approximation and Estimation Bounds.” Neural Networks: The Official Journal of the International Neural Network Society 10 (1): 99–109. https://doi.org/10.1016/S0893-6080(96)00037-8.

Zhang, Shunpu, and Rohana J. Karunamuni. 2010. “Boundary Performance of the Beta Kernel Estimators.” Journal of Nonparametric Statistics 22 (1): 81–104. https://doi.org/10.1080/10485250903124984.