A nonparametric method of approximating something from data by assuming that it's close to the data distribution convolved with some kernel.
This is especially popular when the target is a probability density function; then you are working with a kernel density estimator.
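As a minimal sketch of the definition above, assuming a Gaussian kernel and a hand-picked bandwidth `h` (both illustrative choices, and `gaussian_kde_1d` is a hypothetical helper name), the estimate is the average of kernels centred on the data points, which is the empirical distribution convolved with the kernel:

```python
import numpy as np

def gaussian_kde_1d(data, grid, h):
    """Evaluate a 1-D Gaussian kernel density estimate on `grid`.

    `h` is the bandwidth (the kernel standard deviation). Hypothetical helper.
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance between every grid point and every data point.
    z = (grid[:, None] - data[None, :]) / h
    kernel = np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    # Averaging over data points is the same as convolving the
    # empirical measure (a sum of Dirac deltas) with the kernel.
    return kernel.mean(axis=1)

rng = np.random.default_rng(0)
sample = rng.normal(size=200)           # toy data, assumed for illustration
grid = np.linspace(-6.0, 6.0, 1201)
density = gaussian_kde_1d(sample, grid, h=0.4)
```

The cost is proportional to the number of grid points times the number of data points, which is what motivates the fast approximations mentioned later in these notes.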
To learn about:

“Effective local sample size”

Understand the Fréchet derivative and Wiener filtering construction used to derive the optimal kernel shape in BePi11 and OKCC16.
Bandwidth/kernel selection in density estimation
Bernacchia (BePi11) has a neat hack: “self consistency” for simultaneous kernel and distribution inference, i.e. simultaneous deconvolution and bandwidth selection. The idea is to remove bias using simple spectral methods, thereby estimating a kernel which, in a certain sense, would generate the data that you just observed. The results look similar to finite-sample corrections for Gaussian scale parameter estimates, but are not quite Gaussian.
Question: could it work with mixture models too?
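For contrast with the self-consistent approach, here is a sketch of a conventional baseline for the bandwidth half of the problem: leave-one-out likelihood cross-validation with a fixed Gaussian kernel. This is not the BePi11 method; the helper name `loo_log_likelihood`, the toy data and the candidate grid are all assumptions made for illustration.

```python
import numpy as np

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    data = np.asarray(data, dtype=float)
    n = data.size
    z = (data[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(k, 0.0)  # leave each point out of its own estimate
    loo_density = k.sum(axis=1) / (n - 1)
    with np.errstate(divide="ignore"):  # log(0) = -inf marks a hopeless bandwidth
        return np.log(loo_density).sum()

rng = np.random.default_rng(1)
sample = rng.normal(size=300)              # toy data
candidates = np.logspace(-2, 0.5, 40)      # hypothetical search grid
scores = np.array([loo_log_likelihood(sample, h) for h in candidates])
best_h = candidates[np.argmax(scores)]     # bandwidth with the best held-out fit
```

Note that this only tunes one scale parameter for a kernel shape chosen in advance, whereas the appeal of the self-consistent method is that the kernel shape itself is inferred.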
Mixture models
Where the number of kernels does not grow as fast as the number of data points, this becomes a mixture model; or, if you'd like, kernel density estimates are a limiting case of mixture model estimates.
They are so clearly similar that I think it best we not make them both feel awkward by dithering about where the free parameters are. Anyway, they are filed separately. BaLi13, ZeMe97 and Geer96 discuss some useful things common to various convex combination estimators.
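To make the similarity concrete, here are the two estimators side by side. The notation is my own for illustration and is not taken from the cited papers: $K_h$ is a kernel with scale $h$, $x_1, \dots, x_n$ are the data, and $(w_j, \mu_j, h_j)$ are the mixture weights, locations and scales.

```latex
\hat f_{\mathrm{KDE}}(x) = \frac{1}{n}\sum_{i=1}^{n} K_h(x - x_i),
\qquad
\hat f_{\mathrm{mix}}(x) = \sum_{j=1}^{m} w_j\, K_{h_j}(x - \mu_j),
\quad w_j \ge 0,\ \sum_{j=1}^{m} w_j = 1 .
```

Setting $m = n$, $w_j = 1/n$, $\mu_j = x_j$ and $h_j = h$ in the mixture recovers the kernel density estimate. The difference is only in which parameters are free: the mixture fits $(w_j, \mu_j, h_j)$ with $m$ small relative to $n$, while the kernel estimate pins everything to the data except $h$.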
Does this work with uncertain point locations?
The fact that we can write the kernel density estimate as the convolution of a kernel with a sum of Dirac deltas immediately suggests that we could replace the deltas with something else, such as Gaussians. Can we recover well-behaved estimates in that case? This would be a kind of hierarchical model, possibly a typical Bayesian one.
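A minimal sketch of the Gaussian case, assuming each observation comes with a known location standard deviation and that the kernel is also Gaussian (`noisy_location_kde` and the toy numbers are hypothetical). Convolving two Gaussians adds their variances, so each Dirac delta is simply replaced by a wider Gaussian component. This shows only the mechanics, and says nothing about whether the resulting estimate is a good one.

```python
import numpy as np

def noisy_location_kde(centres, centre_sd, grid, h):
    """KDE in which observation i is a Gaussian N(centres[i], centre_sd[i]**2)
    instead of a Dirac delta, smoothed by a Gaussian kernel of bandwidth h."""
    centres = np.asarray(centres, dtype=float)
    centre_sd = np.asarray(centre_sd, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Gaussian convolved with Gaussian: variances add.
    total_sd = np.sqrt(h**2 + centre_sd**2)
    z = (grid[:, None] - centres[None, :]) / total_sd[None, :]
    comp = np.exp(-0.5 * z**2) / (total_sd[None, :] * np.sqrt(2.0 * np.pi))
    # Equal-weight average over the (now heteroscedastic) components.
    return comp.mean(axis=1)

centres = np.array([-1.0, 0.0, 2.5])      # toy observed locations
centre_sd = np.array([0.1, 0.5, 1.0])     # toy location uncertainties
grid = np.linspace(-10.0, 10.0, 2001)
estimate = noisy_location_kde(centres, centre_sd, grid, h=0.3)
```

With `centre_sd` set to zero this reduces to the ordinary Gaussian kernel density estimate.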
Does this work with asymmetric kernels?
Almost all the kernel estimators I've seen require the kernel to be symmetric, because of Cline's argument that asymmetric kernels are inadmissible in the class of all (possibly multivariate) densities. Presumably this refers to once-differentiable distributions. In particular, admissible kernels are those which have “nonnegative Fourier transforms bounded by 1”, which implies symmetry about the origin. If we have an a priori constrained class of densities, this might not apply.
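A small worked check of the Fourier condition, using the convention $\hat K(t) = \int K(x)\, e^{itx}\, dx$. The convention and the two example kernels are my own choices for illustration, not Cline's.

```latex
% Symmetric example: Gaussian kernel with bandwidth h
K_h(x) = \frac{1}{h\sqrt{2\pi}}\, e^{-x^2/(2h^2)}
\quad\Longrightarrow\quad
\hat K_h(t) = e^{-h^2 t^2/2} \in (0, 1].

% Asymmetric example: one-sided exponential kernel
K(x) = e^{-x}\,\mathbf{1}[x \ge 0]
\quad\Longrightarrow\quad
\hat K(t) = \frac{1}{1 - it} = \frac{1 + it}{1 + t^2}.
```

The Gaussian transform is real, nonnegative and bounded by 1, so it passes. The one-sided exponential has a nonzero imaginary part for every $t \neq 0$, so it cannot be a nonnegative real function. More generally, a real kernel has a real Fourier transform only if $K(x) = K(-x)$, which is where the symmetry requirement comes from.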
Fast Gauss Transform and Fast multipole methods
How can we make these methods computationally feasible at scale? See the Fast Gauss Transform (GrSt91, YDGD03) and other related fast multipole methods.
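To show why approximation helps, here is a sketch of a much cruder trick than the Fast Gauss Transform: bin the data onto a regular grid and do the convolution with an FFT. This is explicitly not the Fast Gauss Transform or a multipole method, just a simple stand-in with the same goal of avoiding the full points-times-evaluations sum. The helper name `binned_fft_kde`, the grid range and the bin count are assumptions.

```python
import numpy as np

def binned_fft_kde(data, h, lo, hi, n_bins=1024):
    """Approximate Gaussian KDE: histogram the data, then convolve via FFT.

    Points outside [lo, hi] are dropped by the histogram step.
    """
    data = np.asarray(data, dtype=float)
    counts, edges = np.histogram(data, bins=n_bins, range=(lo, hi))
    dx = edges[1] - edges[0]
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Gaussian kernel sampled at every possible bin-to-bin offset.
    offsets = np.arange(-(n_bins - 1), n_bins) * dx
    kernel = np.exp(-0.5 * (offsets / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    # Zero-padding to the full output length gives a linear,
    # not circular, convolution.
    size = counts.size + kernel.size - 1
    conv = np.fft.irfft(np.fft.rfft(counts, size) * np.fft.rfft(kernel, size), size)
    # Output index i + n_bins - 1 corresponds to bin centre i.
    density = conv[n_bins - 1 : 2 * n_bins - 1] / data.size
    return centres, density

rng = np.random.default_rng(2)
sample = rng.normal(size=2000)            # toy data
centres, density = binned_fft_kde(sample, h=0.3, lo=-6.0, hi=6.0)
```

The price is a binning error that shrinks as the bin width becomes small relative to the bandwidth, and the approach scales poorly with dimension, which is where the tree-based and expansion-based methods in the references become relevant.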
Refs
 OKCC16: (2016) A fast and objective multidimensional kernel density estimation method: fastKDE. Computational Statistics & Data Analysis, 101, 148–160. DOI
 BaHu86: (1986) A hierarchical O(N log N) force-calculation algorithm. Nature, 324(6096), 446–449. DOI
 Digg85: (1985) A Kernel Method for Smoothing Point Process Data. Journal of the Royal Statistical Society. Series C (Applied Statistics), 34(2), 138–147. DOI
 PJJK03: (2003) Adaptive variable location kernel density estimators with good performance at boundaries. Journal of Nonparametric Statistics, 15(1), 61–75. DOI
 WeWu15: (2015) An Improved Transformation-Based Kernel Estimator of Densities on the Unit Interval. Journal of the American Statistical Association, 110(510), 773–783. DOI
 BaHy01: (2001) Bandwidth selection for kernel conditional density estimation. Computational Statistics & Data Analysis, 36(3), 279–298. DOI
 WaWa07: (2007) Bandwidth selection for weighted kernel density estimation. ArXiv Preprint ArXiv:0709.1616.
 LiCM12: (2012) Blind Image Deblurring by Spectral Properties of Convolution Operators. ArXiv:1209.2082 [Cs].
 MaHa10: (2010) Boundary kernels for adaptive density estimators on regions with irregular boundaries. Journal of Multivariate Analysis, 101(4), 949–963. DOI
 ZhKa10: (2010) Boundary performance of the beta kernel estimators. Journal of Nonparametric Statistics, 22(1), 81–104. DOI
 HeWZ03: (2003) Consistent estimation of the intensity function of a cyclic Poisson process. Journal of Multivariate Analysis, 84(1), 19–39. DOI
 KoMi06: (2006) Density estimation by total variation regularization. Advances in Statistical Modeling and Inference, 613–634.
 Elli91: (1991) Density estimation for point processes. Stochastic Processes and Their Applications, 39(2), 345–358. DOI
 ZeMe97: (1997) Density Estimation Through Convex Combinations of Densities: Approximation and Estimation Bounds. Neural Networks: The Official Journal of the International Neural Network Society, 10(1), 99–109. DOI
 SmLe05: (2005) Efficient Coding of Time-Relative Structure Using Spikes. Neural Computation, 17(1), 19–45. DOI
 YaDD04: (2004) Efficient kernel machines using the improved fast Gauss transform. In Advances in neural information processing systems (pp. 1561–1568).
 BeDi89: (1989) Estimating Weighted Integrals of the Second-Order Intensity of a Spatial Point Process. Journal of the Royal Statistical Society. Series B (Methodological), 51(1), 81–92.
 Rath96: (1996) Estimation of Poisson intensity using partially observed concomitant variables. Biometrics, 226–242.
 YDGD03: (2003) Improved Fast Gauss Transform and Efficient Kernel Density Estimation. In Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2 (pp. 464–). Washington, DC, USA: IEEE Computer Society. DOI
 Cuca08: (2008) Intensity Estimation for Spatial Point Processes Observed with Noise. Scandinavian Journal of Statistics, 35(2), 322–334. DOI
 BoGK10: (2010) Kernel density estimation via diffusion. The Annals of Statistics, 38(5), 2916–2957. DOI
 DoHa15: (2015) Making a nonparametric density estimator more attractive, and more accurate, by data perturbation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(2), 445–462. DOI
 BaTu06: (2006) Modelling Spatial Point Patterns in R. In Case Studies in Spatial Point Process Modeling (pp. 23–74). Springer New York
 HaPa02: (2002) New Methods for Bias Correction at Endpoints and Boundaries. The Annals of Statistics, 30(5), 1460–1479. DOI
 PaKo12: (2012) Nonparametric Construction of Multivariate Kernels. Journal of the American Statistical Association, 107(499), 1085–1095. DOI
 Aale78: (1978) Nonparametric Inference for a Family of Counting Processes. The Annals of Statistics, 6(4), 701–726. DOI
 MaSc14: (2014) Nonparametric kernel density estimation near the boundary. Computational Statistics & Data Analysis, 72, 57–76. DOI
 Lies11: (2011) On Estimation of the Intensity Function of a Point Process. Methodology and Computing in Applied Probability, 14(3), 567–578. DOI
 Hall87: (1987) On Kullback-Leibler Loss and Density Estimation. The Annals of Statistics, 15(4), 1491–1519. DOI
 Digg79: (1979) On Parameter Estimation and Goodness-of-Fit Testing for Spatial Point Patterns. Biometrics, 35(1), 87–101. DOI
 Silv82: (1982) On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method. The Annals of Statistics, 10(3), 795–810. DOI
 CrMí14: (2014) Particle-kernel estimation of the filter density in state-space models. Bernoulli, 20(4), 1879–1929. DOI
 AdSc09: (2009) Point process diagnostics based on weighted second-order statistics and their asymptotic properties. Annals of the Institute of Statistical Mathematics, 61(4), 929–948. DOI
 Geen14: (2014) Probit Transformation for Kernel Density Estimation on the Unit Interval. Journal of the American Statistical Association, 109(505), 346–358. DOI
 Geer96: (1996) Rates of convergence for the maximum likelihood estimator in mixture models. Journal of Nonparametric Statistics, 6(4), 293–310. DOI
 BTMH05: (2005) Residual analysis for spatial point processes (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(5), 617–666. DOI
 BePi11: (2011) Self-consistent method for density estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(3), 407–422. DOI
 DíJM12: (2012) Similarity measures of conditional intensity functions to test separability in multidimensional point processes. Stochastic Environmental Research and Risk Assessment, 27(5), 1193–1205. DOI
 BaLi13: (2013) Smooth projected density estimation. ArXiv:1308.3968 [Stat].
 Stei05: (2005) Space-time covariance functions. Journal of the American Statistical Association, 100(469), 310–321. DOI
 GrSt91: (1991) The Fast Gauss Transform. SIAM Journal on Scientific and Statistical Computing, 12(1), 79–94. DOI
 RaDu05: (2005) The improved fast Gauss transform with applications to machine learning
 Gisb03: (2003) Weighted samples, kernel density estimators and convergence. Empirical Economics, 28(2), 335–351. DOI