The Living Thing / Notebooks : M-estimation

Estimating a quantity by choosing it to be the extremum of a function, or, if it’s well-behaved enough, a zero of its derivative.

Very popular with machine learning, where loss-function based methods are ubiquitous. In statistics we see this implicitly in maximum likelihood estimation and robust estimation, and least squares loss, for which M-estimation provides a unifying formalism based on asymptotic theory.

TODO: Discuss large sample theory influence function motivation.

Robust Loss functions

TBD.

Huber loss

Hampel loss

Fitting

Discuss representation (and implementation) in terms of weight functions for least-squares loss.

GM-estimators

Mallows, Schweppe etc.

TBD.

Refs

Barn83
Barndorff-Nielsen, O. (1983) On a formula for the distribution of the maximum likelihood estimator. Biometrika, 70(2), 343–365. DOI.
Bühl14
Bühlmann, P. (2014) Robust Statistics. In J. Fan, Y. Ritov, & C. F. J. Wu (Eds.), Selected Works of Peter J. Bickel (pp. 51–98). Springer New York
DoMo13
Donoho, D., & Montanari, A. (2013) High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing. arXiv:1310.7320 [Cs, Math, Stat].
Hamp74
Hampel, F. R.(1974) The Influence Curve and its Role in Robust Estimation. Journal of the American Statistical Association, 69(346), 383–393. DOI.
Hube64
Huber, P. J.(1964) Robust Estimation of a Location Parameter. The Annals of Mathematical Statistics, 35(1), 73–101. DOI.
MoPe10
Mondal, D., & Percival, D. B.(2010) M-estimation of wavelet variance. Annals of the Institute of Statistical Mathematics, 64(1), 27–53. DOI.
Ronc00
Ronchetti, E. (2000) Robust Regression Methods and Model Selection. In A. Bab-Hadiashar & D. Suter (Eds.), Data Segmentation and Model Selection for Computer Vision (pp. 31–40). Springer New York
ThCl13
Tharmaratnam, K., & Claeskens, G. (2013) A comparison of robust versions of the AIC based on M-, S- and MM-estimators. Statistics, 47(1), 216–235. DOI.
Geer14
van de Geer, S. (2014) Worst possible sub-directions in high-dimensional models. In arXiv:1403.7023 [math, stat] (Vol. 131).