Estimating a quantity by choosing it to be an extremum of some function of the data or, if that function is well-behaved enough, a zero of its derivative.

Very popular in machine learning, where loss-function-based methods are ubiquitous. In statistics we see this implicitly in maximum likelihood estimation, robust estimation, and least-squares fitting, for which M-estimation provides a unifying formalism based on asymptotic theory.
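
As a minimal sketch of that formalism, in notation introduced here for illustration (the symbols $\rho$, $\psi$, $\theta$ and $x_i$ do not appear elsewhere in these notes), an M-estimator from observations $x_1, \dots, x_n$ can be written as a minimiser of a summed loss, or as a root of the corresponding estimating equation when $\rho$ is differentiable in $\theta$:

$$
\hat{\theta}_n = \operatorname*{arg\,min}_{\theta} \sum_{i=1}^{n} \rho(x_i, \theta)
\qquad\text{or}\qquad
\sum_{i=1}^{n} \psi(x_i, \hat{\theta}_n) = 0,
\quad
\psi(x, \theta) = \frac{\partial}{\partial \theta} \rho(x, \theta).
$$

Taking $\rho(x, \theta) = -\log f(x; \theta)$ for a density $f$ recovers maximum likelihood, and $\rho(x, \theta) = (x - \theta)^2$ recovers the least-squares estimate of a location, which is the sample mean.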

TODO: Discuss large-sample theory and the influence-function motivation.

## Robust loss functions

TBD.

### Huber loss
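
As a placeholder until this section is written, here is a minimal sketch of the standard Huber loss and its derivative, assuming NumPy. The function names `huber_rho` and `huber_psi` are hypothetical, and the default threshold `delta=1.345` is a conventional tuning constant for residuals of unit scale, assumed here rather than taken from these notes.

```python
import numpy as np


def huber_rho(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond that."""
    r = np.asarray(r, dtype=float)
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))


def huber_psi(r, delta=1.345):
    """Derivative of huber_rho: the residual clipped to [-delta, delta]."""
    return np.clip(np.asarray(r, dtype=float), -delta, delta)
```

The two pieces of `huber_rho` agree in value and slope at `|r| = delta`, so the loss is continuously differentiable while large residuals only contribute linearly.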

### Hampel loss
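
As a placeholder until this section is written, here is a minimal sketch of Hampel's three-part redescending $\psi$ function, assuming NumPy. The name `hampel_psi` is hypothetical and the breakpoints `a`, `b`, `c` are illustrative defaults, not values taken from these notes.

```python
import numpy as np


def hampel_psi(r, a=2.0, b=4.0, c=8.0):
    """Three-part redescending psi: linear, flat, descending, then zero."""
    r = np.asarray(r, dtype=float)
    x = np.abs(r)
    magnitude = np.select(
        [x <= a, x <= b, x <= c],  # the first matching condition wins
        [x, a * np.ones_like(x), a * (c - x) / (c - b)],
        default=0.0,  # residuals beyond c have no influence at all
    )
    return np.sign(r) * magnitude
```

Unlike the Huber case, the influence of a residual falls back to zero beyond `c`, so gross outliers are ignored entirely; the price is that the corresponding loss is no longer convex.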

## Fitting

Discuss representation (and implementation) in terms of weight functions for least-squares loss.
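
One way to make the weight-function representation concrete is iteratively reweighted least squares (IRLS), where each residual gets weight $w(r) = \psi(r)/r$ and a weighted least-squares problem is solved repeatedly. The sketch below assumes NumPy, a Huber $\psi$, and a known residual scale of 1 (a simplification; in practice the scale is usually estimated robustly). The names `huber_weight` and `irls` are hypothetical.

```python
import numpy as np


def huber_weight(r, delta=1.345):
    """Weight w(r) = psi(r) / r for the Huber loss; equals 1 when |r| <= delta."""
    a = np.abs(np.asarray(r, dtype=float))
    # np.maximum guards the division, since np.where evaluates both branches.
    return np.where(a <= delta, 1.0, delta / np.maximum(a, 1e-12))


def irls(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Huber regression by IRLS, assuming unit residual scale."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least-squares start
    for _ in range(n_iter):
        w = huber_weight(y - X @ beta, delta)
        sw = np.sqrt(w)
        # Weighted least squares: minimise sum_i w_i * (y_i - x_i @ beta)**2.
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

For example, with `X` holding a column of ones and a covariate, and `y` a straight line with one grossly corrupted response, `irls(X, y)` should stay close to the line while the ordinary least-squares fit is pulled towards the outlier.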

## GM-estimators

Mallows, Schweppe etc.

TBD.

## Refs

- Barn83: Barndorff-Nielsen, O. (1983) On a formula for the distribution of the maximum likelihood estimator. *Biometrika*, 70(2), 343–365.
- Bühl14: Bühlmann, P. (2014) Robust Statistics. In J. Fan, Y. Ritov, & C. F. J. Wu (Eds.), Selected Works of Peter J. Bickel (pp. 51–98). Springer New York.
- DoMo13: Donoho, D., & Montanari, A. (2013) High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing. *arXiv:1310.7320 [Cs, Math, Stat]*.
- Hamp74: Hampel, F. R. (1974) The Influence Curve and its Role in Robust Estimation. *Journal of the American Statistical Association*, 69(346), 383–393.
- Hube64: Huber, P. J. (1964) Robust Estimation of a Location Parameter. *The Annals of Mathematical Statistics*, 35(1), 73–101.
- MoPe10: Mondal, D., & Percival, D. B. (2010) M-estimation of wavelet variance. *Annals of the Institute of Statistical Mathematics*, 64(1), 27–53.
- Ronc00: Ronchetti, E. (2000) Robust Regression Methods and Model Selection. In A. Bab-Hadiashar & D. Suter (Eds.), Data Segmentation and Model Selection for Computer Vision (pp. 31–40). Springer New York.
- ThCl13: Tharmaratnam, K., & Claeskens, G. (2013) A comparison of robust versions of the AIC based on M-, S- and MM-estimators. *Statistics*, 47(1), 216–235.
- Geer14: van de Geer, S. (2014) Worst possible sub-directions in high-dimensional models. In arXiv:1403.7023 [math, stat] (Vol. 131).