# Robust statistics

Usefulness: 🔧
Novelty: 💡
Uncertainty: 🤪 🤪 🤪
Incompleteness: 🚧 🚧 🚧

Techniques to mitigate the failure modes of your estimators. Surprisingly rarely used, despite being fairly straightforward.

This is more-or-less a frequentist project.

Bayesians seem to claim to achieve robustness largely by choosing heavy-tailed priors where they might have chosen light-tailed ones, e.g. Laplacian priors instead of Gaussian ones. Such priors have arbitrary parameters, but no more arbitrary than is usual in Bayesian statistics, and therefore they do not attract so much need to rationalise away the guilt.

## TODO

• Relation to penalized regression.

• Connection with the Lasso.

• Beran's Hellinger-ball contamination model, which I also don't yet understand.

• Breakdown point explanation.

• GLM connection.

## Corruption models

• (Adversarial) total variation $\epsilon$-corruption.

• Random (mixture) corruption

• other?
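As a toy illustration (names and numbers are mine), the random-mixture model is Huber's $\epsilon$-contamination: each observation comes from the clean distribution with probability $1-\epsilon$ and from an arbitrary corrupting distribution otherwise. The adversarial total-variation model is stronger, letting an adversary replace any $\epsilon$-fraction of the sample outright.

```python
import numpy as np

def eps_contaminate(n, eps=0.1, seed=0):
    # Huber's mixture model: draw from the clean N(0, 1) with
    # probability 1 - eps, from a corrupting N(50, 1) otherwise.
    rng = np.random.default_rng(seed)
    clean = rng.normal(0.0, 1.0, n)
    bad = rng.normal(50.0, 1.0, n)
    mask = rng.random(n) < eps
    return np.where(mask, bad, clean), mask

x, mask = eps_contaminate(100_000)
```

The `mask` output marks which draws were corrupted, which is handy for checking how well an estimator recovers the clean component.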

## M-estimation with robust loss

The one that I, at least, would think of when considering robust estimation.

In M-estimation, instead of hunting for a maximum of the likelihood function, as in maximum likelihood, or a minimum of the sum of squared residuals, as in least-squares estimation, you minimise a specifically chosen loss function of the residuals. You may select an objective function more robust to deviations between your model and reality. Credited to Huber (Hube64).

See M-estimation for the details.

Aside: AFAICT, the definition of M-estimation includes the possibility that you could in principle select a loss function *less* robust than least squares or negative log likelihood, but I have not seen this in the literature. Generally, some robustified approach is presumed.
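A minimal sketch of the idea (mine, not from any particular library): Huber-loss regression fitted by iteratively reweighted least squares. The tuning constant 1.345 is the conventional choice giving roughly 95% efficiency at the Gaussian, and it assumes unit residual scale; in practice you would first rescale by a robust scale estimate.

```python
import numpy as np

def huber_loss(r, delta=1.345):
    # rho(r): quadratic for small residuals, linear for large ones.
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def huber_fit(X, y, delta=1.345, n_iter=50):
    # Minimise sum(huber_loss(y - X @ beta)) by iteratively
    # reweighted least squares with weights w(r) = psi(r) / r.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        a = np.maximum(np.abs(r), 1e-12)          # avoid divide-by-zero
        w = np.where(a <= delta, 1.0, delta / a)  # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=100)
y[:5] += 50.0                                 # five gross outliers
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # dragged off target
b_hub = huber_fit(X, y)                       # stays near (1, 2)
```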

For M-estimation as robust estimation, various complications ensue, such as the difference between noise in your predictors and noise in your responses, whether the "true" model is included in your class, and which of these difficulties you have resolved or not.

Loosely speaking, no, you haven't solved the problem of noise in your predictors, only the problem of noise in your responses.

And the cost is that you now have a loss function with some extra arbitrary parameters which you have to justify, which is anathema to frequentists, who like to claim to be less arbitrary than Bayesians. There are various procedures to choose these parameters, based on robust scale estimation.
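For instance, the workhorse robust scale estimate plugged into such procedures is the median absolute deviation, rescaled by 1.4826 so that it consistently estimates the standard deviation under Gaussian noise. A minimal sketch:

```python
import numpy as np

def mad_scale(x):
    # Median absolute deviation from the median, rescaled to be a
    # consistent estimator of the standard deviation at the Gaussian.
    x = np.asarray(x)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

rng = np.random.default_rng(1)
z = rng.normal(0.0, 2.0, size=10_000)
z[:500] += 100.0            # 5% gross outliers
sd = float(np.std(z))       # blown up by the outliers
robust_sd = float(mad_scale(z))  # still near the true scale, 2
```

Dividing residuals by such a scale estimate is what makes a fixed tuning constant like Huber's 1.345 meaningful across datasets.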

## MM-estimation

🚧 Don't know.

## Median-based estimators

Rousseeuw and Yohai's school. (RoYo84)

Many permutations on the theme here, but it rapidly gets complicated. The only members of these families I have looked into are the near-trivial cases of least median of squares and least trimmed squares estimation. (Rous84)

More broadly we should also consider S-estimators, which do something with… robust estimation of scale, which is then used for robust estimation of location? 🚧

Theil-Sen-(Oja) estimators: something about medians of inferred regression slopes. 🚧
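As far as I can tell, the simple (non-Oja) Theil-Sen estimator is just this: take the median of the slopes over all pairs of data points, then the median residual as the intercept. A sketch, assuming a scalar predictor:

```python
import numpy as np
from itertools import combinations

def theil_sen(x, y):
    # Slope: median over all pairwise slopes.
    # Intercept: median of the residuals given that slope.
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[i] != x[j]]
    m = float(np.median(slopes))
    b = float(np.median(y - m * x))
    return m, b

x = np.arange(30.0)
y = 3.0 * x + 2.0
y[:5] = -100.0            # corrupt ~17% of the responses
m, b = theil_sen(x, y)    # still recovers slope 3, intercept 2
```

The all-pairs loop is O(n²), which is why practical implementations subsample pairs for large n; its breakdown point is about 29%.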

The Tukey median, and why no one uses it, what with computing it being NP-hard.

## Others

RANSAC – a randomised estimator which repeatedly fits the model to small random subsets of the data and keeps the fit with the largest consensus set of inliers. 🚧
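A sketch of the RANSAC idea for a line fit (my toy version, with a hand-picked inlier tolerance `tol`, not a production implementation): sample minimal subsets of two points, score each candidate line by how many points it explains, then refit by least squares on the winning consensus set.

```python
import numpy as np

def ransac_line(x, y, n_iter=200, tol=0.5, seed=0):
    # Repeatedly fit a line through 2 random points, count how many
    # points fall within `tol` of it, keep the largest consensus set,
    # then refit by least squares on that set alone.
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue
        m = (y[j] - y[i]) / (x[j] - x[i])
        c = y[i] - m * x[i]
        inliers = np.abs(y - (m * x + c)) < tol
        if inliers.sum() > best.sum():
            best = inliers
    A = np.column_stack([np.ones(best.sum()), x[best]])
    intercept, slope = np.linalg.lstsq(A, y[best], rcond=None)[0]
    return intercept, slope

rng = np.random.default_rng(2)
x = rng.uniform(-5, 5, 100)
y = 2.0 * x - 1.0 + 0.1 * rng.normal(size=100)
y[:30] = rng.uniform(-50, 50, 30)   # 30% arbitrary outliers
c0, c1 = ransac_line(x, y)          # (intercept, slope)
```

Because the minimal subsets only need to be outlier-free occasionally, RANSAC can survive very high corruption fractions, at the price of the tolerance and iteration-count knobs.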

## Refs

Barndorff-Nielsen, O. 1983. "On a Formula for the Distribution of the Maximum Likelihood Estimator." Biometrika 70 (2): 343–65. https://doi.org/10.1093/biomet/70.2.343.

Beran, Rudolf. 1981. "Efficient Robust Estimates in Parametric Models." Zeitschrift Für Wahrscheinlichkeitstheorie Und Verwandte Gebiete 55 (1): 91–108. https://doi.org/10.1007/BF01013463.

———. 1982. "Robust Estimation in Models for Independent Non-Identically Distributed Data." The Annals of Statistics 10 (2): 415–28. https://doi.org/10.1214/aos/1176345783.

Bickel, P. J. 1975. "One-Step Huber Estimates in the Linear Model." Journal of the American Statistical Association 70 (350): 428–34. https://doi.org/10.1080/01621459.1975.10479884.

Bondell, Howard D., Arun Krishna, and Sujit K. Ghosh. 2010. "Joint Variable Selection for Fixed and Random Effects in Linear Mixed-Effects Models." Biometrics 66 (4): 1069–77. https://doi.org/10.1111/j.1541-0420.2010.01391.x.

Burman, P., and D. Nolan. 1995. "A General Akaike-Type Criterion for Model Selection in Robust Regression." Biometrika 82 (4): 877–86. https://doi.org/10.1093/biomet/82.4.877.

Bühlmann, Peter. 2014. "Robust Statistics." In Selected Works of Peter J. Bickel, edited by Jianqing Fan, Ya'acov Ritov, and C. F. Jeff Wu, 51–98. Selected Works in Probability and Statistics 13. Springer New York. http://link.springer.com/chapter/10.1007/978-1-4614-5544-8_2.

Cantoni, Eva, and Elvezio Ronchetti. 2001. "Robust Inference for Generalized Linear Models." Journal of the American Statistical Association 96 (455): 1022–30. https://doi.org/10.1198/016214501753209004.

Charikar, Moses, Jacob Steinhardt, and Gregory Valiant. 2016. "Learning from Untrusted Data," November. http://arxiv.org/abs/1611.02315.

Cox, D. R. 1983. "Some Remarks on Overdispersion." Biometrika 70 (1): 269–74. https://doi.org/10.1093/biomet/70.1.269.

Czellar, Veronika, and Elvezio Ronchetti. 2010. "Accurate and Robust Tests for Indirect Inference." Biometrika 97 (3): 621–30. https://doi.org/10.1093/biomet/asq040.

Diakonikolas, Ilias, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2016. "Robust Estimators in High Dimensions Without the Computational Intractability," April. http://arxiv.org/abs/1604.06443.

Diakonikolas, Ilias, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2017. "Being Robust (in High Dimensions) Can Be Practical," March. http://arxiv.org/abs/1703.00893.

Donoho, David L., and Peter J. Huber. 1983. "The Notion of Breakdown Point." In A Festschrift for Erich L. Lehmann, 157–84. https://books.google.ch/books?hl=en&lr=&id=H8QdaAPW3c8C&oi=fnd&pg=PA157&dq=donoho+huber+1983+breakdown+point&ots=I38CG8Bt_-&sig=BKoPX6T8T3r_qwPzmCjWY96PsKI.

Donoho, David L., and Richard C. Liu. 1988. "The 'Automatic' Robustness of Minimum Distance Functionals." The Annals of Statistics 16 (2): 552–86. https://doi.org/10.1214/aos/1176350820.

Donoho, David L., and Andrea Montanari. 2013. "High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing," October. http://arxiv.org/abs/1310.7320.

Duchi, John, Peter Glynn, and Hongseok Namkoong. 2016. "Statistics of Robust Optimization: A Generalized Empirical Likelihood Approach," October. http://arxiv.org/abs/1610.03425.

Genton, Marc G, and Elvezio Ronchetti. 2003. "Robust Indirect Inference." Journal of the American Statistical Association 98 (461): 67–76. https://doi.org/10.1198/016214503388619102.

Ghosh, Abhik, and Ayanendranath Basu. 2016. "General Model Adequacy Tests and Robust Statistical Inference Based on A New Family of Divergences," November. http://arxiv.org/abs/1611.05224.

Golubev, Grigori K., and Michael Nussbaum. 1990. "A Risk Bound in Sobolev Class Regression." The Annals of Statistics 18 (2): 758–78. https://doi.org/10.1214/aos/1176347624.

Hampel, Frank R. 1974. "The Influence Curve and Its Role in Robust Estimation." Journal of the American Statistical Association 69 (346): 383–93. https://doi.org/10.1080/01621459.1974.10482962.

Hampel, Frank R., Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. Stahel. 2011. Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons. http://books.google.com?id=XK3uhrVefXQC.

Huber, Peter J. 1964. "Robust Estimation of a Location Parameter." The Annals of Mathematical Statistics 35 (1): 73–101. https://doi.org/10.1214/aoms/1177703732.

———. 2009. Robust Statistics. 2nd ed. Wiley Series in Probability and Statistics. Hoboken, N.J: Wiley.

Janková, Jana, and Sara van de Geer. 2016. "Confidence Regions for High-Dimensional Generalized Linear Models Under Sparsity," October. http://arxiv.org/abs/1610.01353.

Konishi, Sadanori, and G. Kitagawa. 2008. Information Criteria and Statistical Modeling. Springer Series in Statistics. New York: Springer.

Konishi, Sadanori, and Genshiro Kitagawa. 1996. "Generalised Information Criteria in Model Selection." Biometrika 83 (4): 875–90. https://doi.org/10.1093/biomet/83.4.875.

———. 2003. "Asymptotic Theory for Information Criteria in Model Selection—Functional Approach." Journal of Statistical Planning and Inference, C.R. Rao 80th Birthday Felicitation vol., Part IV, 114 (1–2): 45–61. https://doi.org/10.1016/S0378-3758(02)00462-7.

Krzakala, Florent, Cristopher Moore, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborová, and Pan Zhang. 2013. "Spectral Redemption in Clustering Sparse Networks." Proceedings of the National Academy of Sciences 110 (52): 20935–40. https://doi.org/10.1073/pnas.1312486110.

Li, Jerry. 2017. "Robust Sparse Estimation Tasks in High Dimensions," February. http://arxiv.org/abs/1702.05860.

Lu, W., Y. Goldberg, and J. P. Fine. 2012. "On the Robustness of the Adaptive Lasso to Model Misspecification." Biometrika 99 (3): 717–31. https://doi.org/10.1093/biomet/ass027.

Machado, José A.F. 1993. "Robust Model Selection and M-Estimation." Econometric Theory 9 (03): 478–93. https://doi.org/10.1017/S0266466600007775.

Manton, J. H., V. Krishnamurthy, and H. V. Poor. 1998. "James-Stein State Filtering Algorithms." IEEE Transactions on Signal Processing 46 (9): 2431–47. https://doi.org/10.1109/78.709532.

Markatou, M., and E. Ronchetti. 1997. "Robust Inference: The Approach Based on Influence Functions." In Handbook of Statistics, vol. 15 (Robust Inference), 49–75. Elsevier. https://doi.org/10.1016/S0169-7161(97)15005-2.

Maronna, Ricardo A., Douglas Martin, and Víctor J. Yohai. 2006. Robust Statistics: Theory and Methods. Reprinted with corr. Wiley Series in Probability and Statistics. Chichester: Wiley.

Maronna, Ricardo Antonio. 1976. "Robust M-Estimators of Multivariate Location and Scatter." The Annals of Statistics 4 (1): 51–67. http://ssg.mit.edu/group/ajkim/area_exam/papers/Maronna_1976.pdf.gz.

Maronna, Ricardo A., and Víctor J. Yohai. 1995. "The Behavior of the Stahel-Donoho Robust Multivariate Estimator." Journal of the American Statistical Association 90 (429): 330–41. https://doi.org/10.1080/01621459.1995.10476517.

———. 2014. "Robust Estimation of Multivariate Location and Scatter." In Wiley StatsRef: Statistics Reference Online. John Wiley & Sons, Ltd. http://onlinelibrary.wiley.com/doi/10.1002/9781118445112.stat01520.pub2/abstract.

Maronna, Ricardo A., and Ruben H. Zamar. 2002. "Robust Estimates of Location and Dispersion for High-Dimensional Datasets." Technometrics 44 (4): 307–17. http://amstat.tandfonline.com/doi/abs/10.1198/004017002188618509.

Massart, Desire L., Leonard Kaufman, Peter J. Rousseeuw, and Annick Leroy. 1986. "Least Median of Squares: A Robust Method for Outlier and Model Error Detection in Regression and Calibration." Analytica Chimica Acta 187 (January): 171–79. https://doi.org/10.1016/S0003-2670(00)82910-4.

Mossel, Elchanan, Joe Neeman, and Allan Sly. 2013. "A Proof of the Block Model Threshold Conjecture," November. http://arxiv.org/abs/1311.4115.

———. 2016. "Belief Propagation, Robust Reconstruction and Optimal Recovery of Block Models." The Annals of Applied Probability 26 (4): 2211–56. https://doi.org/10.1214/15-AAP1145.

Oja, Hannu. 1983. "Descriptive Statistics for Multivariate Distributions." Statistics & Probability Letters 1 (6): 327–32. https://doi.org/10.1016/0167-7152(83)90054-8.

Qian, Guoqi, and Hans R. Künsch. 1996. "Some Notes on Rissanen's Stochastic Complexity." http://www.researchgate.net/profile/Guoqi_Qian/publication/3079465_Some_notes_on_Rissanen's_stochastic_complexity/links/00463520f29b72ddd3000000.pdf.

Qian, Guoqi, and Hans R. Künsch. 1998. "On Model Selection via Stochastic Complexity in Robust Linear Regression." Journal of Statistical Planning and Inference 75 (1): 91–116. https://doi.org/10.1016/S0378-3758(98)00138-4.

Ronchetti, E. 2000. "Robust Regression Methods and Model Selection." In Data Segmentation and Model Selection for Computer Vision, edited by Alireza Bab-Hadiashar and David Suter, 31–40. Springer New York. https://doi.org/10.1007/978-0-387-21528-0_2.

Ronchetti, Elvezio. 1985. "Robust Model Selection in Regression." Statistics & Probability Letters 3 (1): 21–23. https://doi.org/10.1016/0167-7152(85)90006-9.

———. 1997. "Robust Inference by Influence Functions." Journal of Statistical Planning and Inference, Robust Statistics and Data Analysis, Part I, 57 (1): 59–72. https://doi.org/10.1016/S0378-3758(96)00036-5.

Ronchetti, Elvezio, and Fabio Trojani. 2001. "Robust Inference with GMM Estimators." Journal of Econometrics 101 (1): 37–69. https://doi.org/10.1016/S0304-4076(00)00073-7.

Rousseeuw, Peter J. 1984. "Least Median of Squares Regression." Journal of the American Statistical Association 79 (388): 871–80. https://doi.org/10.1080/01621459.1984.10477105.

Rousseeuw, Peter J., and Annick M. Leroy. 1987. Robust Regression and Outlier Detection. Wiley Series in Probability and Mathematical Statistics. New York: Wiley.

Rousseeuw, P., and V. Yohai. 1984. "Robust Regression by Means of S-Estimators." In Robust and Nonlinear Time Series Analysis, edited by Jürgen Franke, Wolfgang Härdle, and Douglas Martin, 256–72. Lecture Notes in Statistics 26. Springer US. https://doi.org/10.1007/978-1-4615-7821-5_15.

Royall, Richard M. 1986. "Model Robust Confidence Intervals Using Maximum Likelihood Estimators." International Statistical Review / Revue Internationale de Statistique 54 (2): 221–26. https://doi.org/10.2307/1403146.

Stigler, Stephen M. 2010. "The Changing History of Robustness." The American Statistician 64 (4): 277–81. https://doi.org/10.1198/tast.2010.10159.

Tharmaratnam, Kukatharmini, and Gerda Claeskens. 2013. "A Comparison of Robust Versions of the AIC Based on M-, S- and MM-Estimators." Statistics 47 (1): 216–35. https://doi.org/10.1080/02331888.2011.568120.

Theil, Henri. 1992. "A Rank-Invariant Method of Linear and Polynomial Regression Analysis." In Henri Theil's Contributions to Economics and Econometrics, edited by Baldev Raj and Johan Koerts, 345–81. Advanced Studies in Theoretical and Applied Econometrics 23. Springer Netherlands. https://doi.org/10.1007/978-94-011-2546-8_20.

Tsou, Tsung-Shan. 2006. "Robust Poisson Regression." Journal of Statistical Planning and Inference 136 (9): 3173–86. https://doi.org/10.1016/j.jspi.2004.12.008.

Wedderburn, R. W. M. 1974. "Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss–Newton Method." Biometrika 61 (3): 439–47. https://doi.org/10.1093/biomet/61.3.439.

Xu, H., C. Caramanis, and S. Mannor. 2010. "Robust Regression and Lasso." IEEE Transactions on Information Theory 56 (7): 3561–74. https://doi.org/10.1109/TIT.2010.2048503.

Yang, Wenzhuo, and Huan Xu. 2013. "A Unified Robust Regression Model for Lasso-Like Algorithms." In ICML (3), 585–93. http://www.jmlr.org/proceedings/papers/v28/yang13e.pdf.