Model interpretation

The meeting point of differential privacy, accountability, interpretability, the tank detection story, and clever horses (the Clever Hans effect) in machine learning. Closely related: are the models what you would call fair?

There is much work in this area; I understand little of it at the moment, but I keep needing to refer to the papers listed here.


Aggarwal, C. C., & Yu, P. S. (2008) A General Survey of Privacy-Preserving Data Mining Models and Algorithms. In C. C. Aggarwal & P. S. Yu (Eds.), Privacy-Preserving Data Mining (pp. 11–52). Springer US.
Alain, G., & Bengio, Y. (2016) Understanding intermediate layers using linear classifier probes. ArXiv:1610.01644 [Cs, Stat].
Barocas, S., & Selbst, A. D. (2016) Big Data’s Disparate Impact (SSRN Scholarly Paper No. ID 2477899). Rochester, NY: Social Science Research Network.
Burrell, J. (2016) How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 2053951715622512.
Chipman, H. A., & Gu, H. (2005) Interpretable dimension reduction. Journal of Applied Statistics, 32(9), 969–987.
Choi, K., Fazekas, G., & Sandler, M. (2016) Explaining Deep Convolutional Neural Networks on Music Classification. ArXiv:1607.02444 [Cs].
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012) Fairness Through Awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (pp. 214–226). New York, NY, USA: ACM.
Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015) Certifying and Removing Disparate Impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 259–268). New York, NY, USA: ACM.
Hardt, M., Price, E., & Srebro, N. (2016) Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems (pp. 3315–3323).
Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., & Schölkopf, B. (2017) Avoiding Discrimination through Causal Reasoning. ArXiv:1706.02744 [Cs, Stat].
Lash, M. T., Lin, Q., Street, W. N., Robinson, J. G., & Ohlmann, J. (2016) Generalized Inverse Classification. ArXiv:1610.01675 [Cs, Stat].
Lipton, Z. C. (2016) The Mythos of Model Interpretability. ArXiv:1606.03490 [Cs, Stat].
Moosavi-Dezfooli, S.-M., Fawzi, A., Fawzi, O., & Frossard, P. (2016) Universal adversarial perturbations. ArXiv:1610.08401 [Cs, Stat].
Nguyen, A., Yosinski, J., & Clune, J. (2016) Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks. ArXiv:1602.03616.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016) “Why Should I Trust You?”: Explaining the Predictions of Any Classifier (pp. 1135–1144). ACM Press.
Sweeney, L. (2013) Discrimination in Online Ad Delivery. Queue, 11(3), 10:10–10:29.
Wisdom, S., Powers, T., Pitton, J., & Atlas, L. (2016) Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery. In Advances in Neural Information Processing Systems 29.
Wu, X., & Zhang, X. (2016) Automated Inference on Criminality using Face Images. ArXiv:1611.04135 [Cs].
Zemel, R., Wu, Y., Swersky, K., Pitassi, T., & Dwork, C. (2013) Learning Fair Representations. In Proceedings of the 30th International Conference on Machine Learning (ICML-13) (pp. 325–333).