# Differential privacy

Usefulness: đź”§
Novelty: đź’ˇ
Uncertainty: đź¤Ş đź¤Ş đź¤Ş
Incompleteness: đźš§ đźš§ đźš§

Another thing I wonâ€™t have time to blog or fully understand, but will collect a few explanatory blog posts about for emergency cribbing.

Letâ€™s say you wanted to count how many of your online friends were dogs, while respecting the maxim that, on the Internet, nobody should know youâ€™re a dog. To do this, you could ask each friend to answer the question â€śAre you a dog?â€ť in the following way. Each friend should flip a coin in secret, and answer the question truthfully if the coin came up heads; but, if the coin came up tails, that friend should always say â€śYesâ€ť regardless. Then you could get a good estimate of the true count from the greater-than-half fraction of your friends that answered â€śYesâ€ť. However, you still wouldnâ€™t know which of your friends was a dog: each answer â€śYesâ€ť would most likely be due to that friendâ€™s coin flip coming up tails.

NB this would need to be a weighted coin, or you donâ€™t learn anything.

This has recently become particularly publicly interesting because the US census has fingered mathematical differential privacy methods for preserving literal citizen privacy. This has spawned some good laypersonâ€™s introductions,

Alexandra Wood et al, Differential Privacy: A Primer for a Non-Technical Audience, and Mark Hansen has written an illustrated explanation.

There is a fun paper (Dimitrakakis et al. 2013) arguing that Bayesian posterior sampling has certain differential privacy guarantees.

Practical: see Googleâ€™s differential privacy library.

# Refs

Bassily, Raef, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, and Jonathan Ullman. 2015. â€śAlgorithmic Stability for Adaptive Data Analysis,â€ť November. http://arxiv.org/abs/1511.02513.

Charles-Edouard, BrĂ©hier, Gazeau Maxime, GoudenĂ¨ge Ludovic, LeliĂ¨vre Tony, and Rousset Mathias. 2015. â€śUnbiasedness of Some Generalized Adaptive Multilevel Splitting Algorithms,â€ť May. http://arxiv.org/abs/1505.02674.

Dimitrakakis, Christos, Blaine Nelson, and Zuhe Zhang, Aikaterini Mitrokotsa, and Benjamin Rubinstein. 2013. â€śBayesian Differential Privacy Through Posterior Sampling,â€ť June. http://arxiv.org/abs/1306.1066.

Dwork, Cynthia. 2006. â€śDifferential Privacy.â€ť In. Vol. 4052. https://www.microsoft.com/en-us/research/publication/differential-privacy/.

Dwork, Cynthia, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Roth. 2015a. â€śThe Reusable Holdout: Preserving Validity in Adaptive Data Analysis.â€ť Science 349 (6248): 636â€“38. https://doi.org/10.1126/science.aaa9375.

Dwork, Cynthia, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Leon Roth. 2015b. â€śPreserving Statistical Validity in Adaptive Data Analysis.â€ť In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC â€™15, 117â€“26. Portland, Oregon, USA: ACM Press. https://doi.org/10.1145/2746539.2746580.

Fanti, Giulia, Vasyl Pihur, and Ăšlfar Erlingsson. 2015. â€śBuilding a RAPPOR with the Unknown: Privacy-Preserving Learning of Associations and Data Dictionaries,â€ť March. http://arxiv.org/abs/1503.01214.

Jung, Christopher, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld. 2019. â€śA New Analysis of Differential Privacyâ€™s Generalization Guarantees,â€ť September. http://arxiv.org/abs/1909.03577.

Wood, Alexandra, Micah Altman, Aaron Bembenek, Mark Bun, James Honaker, Kobbi Nissim, David R Oâ€™Brien, and Salil Vadhan. 2019. â€śDifferential Privacy: A Primer for a Non-Technical Audienceâ€ť 21: 68. http://www.jetlaw.org/journal-archives/volume-21/volume-21-issue-1/differential-privacy-a-primer-for-a-non-technical-audience/.

Zhang, Zuhe, Benjamin I. P. Rubinstein, and Christos Dimitrakakis. 2016. â€śOn the Differential Privacy of Bayesian Inference.â€ť In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2365â€“71. AAAIâ€™16. Phoenix, Arizona: AAAI Press. http://arxiv.org/abs/1512.06992.