Directed graphical models with the additional assumption that \(A\rightarrow B\) may be read as “A causes B”.

Observational studies, confounding, adjustment criteria, *d*-separation, identifiability, interventions…

When can I use my crappy observational data, collected without a good experimental design for whatever reason, to do interventional inference? There is a lot of research on this. I should summarise the salient bits for myself. In fact I did; I just ran a reading group on it. See also quantum causal graphical models.
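To make the question concrete, here is a minimal sketch in Python, assuming a made-up binary model in which a confounder Z drives both the treatment X and the outcome Y. Every name and probability below is invented for illustration. The naive difference in means is badly biased, while stratifying on Z (the back-door adjustment) recovers the effect that was built into the simulation.

```python
import random

def simulate(n=20000, seed=0):
    """Draw (z, x, y) from a toy model with arrows Z -> X, Z -> Y and X -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)                   # confounder
        x = int(rng.random() < (0.8 if z else 0.2))   # treatment depends on z
        p_y = 0.1 + 0.2 * x + 0.5 * z                 # true effect of x on P(y=1) is 0.2
        y = int(rng.random() < p_y)
        rows.append((z, x, y))
    return rows

def mean_y(rows, x, z=None):
    """Mean outcome among rows with treatment x (and confounder level z, if given)."""
    selected = [yi for (zi, xi, yi) in rows if xi == x and (z is None or zi == z)]
    return sum(selected) / len(selected)

def naive_effect(rows):
    """E[Y | X=1] - E[Y | X=0], ignoring the confounder."""
    return mean_y(rows, 1) - mean_y(rows, 0)

def adjusted_effect(rows):
    """Back-door adjustment: average the within-stratum contrasts over P(Z)."""
    n = len(rows)
    effect = 0.0
    for z in (0, 1):
        p_z = sum(1 for r in rows if r[0] == z) / n
        effect += p_z * (mean_y(rows, 1, z) - mean_y(rows, 0, z))
    return effect

rows = simulate()
print("naive:   ", round(naive_effect(rows), 3))
print("adjusted:", round(adjusted_effect(rows), 3))
```

In this toy model the naive contrast converges to 0.5 and the adjusted one to 0.2. The adjustment only works because Z is observed and is the only back-door path, which is exactly the kind of assumption the graphical criteria are meant to check.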

## Tutorials online

Tutorial: David Sontag and Uri Shalit, Causal inference from observational studies.

Felix Elwert’s summary is punchy. (Elwe13)

Chapter 3 of (some edition of) Pearl’s book is available as an author’s preprint: Parts 1, 2, 3, 4, 5, 6.

## Counterfactuals

## Propensity scores

RuWa06 comes recommended by Shalizi as:

> A good description of Rubin et al.’s methods for causal inference, adapted to the meanest understanding. […] Rubin and Waterman do a very good job of explaining, in a clear and concrete problem, just how and why the newer techniques of causal inference are valuable, with just enough technical detail that it doesn’t seem like magic.
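For my own reference, a minimal sketch of inverse propensity weighting, which is one common way of using propensity scores and not necessarily the procedure in RuWa06. It assumes a single binary covariate `z`, so that the propensity score can be estimated by the treated fraction within each level of `z`. The data-generating model and all names are hypothetical.

```python
import random

def simulate(n=20000, seed=1):
    """Toy data: covariate z affects both treatment t and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)
        t = int(rng.random() < (0.8 if z else 0.2))    # true propensity score
        y = 1.0 + 2.0 * t + 3.0 * z + rng.gauss(0, 1)  # true treatment effect is 2.0
        rows.append((z, t, y))
    return rows

def estimate_propensity(rows):
    """Estimate P(T=1 | Z=z) by the treated fraction within each level of z."""
    counts = {}
    for z, t, _ in rows:
        n_z, n_treated = counts.get(z, (0, 0))
        counts[z] = (n_z + 1, n_treated + t)
    return {z: n_treated / n_z for z, (n_z, n_treated) in counts.items()}

def ipw_ate(rows):
    """Inverse-propensity-weighted estimate of the average treatment effect."""
    e = estimate_propensity(rows)
    n = len(rows)
    treated_mean = sum(t * y / e[z] for z, t, y in rows) / n
    control_mean = sum((1 - t) * y / (1 - e[z]) for z, t, y in rows) / n
    return treated_mean - control_mean

def naive_difference(rows):
    """Unweighted difference in mean outcome between treated and control units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate()
print("naive:", round(naive_difference(rows), 2))
print("IPW:  ", round(ipw_ate(rows), 2))
```

With a continuous or high-dimensional covariate the propensity score would have to come from a fitted model (logistic regression, say), and the quality of that model then matters a great deal.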

## Causal Graph inference from data

Uh oh. You don’t know what causes what?
Or specifically, you can’t eliminate a whole bunch of potential causal arrows *a priori*?
Much more work.
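Constraint-based structure learners such as the PC algorithm (KaBü07, SpGS01) are assembled from conditional independence tests. A minimal sketch of one such test, assuming linear-Gaussian data and a hypothetical chain X → Z → Y: the partial correlation of X and Y given Z, with a Fisher z-transform for the p-value. The simulated model and function names are mine, for illustration only.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing a single variable z."""
    r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def fisher_z_pvalue(r, n, n_cond):
    """Two-sided p-value for H0: (partial) correlation is zero, Gaussian data."""
    stat = math.sqrt(n - n_cond - 3) * math.atanh(r)
    return math.erfc(abs(stat) / math.sqrt(2))

def simulate_chain(n=5000, seed=2):
    """Linear-Gaussian chain X -> Z -> Y, so X and Y are independent given Z."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z

x, y, z = simulate_chain()
n = len(x)
r_marginal = corr(x, y)
r_partial = partial_corr(x, y, z)
print("marginal:", round(r_marginal, 3), fisher_z_pvalue(r_marginal, n, 0))
print("given z: ", round(r_partial, 3), fisher_z_pvalue(r_partial, n, 1))
```

A structure learner would drop the X–Y edge here because the dependence vanishes once Z is conditioned on. The test says nothing about arrow directions; those come from later orientation rules, and only up to an equivalence class.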

## Causal time series DAGs

As with other time series methods, causal inference on time series has its own issues.

Does CausalImpact do it? Find out. It is based on BGKR15.

The CausalImpact R package implements an approach to estimating the causal effect of a designed intervention on a time series. For example, how many additional daily clicks were generated by an advertising campaign? Answering a question like this can be difficult when a randomized experiment is not available. The package aims to address this difficulty using a structural Bayesian time-series model to estimate how the response metric might have evolved after the intervention if the intervention had not occurred.
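The core idea, predicting the counterfactual from the pre-intervention relationship with an unaffected control series, can be sketched in a few lines of Python. This is a deliberately crude stand-in: it uses ordinary least squares rather than the Bayesian structural time-series model of BGKR15, it gives no uncertainty intervals, and it is not the CausalImpact API. The simulated series, the `lift` parameter and the function names are all hypothetical.

```python
import math
import random

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

def average_effect(control, response, t0):
    """Mean gap between observed and predicted response from time t0 onwards."""
    # Learn the control -> response relationship on pre-intervention data only.
    intercept, slope = fit_line(control[:t0], response[:t0])
    # Counterfactual: what the response would have been without the intervention.
    counterfactual = [intercept + slope * c for c in control[t0:]]
    gaps = [obs - cf for obs, cf in zip(response[t0:], counterfactual)]
    return sum(gaps) / len(gaps)

def simulate(n=100, t0=70, lift=10.0, seed=3):
    """Control series is unaffected; response gains `lift` from time t0 onwards."""
    rng = random.Random(seed)
    control = [100 + 10 * math.sin(t / 5) + rng.gauss(0, 1) for t in range(n)]
    response = [1.2 * c + rng.gauss(0, 1) + (lift if t >= t0 else 0.0)
                for t, c in enumerate(control)]
    return control, response

control, response = simulate()
print("estimated lift:", round(average_effect(control, response, t0=70), 2))
```

Everything hinges on the control series not being touched by the intervention and on the pre-period relationship continuing to hold afterwards. Neither can be checked from the data alone.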

## Questions

How does Granger causality relate?

## Refs

- ArGZ17: Aragam, B., Gu, J., & Zhou, Q. (2017) Learning Large-Scale Bayesian Networks with the sparsebn Package. *arXiv:1703.04025 [Cs, Stat]*.
- ArMS09: Aral, S., Muchnik, L., & Sundararajan, A. (2009) Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. *Proceedings of the National Academy of Sciences*, 106(51), 21544–21549.
- ArCS99: Arnold, B. C., Castillo, E., & Sarabia, J. M. (1999) Conditional specification of statistical models. Springer Science & Business Media.
- AyPo08: Ay, N., & Polani, D. (2008) Information flows in causal networks. *Advances in Complex Systems (ACS)*, 11(01), 17–41.
- BCCC17: Bahadori, M. T., Chalupka, K., Choi, E., Chen, R., Stewart, W. F., & Sun, J. (2017) Neural Causal Regularization under the Independence of Mechanisms Assumption. *arXiv:1702.02604 [Cs, Stat]*.
- BaPe16: Bareinboim, E., & Pearl, J. (2016) Causal inference and the data-fusion problem. *Proceedings of the National Academy of Sciences*, 113(27), 7345–7352.
- BaTP14: Bareinboim, E., Tian, J., & Pearl, J. (2014) Recovering from Selection Bias in Causal and Statistical Inference. In AAAI (pp. 2410–2416).
- Beal03: Beal, M. J. (2003) Variational algorithms for approximate Bayesian inference. University of London.
- BLZS15: Bloniarz, A., Liu, H., Zhang, C.-H., Sekhon, J., & Yu, B. (2015) Lasso adjustments of treatment effect estimates in randomized experiments. *arXiv:1507.03652 [Math, Stat]*.
- BPSM16: Bongers, S., Peters, J., Schölkopf, B., & Mooij, J. M. (2016) Structural Causal Models: Cycles, Marginalizations, Exogenous Reparametrizations and Reductions. *arXiv:1611.06221 [Cs, Stat]*.
- BGKR15: Brodersen, K. H., Gallusser, F., Koehler, J., Remy, N., & Scott, S. L. (2015) Inferring causal impact using Bayesian structural time-series models. *The Annals of Applied Statistics*, 9(1), 247–274.
- Bühl13: Bühlmann, P. (2013) Causal statistical inference in high dimensions. *Mathematical Methods of Operations Research*, 77(3), 357–370.
- BüKM14: Bühlmann, P., Kalisch, M., & Meier, L. (2014) High-Dimensional Statistics with a View Toward Applications in Biology. *Annual Review of Statistics and Its Application*, 1(1), 255–278.
- BPEM14: Bühlmann, P., Peters, J., Ernest, J., & Maathuis, M. (2014) Predicting causal effects in high-dimensional settings.
- BüRK13: Bühlmann, P., Rütimann, P., & Kalisch, M. (2013) Controlling false positive selections in high-dimensional regression and causal inference. *Statistical Methods in Medical Research*, 22(5), 466–492.
- ChPe12: Chen, B., & Pearl, J. (2012) Regression and causation: A critical examination of econometric textbooks.
- ClMH14: Claassen, T., Mooij, J. M., & Heskes, T. (2014) Proof Supplement - Learning Sparse Causal Models is not NP-hard (UAI2013). *arXiv:1411.1557 [Stat]*.
- CMKR12: Colombo, D., Maathuis, M. H., Kalisch, M., & Richardson, T. S. (2012) Learning high-dimensional directed acyclic graphs with latent and selection variables. *The Annals of Statistics*, 40(1), 294–321.
- DeWR11: De Luna, X., Waernbaum, I., & Richardson, T. S. (2011) Covariate selection for the nonparametric estimation of an average treatment effect. *Biometrika*, asr041.
- Dide00: Didelez, V. (n.d.) Causal Reasoning for Events in Continuous Time: A Decision–Theoretic Approach.
- DEMS10: Duvenaud, D. K., Eaton, D., Murphy, K. P., & Schmidt, M. W. (2010) Causal learning without DAGs. In NIPS Causality: Objectives and Assessment (pp. 177–190).
- Eich01: Eichler, M. (2001) Granger-causality graphs for multivariate time series. *Granger-Causality Graphs for Multivariate Time Series*.
- Elwe13: Elwert, F. (2013) Graphical causal models. In Handbook of causal analysis for social research (pp. 245–273). Springer.
- EnHS13: Entner, D., Hoyer, P., & Spirtes, P. (2013) Data-driven covariate selection for nonparametric estimation of causal effects. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics (pp. 256–264).
- ErBü14: Ernest, J., & Bühlmann, P. (2014) Marginal integration for fully robust causal inference. *arXiv:1405.1868 [Stat]*.
- Fixx77: Fixx, J. F. (1977) Games for the superintelligent. London: Muller.
- FuZh13: Fu, F., & Zhou, Q. (2013) Learning Sparse Causal Gaussian Networks With Experimental Intervention: Regularization and Coordinate Descent. *Journal of the American Statistical Association*, 108(501), 288–300.
- Gelm10: Gelman, A. (2010) Causality and statistical learning. *American Journal of Sociology*, 117(3), 955–966.
- GuFZ14: Gu, J., Fu, F., & Zhou, Q. (2014) Adaptive Penalized Estimation of Directed Acyclic Graphs From Categorical Data. *arXiv:1403.2310 [Stat]*.
- HiOB05: Hinton, G. E., Osindero, S., & Bao, K. (2005) Learning causally linked Markov random fields. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (pp. 128–135). Citeseer.
- Jord99: Jordan, Michael Irwin. (1999) Learning in graphical models. Cambridge, Mass.: MIT Press.
- JoWe02a: Jordan, Michael I., & Weiss, Y. (2002a) Graphical models: Probabilistic inference. *The Handbook of Brain Theory and Neural Networks*, 490–496.
- JoWe02b: Jordan, Michael I., & Weiss, Y. (2002b) Probabilistic inference in graphical models. *Handbook of Neural Networks and Brain Theory*.
- KaBü07: Kalisch, M., & Bühlmann, P. (2007) Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm. *Journal of Machine Learning Research*, 8, 613–636.
- Kenn15: Kennedy, E. H. (2015) Semiparametric theory and empirical processes in causal inference. *arXiv Preprint arXiv:1510.04740*.
- KiPe83: Kim, J. H., & Pearl, J. (1983) A computational model for causal and diagnostic reasoning in inference systems. In IJCAI (Vol. 83, pp. 190–193). Citeseer.
- KoFr09: Koller, D., & Friedman, N. (2009) Probabilistic graphical models: principles and techniques. Cambridge, MA: MIT Press.
- LaSp88: Lauritzen, S. L., & Spiegelhalter, D. J. (1988) Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems. *Journal of the Royal Statistical Society. Series B (Methodological)*, 50(2), 157–224.
- Laur96: Lauritzen, Steffen L. (1996) Graphical Models. Clarendon Press.
- Laur00: Lauritzen, Steffen L. (2000) Causal inference from graphical models. In Complex stochastic systems (pp. 63–107). CRC Press.
- LNCS16: Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., & Bottou, L. (2016) Discovering Causal Signals in Images. *arXiv:1605.08179 [Cs, Stat]*.
- MaCo13: Maathuis, M. H., & Colombo, D. (2013) A generalized backdoor criterion. *arXiv Preprint arXiv:1307.5636*.
- MCKB10: Maathuis, M. H., Colombo, D., Kalisch, M., & Bühlmann, P. (2010) Predicting causal effects in large-scale systems from observational data. *Nature Methods*, 7(4), 247–248.
- MaKB09: Maathuis, M. H., Kalisch, M., & Bühlmann, P. (2009) Estimating high-dimensional intervention effects from observational data. *The Annals of Statistics*, 37(6A), 3133–3164.
- MPSM10: Marbach, D., Prill, R. J., Schaffter, T., Mattiussi, C., Floreano, D., & Stolovitzky, G. (2010) Revealing strengths and weaknesses of methods for gene network inference. *Proceedings of the National Academy of Sciences*, 107(14), 6286–6291.
- Mess12: Messerli, F. H. (2012) Chocolate Consumption, Cognitive Function, and Nobel Laureates. *New England Journal of Medicine*, 367(16), 1562–1564.
- MiMo07: Mihalkova, L., & Mooney, R. J. (2007) Bottom-up learning of Markov logic network structure. In Proceedings of the 24th international conference on Machine learning (pp. 625–632). ACM.
- Mont11: Montanari, A. (2011) Lecture Notes for Stat 375 Inference in Graphical Models.
- Murp12: Murphy, K. P. (2012) Machine Learning: A Probabilistic Perspective. (1 edition.). Cambridge, MA: The MIT Press.
- NeOt04: Neapolitan, R. E., & others. (2004) Learning Bayesian networks. (Vol. 38). Prentice Hall Upper Saddle River.
- NoNy11: Noel, H., & Nyhan, B. (2011) The “unfriending” problem: The consequences of homophily in friendship retention for causal estimates of social influence. *Social Networks*, 33(3), 211–218.
- Pear82: Pearl, J. (1982) Reverend Bayes on inference engines: a distributed hierarchical approach. In Proceedings of the National Conference on Artificial Intelligence (pp. 133–136).
- Pear86: Pearl, J. (1986) Fusion, propagation, and structuring in belief networks. *Artificial Intelligence*, 29(3), 241–288.
- Pear08: Pearl, J. (2008) Probabilistic reasoning in intelligent systems: networks of plausible inference. (Rev. 2. print., 12. [Dr.].). San Francisco, Calif: Kaufmann.
- Pear09a: Pearl, J. (2009a) Causal inference in statistics: An overview. *Statistics Surveys*, 3, 96–146.
- Pear09b: Pearl, J. (2009b) Causality: Models, Reasoning and Inference. Cambridge University Press.
- PeBa14: Pearl, J., & Bareinboim, E. (2014) External Validity: From Do-Calculus to Transportability Across Populations. *Statistical Science*, 29(4), 579–595.
- PeBM15: Peters, J., Bühlmann, P., & Meinshausen, N. (2015) Causal inference using invariant prediction: identification and confidence intervals. *arXiv:1501.01332 [Stat]*.
- Ragi11: Raginsky, M. (2011) Directed information and Pearl’s causal calculus. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton) (pp. 958–965).
- RuWa06: Rubin, D. B., & Waterman, R. P. (2006) Estimating the Causal Effects of Marketing Interventions Using Propensity Score Methodology. *Statistical Science*, 21(2), 206–222.
- SaVa13: Sauer, B., & VanderWeele, T. J. (2013) Use of Directed Acyclic Graphs. Agency for Healthcare Research and Quality (US).
- Schm10: Schmidt, M. (2010) Graphical model structure learning with l1-regularization. University of British Columbia.
- SMFP15: Schölkopf, B., Muandet, K., Fukumizu, K., & Peters, J. (2015) Computing Functions of Random Variables via Reproducing Kernel Hilbert Space Representations. *arXiv:1501.06794 [Cs, Stat]*.
- ShMc16: Shalizi, C. R., & McFowland III, E. (2016) Controlling for Latent Homophily in Social Networks through Inferring Latent Locations. *arXiv:1607.06565 [Physics, Stat]*.
- ShTh11: Shalizi, C. R., & Thomas, A. C. (2011) Homophily and Contagion Are Generically Confounded in Observational Social Network Studies. *Sociological Methods & Research*, 40(2), 211–239.
- ShPe08: Shpitser, I., & Pearl, J. (2008) Complete identification methods for the causal hierarchy. *The Journal of Machine Learning Research*, 9, 1941–1979.
- ShTc14: Shpitser, I., & Tchetgen, E. T. (2014) Causal Inference with a Graphical Hierarchy of Interventions. *arXiv:1411.2127 [Stat]*.
- SmEi08: Smith, D. A., & Eisner, J. (2008) Dependency parsing by belief propagation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 145–156). Association for Computational Linguistics.
- SpGS01: Spirtes, P., Glymour, C., & Scheines, R. (2001) Causation, Prediction, and Search. (Second Edition.). The MIT Press.
- TeIL15: Textor, J., Idelberger, A., & Liśkiewicz, M. (2015) Learning from Pairwise Marginal Independencies. *arXiv:1508.00280 [Cs]*.
- VaBC12: Vansteelandt, S., Bekaert, M., & Claeskens, G. (2012) On model selection and model misspecification in causal inference. *Statistical Methods in Medical Research*, 21(1), 7–30.
- ViCo14: Visweswaran, S., & Cooper, G. F. (2014) Counting Markov Blanket Structures. *arXiv:1407.2483 [Cs, Stat]*.
- Wrig34: Wright, S. (1934) The Method of Path Coefficients. *The Annals of Mathematical Statistics*, 5(3), 161–215.
- YPHS16: Yadav, P., Prunelli, L., Hoff, A., Steinbach, M., Westra, B., Kumar, V., & Simon, G. (2016) Causal Inference in Observational Data. *arXiv:1611.04660 [Cs, Stat]*.
- YeFW03: Yedidia, J. S., Freeman, W. T., & Weiss, Y. (2003) Understanding Belief Propagation and Its Generalizations. In G. Lakemeyer & B. Nebel (Eds.), Exploring Artificial Intelligence in the New Millennium (pp. 239–269). Morgan Kaufmann Publishers.
- ZPJS12: Zhang, K., Peters, J., Janzing, D., & Schölkopf, B. (2012) Kernel-based Conditional Independence Test and Application in Causal Discovery. *arXiv:1202.3775 [Cs, Stat]*.