Learning the graph structure, not just the clique potentials.
This is much more work than fitting the potentials on a known graph.
Learning these models turns out to need a conditional independence test, an awareness of multiple testing, and some graph theory.
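To make those three ingredients concrete, here is a minimal sketch of constraint-based skeleton learning on three Gaussian variables. It is an illustration under stated assumptions, not the method of any package listed on this page: the independence test is a Fisher z-test on (partial) correlations, the multiple-testing correction is plain Bonferroni, and the function names (`corr`, `partial_corr`, `fisher_z_pvalue`, `learn_skeleton`, `simulate_chain`) are invented for the example.

```python
import math
import random
from itertools import combinations


def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly adjusting both for zs."""
    rxy, rxz, ryz = corr(xs, ys), corr(xs, zs), corr(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, n_cond):
    """Two-sided p-value for H0: the (partial) correlation is zero.

    n is the sample size, n_cond the number of conditioning variables.
    """
    z = math.atanh(r) * math.sqrt(n - n_cond - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def learn_skeleton(data, alpha=0.01):
    """Undirected skeleton for exactly three variables (assumption of this sketch).

    An edge i-j is kept only if independence is rejected both marginally
    and conditionally on the third variable, at a Bonferroni-corrected level.
    """
    names = sorted(data)
    n = len(data[names[0]])
    pairs = list(combinations(names, 2))
    threshold = alpha / (2 * len(pairs))  # two tests per pair
    edges = set()
    for i, j in pairs:
        (k,) = [v for v in names if v not in (i, j)]
        p_marginal = fisher_z_pvalue(corr(data[i], data[j]), n, 0)
        p_conditional = fisher_z_pvalue(
            partial_corr(data[i], data[j], data[k]), n, 1
        )
        if max(p_marginal, p_conditional) < threshold:
            edges.add((i, j))
    return edges


def simulate_chain(n, seed=0):
    """Sample from the linear Gaussian chain A -> B -> C with unit noise."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [x + rng.gauss(0, 1) for x in a]
    c = [x + rng.gauss(0, 1) for x in b]
    return {"A": a, "B": b, "C": c}


if __name__ == "__main__":
    print(learn_skeleton(simulate_chain(2000)))
```

On data from the chain, A and C are strongly correlated marginally but their partial correlation given B is close to zero, so the A-C edge should usually be dropped while A-B and B-C survive. With more variables the number of candidate conditioning sets grows combinatorially, which is where the multiple-testing and graph-search concerns start to bite.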
- bnlearn learns belief networks
- sparsebn, which describes itself as follows: A new R package for learning sparse Bayesian networks and other graphical models from high-dimensional data via sparse regularization. Designed from the ground up to handle:
- Experimental data with interventions
- Mixed observational / experimental data
- High-dimensional data with p >> n
- Datasets with thousands of variables (tested up to p=8000)
- Continuous and discrete data
The emphasis of this package is scalability and statistical consistency on high-dimensional datasets. […] For more details on this package, including worked examples and the methodological background, please see our new preprint.
The main methods for learning graphical models are:
- estimate.dag for directed acyclic graphs (Bayesian networks).
- estimate.precision for undirected graphs (Markov random fields).
- estimate.covariance for covariance matrices.
Currently, estimation of precision and covariance matrices is limited to Gaussian data.
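The link between precision matrices and undirected graphs in the Gaussian case can be sketched in a few lines: a zero off-diagonal entry of the precision matrix means the two variables are conditionally independent given all the others, so the nonzero pattern is the edge set. The example below only illustrates that fact on a hand-picked covariance (the chain A → B → C with unit-variance noise); it is not how the package above estimates anything, and `invert`, `edges_from_precision` and `partial_correlation` are hypothetical helper names.

```python
import math
from fractions import Fraction

# Exact covariance of A -> B -> C, where B = A + noise, C = B + noise,
# and A and both noise terms have unit variance.
SIGMA = [
    [1, 1, 1],
    [1, 2, 2],
    [1, 2, 3],
]
NAMES = ["A", "B", "C"]


def invert(matrix):
    """Exact Gauss-Jordan inverse of a small square matrix.

    Sketch only: assumes the matrix is invertible.
    """
    n = len(matrix)
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def edges_from_precision(theta, names):
    """Undirected edges i-j wherever the off-diagonal precision entry is nonzero."""
    n = len(names)
    return {
        (names[i], names[j])
        for i in range(n)
        for j in range(i + 1, n)
        if theta[i][j] != 0
    }


def partial_correlation(theta, i, j):
    """Partial correlation of variables i and j given all the others."""
    return float(-theta[i][j]) / math.sqrt(theta[i][i] * theta[j][j])


if __name__ == "__main__":
    theta = invert(SIGMA)
    print(edges_from_precision(theta, NAMES))
```

With real data the covariance is estimated, so no entry of its inverse is exactly zero; deciding which entries to treat as zero is the job of a sparsity penalty or a test.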
- Nonparanormal SKEPTIC (TBD)