Variational autoencoders

November 4, 2019 — September 10, 2020

Tags: approximation, Bayes, generative, optimization, probabilistic algorithms, probability, statistics

Figure 1: A variational autoencoder uses a limited latent distribution to approximate a complex posterior distribution

A method at the intersection of stochastic variational inference and probabilistic neural nets, where we presume that the observed data are generated from a low-dimensional latent space. If you squint at it, this is kind of the information bottleneck trick, but in a probabilistic setting. To my mind it is a sorta-kinda nonparametric approximate Bayes method.
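To pin down what "approximate" means here, this is the usual evidence decomposition behind VAEs, as stated in e.g. Kingma and Welling's "Auto-Encoding Variational Bayes". The notation is assumed for this sketch and is not defined elsewhere on this page: x is an observation, z the latent variable, p(z) the prior, p_θ(x | z) the decoder (likelihood) network, and q_φ(z | x) the encoder (approximate posterior) network.

```latex
\log p_\theta(x)
  = \underbrace{
      \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
      - \operatorname{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
    }_{\operatorname{ELBO}(\theta, \phi;\, x)}
  + \operatorname{KL}\big(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\big)
  \;\ge\; \operatorname{ELBO}(\theta, \phi;\, x)
```

The last KL term is non-negative, which gives the bound. Maximizing the ELBO over φ shrinks the gap between the encoder and the true posterior, and maximizing it over θ pushes up a lower bound on the log evidence.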

There is a lot more going on here than I have time to explain, let alone the parts I have not yet understood myself.

TBD: connection to reparameterization tricks.
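Until that connection is written up properly, here is a minimal sketch of the reparameterization (pathwise) gradient on a toy problem, using only the Python standard library. Everything in it is an assumption made for illustration: the model z ~ N(0, 1), x | z ~ N(z, 1), the Gaussian variational family q(z) = N(m, s²), and the function names `log_evidence`, `elbo_and_grads` and `fit`. The toy model is chosen because its exact posterior, N(x/2, 1/2), is known, so the result can be checked. No neural network is involved; a real VAE would replace m and s with encoder outputs and use automatic differentiation.

```python
import math
import random


def log_normal(v, mean, std):
    """Log density of N(mean, std^2) evaluated at v."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2


def log_evidence(x):
    """Exact log p(x) for the toy model: marginally x ~ N(0, 2)."""
    return log_normal(x, 0.0, math.sqrt(2.0))


def elbo_and_grads(x, m, s, n_samples=50_000, seed=0):
    """Monte Carlo ELBO and its pathwise gradients with respect to (m, s).

    Samples are written as z = m + s * eps with eps ~ N(0, 1), so the
    randomness does not depend on (m, s) and we can differentiate through z.
    """
    rng = random.Random(seed)
    elbo = grad_m = grad_s = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        z = m + s * eps  # reparameterized sample from q
        log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)
        elbo += log_joint - log_normal(z, m, s)
        # d/dz log p(x, z) for this model: prior term plus likelihood term.
        dlogjoint_dz = -z + (x - z)
        # Chain rule: dz/dm = 1 and dz/ds = eps. Along the path,
        # -log q(z) = const + log(s) + eps^2 / 2, which contributes 1/s to d/ds.
        grad_m += dlogjoint_dz
        grad_s += dlogjoint_dz * eps + 1.0 / s
    return elbo / n_samples, grad_m / n_samples, grad_s / n_samples


def fit(x, steps=200, lr=0.1):
    """Plain stochastic gradient ascent on the ELBO, starting from q = N(0, 1)."""
    m, s = 0.0, 1.0
    for step in range(steps):
        _, gm, gs = elbo_and_grads(x, m, s, n_samples=500, seed=step)
        m += lr * gm
        s = max(s + lr * gs, 1e-3)  # keep the standard deviation positive
    return m, s


if __name__ == "__main__":
    x = 1.0
    m, s = fit(x)
    print("fitted mean and std:", m, s)
    print("exact posterior mean and std:", x / 2, math.sqrt(0.5))
    print("ELBO at fit:", elbo_and_grads(x, m, s)[0], "log evidence:", log_evidence(x))
```

For this model the expected gradients are x - 2m and 1/s - 2s, so the ascent should settle near m = x/2 and s = sqrt(1/2). At the exact posterior the quantity being averaged is constant in z, so the ELBO estimate matches log p(x) with zero variance.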

To explore: the relative complexity of these methods, e.g. how long does it take to train a variational autoencoder for a given task compared to a similarly expressive GAN?

For now, check out some of the many tutorials, e.g. the links collected under Incoming.

1 Incoming

Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images, Child 2020
Jaan Altosaar, Tutorial - What is a variational autoencoder?
Encoder Autonomy

This year I realized that VAEs are non-parametrically consistent as models of the observed data even when the encoder is held fixed and arbitrary. This is best demonstrated with a nonstandard derivation of VAEs bypassing the ELBO.
