I’ve a weakness for ideas that give me plausible deniability for making generative art while doing my maths homework.
Quasimondo: so do you
This page is more chaotic than the already-chaotic median, sorry. Good luck making sense of it.
See also analysis/resynthesis.
See gesture recognition. Oh and also google’s AMI channel, and ml4artists, which has some sweet machine learning for artists topic guides.
Many neural networks are generative in the sense that even if you train ’em to classify things, they can also predict new members of the class. For example, run the model forwards and it recognizes melodies; run it “backwards” and it composes melodies. Or rather, you may have trained them to generate examples in the course of training them to detect examples.
There are many definitional and practical wrinkles here, and this quality is not unique to artificial neural networks, but it is a great convenience, and the gods of machine learning have blessed us with much infrastructure to exploit this feature, because it is very close to actual profitable algorithms. Upshot: There is now a lot of computation and grad student labour directed at producing neural networks which as a byproduct can produce faces, chairs, film dialogue, symphonies and so on.
There are NIPS streams about this now.
Some as-yet-unfiled neural-artwork links I should think about.
So simple it’s cute, CPPNs are probably what Jonathan McCabe has been producing for years.
IGAN, iGAN: Interactive Image Generation via Generative Adversarial Networks
neurogram does compact, semi-untrained neural network image synthesis in the browser.
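To make the CPPN (compositional pattern-producing network) idea mentioned above concrete, here is a minimal sketch, assuming an untrained network with random Gaussian weights and tanh activations. It is not a reconstruction of Jonathan McCabe's or anyone else's method; the function names, layer sizes, and the ASCII rendering are all illustrative choices. The network maps each pixel's coordinates (x, y, and radius r) to an intensity, so the picture is just the function evaluated over a grid.

```python
import math
import random


def make_cppn(hidden=8, layers=3, seed=0, scale=2.0):
    """Build a random, untrained tanh MLP mapping (x, y, r) to one intensity."""
    rng = random.Random(seed)
    sizes = [3] + [hidden] * layers + [1]
    # One weight matrix per layer; each row holds the input weights of one unit.
    weights = [
        [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

    def cppn(x, y):
        r = math.hypot(x, y)  # radial input nudges the pattern towards symmetry
        activations = [x, y, r]
        for layer in weights:
            activations = [
                math.tanh(sum(w * a for w, a in zip(row, activations)))
                for row in layer
            ]
        # Rescale the tanh output from [-1, 1] to [0, 1].
        return 0.5 * (activations[0] + 1.0)

    return cppn


def render(cppn, width=32, height=32):
    """Evaluate the network at every pixel, with coordinates scaled to [-1, 1]."""
    return [
        [cppn(2.0 * i / (width - 1) - 1.0, 2.0 * j / (height - 1) - 1.0)
         for i in range(width)]
        for j in range(height)
    ]


def to_ascii(image, ramp=" .:-=+*#%@"):
    """Crude text rendering: brighter pixels get denser characters."""
    return "\n".join(
        "".join(ramp[min(int(v * len(ramp)), len(ramp) - 1)] for v in row)
        for row in image
    )


if __name__ == "__main__":
    print(to_ascii(render(make_cppn(seed=1))))
```

Changing `seed` gives a different pattern, and because the image is a function of continuous coordinates it can be rendered at any resolution by changing `width` and `height`.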
Variational inference (Hint07, WiBi05, Giro01, MnGr14) looks exciting here, particularly in an autoencoder setting (KiWe14).
Ross Goodwin’s Adventures in Narrated Reality gives an overview of text generation using RNNs.
See those classic images from google’s tripped-out image recognition systems, or Gatys, Ecker and Bethge’s deep art. Neural networks do a passable undergraduate Monet.
Here’s Frank Liu’s implementation of style transfer in pycaffe.
Alex Graves, Generating Sequences With Recurrent Neural Networks, generates handwriting. Relatedly, sketch-rnn is reaaaally cute.
Deep dreaming approaches are entertaining (NSFW). Here’s a more pedestrian and slightly more informative version of that.
Distill.pub has some lovely visual explanations of visual and other neural networks:
Experiments in Handwriting with a Neural Network
Deconvolution and Checkerboard Artifacts
How to Use t-SNE Effectively
Attention and Augmented Recurrent Neural Networks
hardmaru presents an amazing introduction to running sophisticated neural networks in the browser, targeted at artists, which goes over the handwriting post in a non-technical way.
Symbolic composition via scores/MIDI/etc
Seems like it should be easy, until you think about it.
Related: Arpeggiate by numbers, which discusses music theory.
Google has weighed in, like a gorilla on the metallophone, to do midi composition with Tensorflow as part of their Magenta project. Their NIPS 2016 demo won the best demo prize.
Daniel Johnson has a convolutional and recurrent architecture for taking into account multiple types of dependency in music, which he calls a biaxial neural network. Zhe LI, Composing Music With Recurrent Neural Networks.
Ji-Sung Kim’s deepjazz project is minimal, but does interesting jazz improvisations. Part of the genius here is choosing totally chaotic music to try to ape, so you can ape it chaotically. (Code)
Boulanger-Lewandowski has code and data for BoBV12’s recurrent neural network composition, using Python/Theano. Christian Walder leads a project which shares some roots with that (Wald16a, Wald16b). Bob Sturm’s FolkRNN does a related thing, but ingeniously redefines the problem by focussing on folk tune notation.
A tutorial on generating music using Restricted Boltzmann Machines for the conditional random field density, and an RNN for the time dependence after BoBV12.
Bob Sturm did a good one
TBD: google’s latest demo in this area was popular. Deep Bach (paper HaPa16, code) seems to be doing a related thing. Similar sets of authors (HaSP16) have some other related work:
Modeling polyphonic music is a particularly challenging task because of the intricate interplay between melody and harmony. A good model should satisfy three requirements: statistical accuracy (capturing faithfully the statistics of correlations at various ranges, horizontally and vertically), flexibility (coping with arbitrary user constraints), and generalization capacity (inventing new material, while staying in the style of the training corpus). Models proposed so far fail on at least one of these requirements. We propose a statistical model of polyphonic music, based on the maximum entropy principle. This model is able to learn and reproduce pairwise statistics between neighboring note events in a given corpus. The model is also able to invent new chords and to harmonize unknown melodies. We evaluate the invention capacity of the model by assessing the amount of cited, re-discovered, and invented chords on a corpus of Bach chorales. We discuss how the model enables the user to specify and enforce user-defined constraints, which makes it useful for style-based, interactive music generation.
Evan Chow represents for team non-deep-learning with jazzml:
Computer jazz improvisation powered by machine learning, specifically trigram modeling, K-Means clustering, and chord inference with SVMs.
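The trigram part of that description is easy to sketch. The following is a minimal illustration of trigram modelling of note sequences, not jazzml's actual code; it leaves out the K-Means clustering and SVM chord inference entirely, and the toy corpus, function names, and use of MIDI pitch numbers are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def train_trigrams(melodies):
    """Count how often each note follows each pair of preceding notes."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for a, b, c in zip(melody, melody[1:], melody[2:]):
            counts[(a, b)][c] += 1
    return counts


def generate(counts, start, length, seed=0):
    """Sample a melody note by note, conditioning on the previous two notes."""
    rng = random.Random(seed)
    melody = list(start)
    while len(melody) < length:
        followers = counts.get(tuple(melody[-2:]))
        if not followers:  # unseen context: stop early rather than guess
            break
        notes, weights = zip(*followers.items())
        melody.append(rng.choices(notes, weights=weights)[0])
    return melody


# Toy corpus of MIDI pitch numbers, invented purely for illustration.
corpus = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [60, 62, 64, 62, 60, 62, 64, 65, 67],
]
model = train_trigrams(corpus)
print(generate(model, start=(60, 62), length=12))
```

Every three-note window in the output was seen somewhere in the corpus, which is both why such models sound locally plausible and why they wander without any long-range structure.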
See also analysis/resynthesis, voice face.
Matt Vitelli on music generation from MP3s (source).
Soundtracking audio from video.
Alex Graves on RNN predictive synthesis.
Parag Mital on RNN style transfer.
Andy Sarroff, Musical Audio Synthesis Using Autoencoding Neural Nets. (code)
Neural style transfer for audio is crying out to be done, but I’ve only seen more traditional techniques. (UPDATE: It’s happening these days, but google it for yourself as I’m busy.)
Pixelrnn turns out to be good at music. Dadabots have successfully weaponised samplernn and it’s cute.
[Jlin and Holly Herndon](http://cdm.link/2018/12/jlin-holly-herndon-and-spawn-find-beauty-in-ais-flaws/) have a nice use of messed-up neural nets.
- SBBW15: Siddharth Sigtia, Emmanouil Benetos, Nicolas Boulanger-Lewandowski, Tillman Weyde, Artur S. d’Avila Garcez, Simon Dixon (2015) A hybrid recurrent neural network for music transcription. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2061–2065). IEEE
- DuSK16: Vincent Dumoulin, Jonathon Shlens, Manjunath Kudlur (2016) A Learned Representation For Artistic Style. ArXiv:1610.07629 [Cs].
- GaEB15: Leon A. Gatys, Alexander S. Ecker, Matthias Bethge (2015) A Neural Algorithm of Artistic Style. ArXiv:1508.06576 [Cs, q-Bio].
- HeWH16: Kun He, Yan Wang, John Hopcroft (2016) A Powerful Generative Model Using Random Weights for the Deep Image Representation. In Advances in Neural Information Processing Systems.
- Giro01: Mark Girolami (2001) A Variational Method for Learning Sparse and Overcomplete Representations. Neural Computation, 13(11), 2517–2532. DOI
- BoCh12: Charles Bordenave, Djalil Chafaï (2012) Around the circular law. Probability Surveys, 9(0), 1–89. DOI
- Wyse17: L. Wyse (2017) Audio Spectrogram Representations for Processing with Convolutional Neural Networks. In Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May, 2017 (arXiv:1706.08675v1 [cs.NE]).
- LSLW15: Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther (2015) Autoencoding beyond pixels using a learned similarity metric. ArXiv:1512.09300 [Cs, Stat].
- KiWe14: Diederik P. Kingma, Max Welling (2014) Auto-Encoding Variational Bayes. In ICLR 2014 conference.
- SLZC16: Zheng Sun, Jiaqi Liu, Zewang Zhang, Jingwen Chen, Zhao Huo, Ching Hua Lee, Xiao Zhang (2016) Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding. ArXiv:1611.05416 [Cs].
- OKVE16: Aäron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu (2016) Conditional Image Generation with PixelCNN Decoders. ArXiv:1606.05328 [Cs].
- BoLe06: Oliver Bown, Sebastian Lexer (2006) Continuous-Time Recurrent Neural Networks for Generative and Interactive Musical Performance. In Applications of Evolutionary Computing (pp. 652–663). Springer Berlin Heidelberg
- LCHR16: Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani (2016) Deep Clustering and Conventional Networks for Music Separation: Stronger Together. ArXiv:1611.06265 [Cs, Stat].
- DCSF15: Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus (2015) Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks. ArXiv:1506.05751 [Cs].
- YuDe11: D. Yu, L. Deng (2011) Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP]. IEEE Signal Processing Magazine, 28(1), 145–154. DOI
- LWLH17: Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou (2017) Demystifying Neural Style Transfer. In IJCAI.
- DiSB16: Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio (2016) Density estimation using Real NVP. In arXiv:1605.08803 [cs, stat].
- HaCL06: R. Hadsell, S. Chopra, Y. LeCun (2006) Dimensionality Reduction by Learning an Invariant Mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Vol. 2, pp. 1735–1742). DOI
- MTTR16: Eric Malmi, Pyry Takala, Hannu Toivonen, Tapani Raiko, Aristides Gionis (2016) DopeLearning: A Computational Approach to Rap Lyrics Generation. ArXiv:1505.04771 [Cs], 195–204. DOI
- GDGR15: Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, Daan Wierstra (2015) DRAW: A Recurrent Neural Network For Image Generation. ArXiv:1502.04623 [Cs].
- SmLe06: Evan C. Smith, Michael S. Lewicki (2006) Efficient auditory coding. Nature, 439(7079), 978–982. DOI
- GrLe11: Karol Gregor, Yann LeCun (2011) Efficient Learning of Sparse Invariant Representations. ArXiv:1105.5307 [Cs].
- DiSc14: Sander Dieleman, Benjamin Schrauwen (2014) End to end learning for music audio. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6964–6968). IEEE DOI
- GoSS14: Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy (2014) Explaining and Harnessing Adversarial Examples. ArXiv:1412.6572 [Cs, Stat].
- GSFT12: Roger Grosse, Ruslan R. Salakhutdinov, William T. Freeman, Joshua B. Tenenbaum (2012) Exploiting compositionality to explore a large space of model structures. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.
- OlMS17: Chris Olah, Alexander Mordvintsev, Ludwig Schubert (2017) Feature Visualization. Distill, 2(11), e7. DOI
- ZuCa17: Zack Zukowski, Cj Carr (2017) Generating Black Metal and Math Rock: Beyond Bach, Beethoven, and Beatles. In 31st Conference on Neural Information Processing Systems (NIPS 2017).
- GPMX14: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, … Yoshua Bengio (2014) Generative Adversarial Networks. ArXiv:1406.2661 [Cs, Stat].
- ThBe15: Lucas Theis, Matthias Bethge (2015) Generative Image Modeling Using Spatial LSTMs. ArXiv:1506.03478 [Cs, Stat].
- ZKSE16: Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros (2016) Generative Visual Manipulation on the Natural Image Manifold. In Proceedings of European Conference on Computer Vision.
- UlVL17: Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky (2017) Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis. ArXiv:1701.02096 [Cs].
- KSJC16: Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling (2016) Improving Variational Inference with Inverse Autoregressive Flow. In Advances in Neural Information Processing Systems 29. Curran Associates, Inc.
- UlVL16: Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky (2016) Instance Normalization: The Missing Ingredient for Fast Stylization. ArXiv:1607.08022 [Cs].
- GrLe10: Karol Gregor, Yann LeCun (2010) Learning fast approximations of sparse coding. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 399–406).
- Hint07: Geoffrey E. Hinton (2007) Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10), 428–434. DOI
- DSTB14: Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox (2014) Learning to Generate Chairs, Tables and Cars with Convolutional Networks. ArXiv:1411.5928 [Cs].
- SBMC18: Bob L. Sturm, Oded Ben-Tal, Úna Monaghan, Nick Collins, Dorien Herremans, Elaine Chew, … François Pachet (2018) Machine learning research that matters for music creation: A case study. Journal of New Music Research, 0(0), 1–20. DOI
- BoBV12: Nicolas Boulanger-Lewandowski, Yoshua Bengio, Pascal Vincent (2012) Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription. In 29th International Conference on Machine Learning.
- Wald16a: Christian Walder (2016a) Modelling Symbolic Music: Beyond the Piano Roll. ArXiv:1606.01368 [Cs].
- SaCa14: Andy M. Sarroff, Michael Casey (2014) Musical audio synthesis using autoencoding neural nets. Ann Arbor, MI: Michigan Publishing, University of Michigan Library
- JYFY17: Yongcheng Jing, Yezhou Yang, Zunlei Feng, Jingwen Ye, Mingli Song (2017) Neural Style Transfer: A Review. ArXiv:1705.04058 [Cs].
- MnGr14: Andriy Mnih, Karol Gregor (2014) Neural Variational Inference and Learning in Belief Networks. In Proceedings of The 31st International Conference on Machine Learning.
- JoAF16: Justin Johnson, Alexandre Alahi, Li Fei-Fei (2016) Perceptual Losses for Real-Time Style Transfer and Super-Resolution. ArXiv:1603.08155 [Cs].
- NePL16: Daniel Neil, Michael Pfeiffer, Shih-Chii Liu (2016) Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences. In Advances in Neural Information Processing Systems 29 (pp. 3882–3890). Curran Associates, Inc.
- OoKK16: Aäron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu (2016) Pixel Recurrent Neural Networks. ArXiv:1601.06759 [Cs].
- KALL17: Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen (2017) Progressive Growing of GANs for Improved Quality, Stability, and Variation. In Proceedings of ICLR.
- HiSa06: Geoffrey E. Hinton, Ruslan R. Salakhutdinov (2006) Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507. DOI
- Cham16: Alex J. Champandard (2016) Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks. ArXiv:1603.01768 [Cs].
- Wald16b: Christian Walder (2016b) Symbolic Music Data Version 10. ArXiv:1606.02542 [Cs].
- JeBV16: Nikolay Jetchev, Urs Bergmann, Roland Vollgraf (2016) Texture Synthesis with Spatial Generative Adversarial Networks. In Advances in Neural Information Processing Systems 29.
- Mita17: Parag K. Mital (2017) Time Domain Neural Audio Style Transfer. ArXiv:1711.11160 [Cs].
- YuVa17: Haizi Yu, Lav R. Varshney (2017) Towards deep interpretability (MUS-ROVER II): learning hierarchical representations of tonal music. In Proceedings of International Conference on Learning Representations (ICLR) 2017.
- LNBB15: Angeliki Lazaridou, Dat Tien Nguyen, Raffaella Bernardi, Marco Baroni (2015) Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation. ArXiv:1506.03500 [Cs].
- WiBi05: John M. Winn, Christopher M. Bishop (2005) Variational message passing. In Journal of Machine Learning Research (pp. 661–694).
- Oord16: Aäron van den Oord (2016) Wavenet: A Generative Model for Raw Audio
- WSHL15: Qi Wu, Chunhua Shen, Anton van den Hengel, Lingqiao Liu, Anthony Dick (2015) What value high level concepts in vision to language problems? ArXiv:1506.01144 [Cs].