
Here’s how I would do art with machine learning if I had to

Usefulness: 🔧
Novelty: 💡
Uncertainty: 🤪 🤪 🤪
Incompleteness: 🚧 🚧 🚧

I’ve a weakness for ideas that give me plausible deniability for making generative art while doing my maths homework.

Quasimondo: so do you

This page is more chaotic than the already-chaotic median, sorry. Good luck making sense of it.

See also analysis/resynthesis.

See gesture recognition. Oh and also Google’s AMI channel, and ml4artists, which has some sweet machine-learning-for-artists topic guides.

Many neural networks are generative in the sense that even if you train ’em to classify things, they can also predict new members of the class: run the model forwards and it recognizes melodies; run it “backwards” and it composes melodies. Or rather, you may have trained them to generate examples in the course of training them to detect examples.

There are many definitional and practical wrinkles here, and this quality is not unique to artificial neural networks, but it is a great convenience, and the gods of machine learning have blessed us with much infrastructure to exploit this feature, because it is very close to actually profitable algorithms. Upshot: there is now a lot of computation and grad-student labour directed at producing neural networks which, as a byproduct, can produce faces, chairs, film dialogue, symphonies and so on.
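To make the “run it backwards” trick concrete, here is a toy sketch using nothing but NumPy and logistic regression rather than an actual neural network; all the numbers are made up for illustration. Train the classifier forwards, then do gradient ascent on the *input* to conjure up a fresh example of the class:

```python
import numpy as np

rng = np.random.default_rng(42)

# Forwards: train a tiny logistic-regression classifier on two Gaussian blobs.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.repeat([0.0, 1.0], 100)
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * (p - y).mean()

# "Backwards": gradient ascent on the input to maximise the probability of
# class 1, starting from noise. For log p(class 1 | x) the gradient with
# respect to x is (1 - p) * w.
x = rng.normal(0, 0.1, 2)
for _ in range(200):
    p = 1 / (1 + np.exp(-(x @ w + b)))
    x += 0.5 * (1 - p) * w
print(x)  # x has drifted into the class-1 blob: a freshly "composed" example
```

Swap logistic regression for a deep net and the 2-vector for an image and you have the germ of activation maximisation.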

There are NIPS streams about this now.

Misc

Some as-yet-unfiled neural-artwork links I should think about.

Variational inference (Hint07, WiBi05, Giro01, MnGr14) looks exciting here, particularly in an autoencoder setting (KiWe13).
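For reference, the variational autoencoder objective of KiWe13 maximises the evidence lower bound (ELBO) on the data likelihood:

$$
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right] \;-\; \operatorname{KL}\!\left(q_\phi(z\mid x)\,\Vert\,p(z)\right)
$$

The generative payoff is the decoder: once trained, sample \(z \sim p(z)\) and push it through \(p_\theta(x\mid z)\) to get new faces, chairs, melodies.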

Text synthesis

Visual synthesis

See those classic images from Google’s tripped-out image recognition systems, or Gatys, Ecker and Bethge’s deep art: neural networks do a passable undergraduate Monet.

Here’s Frank Liu’s implementation of style transfer in pycaffe.
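The crux of the Gatys et al. method is matching Gram matrices of convnet feature maps. A minimal sketch of that loss; the random arrays here are stand-ins for the VGG activations of the style image and of the image being optimised:

```python
import numpy as np

def gram(F):
    """Gram matrix of a feature map F of shape (channels, height * width):
    channel-by-channel correlations, i.e. texture statistics with the
    spatial layout averaged away."""
    return F @ F.T / F.shape[1]

# Stand-ins for convnet activations at some layer.
rng = np.random.default_rng(0)
F_style = rng.normal(size=(64, 32 * 32))
F_generated = rng.normal(size=(64, 32 * 32))

# Style loss: match the correlations, not the activations themselves. The
# full method adds a content loss on raw activations at a deeper layer and
# minimises a weighted sum of the two by gradient descent on the image.
style_loss = np.sum((gram(F_generated) - gram(F_style)) ** 2) / (4 * 64 ** 2)
```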

Alex Graves, Generating Sequences With Recurrent Neural Networks, generates handwriting. Relatedly, sketch-rnn is reaaaally cute.
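The generation loop in Graves-style sequence models is simple: sample from the network’s predictive distribution, then feed the sample back in as the next input. A stand-in sketch with untrained random weights (a real model would have learned them, and Graves’s handwriting model emits mixture-density parameters rather than a softmax):

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, n_hidden = 27, 32  # say, a-z plus space

# Untrained stand-in parameters for a vanilla RNN.
Wxh = rng.normal(0, 0.1, (vocab, n_hidden))
Whh = rng.normal(0, 0.1, (n_hidden, n_hidden))
Why = rng.normal(0, 0.1, (n_hidden, vocab))

def sample_sequence(length=40, temperature=1.0):
    h, x, out = np.zeros(n_hidden), 0, []
    for _ in range(length):
        h = np.tanh(Wxh[x] + Whh @ h)         # advance the RNN state
        logits = h @ Why / temperature        # lower temperature = tamer output
        p = np.exp(logits - logits.max())
        x = rng.choice(vocab, p=p / p.sum())  # sample, then feed it back in
        out.append(x)
    return out
```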

Deep dreaming approaches are entertaining. (NSFW) Here’s a more pedestrian and slightly more informative version of that.
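The deep-dreaming recipe itself is just gradient ascent on an image, exaggerating whatever some layer already responds to. A minimal PyTorch sketch; the random conv stack is a stand-in for the trained Inception/VGG features you would actually dream through:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Stand-in feature extractor; real deep dreaming uses a *trained* network.
features = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
)

img = torch.rand(1, 3, 64, 64, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    # Ascend on the activation norm: amplify whatever this layer sees in img.
    loss = -features(img).norm()
    loss.backward()
    opt.step()
# (The real thing also works over multiple scales and jitters the image.)
```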

Distill.pub has some lovely visual explanations of visual and other neural networks.

Music

Symbolic composition via scores/MIDI/etc

Seems like it should be easy, until you think about it.

Related: Arpeggiate by numbers, which discusses music theory.

Google has weighed in, like a gorilla on the metallophone, to do MIDI composition with TensorFlow as part of their Magenta project. Their NIPS 2016 demo won the best demo prize.

Daniel Johnson has a convolutional and recurrent architecture for taking into account multiple types of dependency in music, which he calls a biaxial neural network. See also Zhe Li, Composing Music With Recurrent Neural Networks.

Ji-Sung Kim’s deepjazz project is minimal, but does interesting jazz improvisations. Part of the genius here is choosing totally chaotic music to try to ape, so you can ape it chaotically. (Code)

Boulanger-Lewandowski (code and data) did recurrent neural network composition using Python/Theano (BoBV12). Christian Walder leads a project which shares some roots with that (Wald16a, Wald16b). Bob Sturm’s FolkRNN does a related thing, but ingeniously redefines the problem by focussing on folk tune notation.

A tutorial on generating music after BoBV12, using a Restricted Boltzmann Machine for the conditional density of the notes at each step and an RNN for the time dependence.
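The sampling structure of that RNN-RBM is easy to sketch: at each time step the RNN state sets the biases of an RBM, block Gibbs sampling in that conditional RBM produces the next piano-roll frame, and the frame drives the RNN forward. Untrained random weights below, purely to show the wiring; BoBV12 learn all of these jointly:

```python
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h, n_u, n_steps = 88, 32, 64, 16  # notes, RBM hidden, RNN hidden, frames

# Untrained stand-in parameters (learned jointly in BoBV12).
W   = rng.normal(0, 0.1, (n_v, n_h))   # RBM weights
Wuh = rng.normal(0, 0.1, (n_u, n_h))   # RNN state -> RBM hidden bias
Wuv = rng.normal(0, 0.1, (n_u, n_v))   # RNN state -> RBM visible bias
Wvu = rng.normal(0, 0.1, (n_v, n_u))   # frame -> RNN
Wuu = rng.normal(0, 0.1, (n_u, n_u))   # RNN recurrence

sigmoid = lambda x: 1 / (1 + np.exp(-x))

u, v, roll = np.zeros(n_u), np.zeros(n_v), []
for t in range(n_steps):
    bh, bv = u @ Wuh, u @ Wuv            # this step's conditional RBM
    for _ in range(25):                  # block Gibbs sampling
        h = (rng.random(n_h) < sigmoid(v @ W + bh)).astype(float)
        v = (rng.random(n_v) < sigmoid(W @ h + bv)).astype(float)
    roll.append(v.copy())                # one binary piano-roll frame
    u = np.tanh(v @ Wvu + Wuu @ u)       # advance the RNN on what we sampled
```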

Bob Sturm wrote a good tutorial on this too.

🚧 Google’s latest demo in this area was popular. DeepBach (paper HaPa16, code) seems to be doing a related thing. Similar sets of authors (HaSP16) have some other related work:

Modeling polyphonic music is a particularly challenging task because of the intricate interplay between melody and harmony. A good model should satisfy three requirements: statistical accuracy (capturing faithfully the statistics of correlations at various ranges, horizontally and vertically), flexibility (coping with arbitrary user constraints), and generalization capacity (inventing new material, while staying in the style of the training corpus). Models proposed so far fail on at least one of these requirements. We propose a statistical model of polyphonic music, based on the maximum entropy principle. This model is able to learn and reproduce pairwise statistics between neighboring note events in a given corpus. The model is also able to invent new chords and to harmonize unknown melodies. We evaluate the invention capacity of the model by assessing the amount of cited, re-discovered, and invented chords on a corpus of Bach chorales. We discuss how the model enables the user to specify and enforce user-defined constraints, which makes it useful for style-based, interactive music generation.

Audio synthesis

See analysis/resynthesis, voice face.

Refs

Bordenave, Charles, and Djalil Chafaï. 2012. “Around the Circular Law.” Probability Surveys 9 (0): 1–89. https://doi.org/10.1214/11-PS183.

Boulanger-Lewandowski, Nicolas, Yoshua Bengio, and Pascal Vincent. 2012. “Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription.” In 29th International Conference on Machine Learning. http://arxiv.org/abs/1206.6392.

Bown, Oliver, and Sebastian Lexer. 2006. “Continuous-Time Recurrent Neural Networks for Generative and Interactive Musical Performance.” In Applications of Evolutionary Computing, edited by Franz Rothlauf, Jürgen Branke, Stefano Cagnoni, Ernesto Costa, Carlos Cotta, Rolf Drechsler, Evelyne Lutton, et al., 652–63. Lecture Notes in Computer Science 3907. Springer Berlin Heidelberg. http://link.springer.com/chapter/10.1007/11732242_62.

Champandard, Alex J. 2016. “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks,” March. http://arxiv.org/abs/1603.01768.

Denton, Emily, Soumith Chintala, Arthur Szlam, and Rob Fergus. 2015. “Deep Generative Image Models Using a Laplacian Pyramid of Adversarial Networks,” June. http://arxiv.org/abs/1506.05751.

Dieleman, Sander, and Benjamin Schrauwen. 2014. “End to End Learning for Music Audio.” In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6964–8. IEEE. https://doi.org/10.1109/ICASSP.2014.6854950.

Dinh, Laurent, Jascha Sohl-Dickstein, and Samy Bengio. 2016. “Density Estimation Using Real NVP.” http://arxiv.org/abs/1605.08803.

Dosovitskiy, Alexey, Jost Tobias Springenberg, Maxim Tatarchenko, and Thomas Brox. 2014. “Learning to Generate Chairs, Tables and Cars with Convolutional Networks,” November. http://arxiv.org/abs/1411.5928.

Dumoulin, Vincent, Jonathon Shlens, and Manjunath Kudlur. 2016. “A Learned Representation for Artistic Style,” October. http://arxiv.org/abs/1610.07629.

Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. 2015. “A Neural Algorithm of Artistic Style,” August. http://arxiv.org/abs/1508.06576.

Girolami, Mark. 2001. “A Variational Method for Learning Sparse and Overcomplete Representations.” Neural Computation 13 (11): 2517–32. https://doi.org/10.1162/089976601753196003.

Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. 2014. “Explaining and Harnessing Adversarial Examples,” December. http://arxiv.org/abs/1412.6572.

Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. “Generative Adversarial Nets.” In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–80. NIPS’14. Cambridge, MA, USA: Curran Associates, Inc. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.

Gregor, Karol, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. 2015. “DRAW: A Recurrent Neural Network for Image Generation,” February. http://arxiv.org/abs/1502.04623.

Gregor, Karol, and Yann LeCun. 2010. “Learning Fast Approximations of Sparse Coding.” In Proceedings of the 27th International Conference on Machine Learning (ICML-10), 399–406. http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_GregorL10.pdf.

———. 2011. “Efficient Learning of Sparse Invariant Representations,” May. http://arxiv.org/abs/1105.5307.

Grosse, Roger, Ruslan R. Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum. 2012. “Exploiting Compositionality to Explore a Large Space of Model Structures.” In Proceedings of the Conference on Uncertainty in Artificial Intelligence. http://arxiv.org/abs/1210.4856.

Hadsell, R., S. Chopra, and Y. LeCun. 2006. “Dimensionality Reduction by Learning an Invariant Mapping.” In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2:1735–42. https://doi.org/10.1109/CVPR.2006.100.

He, Kun, Yan Wang, and John Hopcroft. 2016. “A Powerful Generative Model Using Random Weights for the Deep Image Representation.” In Advances in Neural Information Processing Systems. http://arxiv.org/abs/1606.04801.

Hinton, Geoffrey E. 2007. “Learning Multiple Layers of Representation.” Trends in Cognitive Sciences 11 (10): 428–34. https://doi.org/10.1016/j.tics.2007.09.004.

Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. 2006. “Reducing the Dimensionality of Data with Neural Networks.” Science 313 (5786): 504–7. https://doi.org/10.1126/science.1127647.

Jetchev, Nikolay, Urs Bergmann, and Roland Vollgraf. 2016. “Texture Synthesis with Spatial Generative Adversarial Networks.” In Advances in Neural Information Processing Systems 29. http://arxiv.org/abs/1611.08207.

Jing, Yongcheng, Yezhou Yang, Zunlei Feng, Jingwen Ye, and Mingli Song. 2017. “Neural Style Transfer: A Review,” May. http://arxiv.org/abs/1705.04058.

Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. 2016. “Perceptual Losses for Real-Time Style Transfer and Super-Resolution,” March. http://arxiv.org/abs/1603.08155.

Karras, Tero, Timo Aila, Samuli Laine, and Jaakko Lehtinen. 2017. “Progressive Growing of GANs for Improved Quality, Stability, and Variation.” In Proceedings of ICLR. http://arxiv.org/abs/1710.10196.

Karras, Tero, Samuli Laine, and Timo Aila. 2018. “A Style-Based Generator Architecture for Generative Adversarial Networks,” December. http://arxiv.org/abs/1812.04948.

Kingma, Diederik P., Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. 2016. “Improving Variational Inference with Inverse Autoregressive Flow.” In Advances in Neural Information Processing Systems 29. Curran Associates, Inc. http://arxiv.org/abs/1606.04934.

Kingma, Diederik P., and Max Welling. 2014. “Auto-Encoding Variational Bayes.” In ICLR 2014 Conference. http://arxiv.org/abs/1312.6114.

Larsen, Anders Boesen Lindbo, Søren Kaae Sønderby, Hugo Larochelle, and Ole Winther. 2015. “Autoencoding Beyond Pixels Using a Learned Similarity Metric,” December. http://arxiv.org/abs/1512.09300.

Lazaridou, Angeliki, Dat Tien Nguyen, Raffaella Bernardi, and Marco Baroni. 2015. “Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation,” June. http://arxiv.org/abs/1506.03500.

Li, Yanghao, Naiyan Wang, Jiaying Liu, and Xiaodi Hou. 2017. “Demystifying Neural Style Transfer.” In IJCAI. http://arxiv.org/abs/1701.01036.

Luo, Yi, Zhuo Chen, John R. Hershey, Jonathan Le Roux, and Nima Mesgarani. 2016. “Deep Clustering and Conventional Networks for Music Separation: Stronger Together,” November. http://arxiv.org/abs/1611.06265.

Malmi, Eric, Pyry Takala, Hannu Toivonen, Tapani Raiko, and Aristides Gionis. 2016. “DopeLearning: A Computational Approach to Rap Lyrics Generation,” 195–204. https://doi.org/10.1145/2939672.2939679.

Mital, Parag K. 2017. “Time Domain Neural Audio Style Transfer,” November. http://arxiv.org/abs/1711.11160.

Mnih, Andriy, and Karol Gregor. 2014. “Neural Variational Inference and Learning in Belief Networks.” In Proceedings of the 31st International Conference on Machine Learning. http://www.jmlr.org/proceedings/papers/v32/mnih14.html.

Neil, Daniel, Michael Pfeiffer, and Shih-Chii Liu. 2016. “Phased LSTM: Accelerating Recurrent Network Training for Long or Event-Based Sequences.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 3882–90. Curran Associates, Inc. http://papers.nips.cc/paper/6310-phased-lstm-accelerating-recurrent-network-training-for-long-or-event-based-sequences.pdf.

Olah, Chris, Alexander Mordvintsev, and Ludwig Schubert. 2017. “Feature Visualization.” Distill 2 (11): e7. https://doi.org/10.23915/distill.00007.

Oord, Aäron van den. 2016. “WaveNet: A Generative Model for Raw Audio.” http://arxiv.org/abs/1609.03499.

Oord, Aäron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016. “Pixel Recurrent Neural Networks,” January. http://arxiv.org/abs/1601.06759.

Oord, Aäron van den, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. 2016. “Conditional Image Generation with PixelCNN Decoders,” June. http://arxiv.org/abs/1606.05328.

Sarroff, Andy M., and Michael Casey. 2014. “Musical Audio Synthesis Using Autoencoding Neural Nets.” In Proceedings of the Joint ICMC|SMC 2014 Conference. Ann Arbor, MI: Michigan Publishing, University of Michigan Library. http://www.smc-conference.org/smc-icmc-2014/papers/images/VOL_2/1411.pdf.

Sigtia, Siddharth, Emmanouil Benetos, Nicolas Boulanger-Lewandowski, Tillman Weyde, Artur S. d’Avila Garcez, and Simon Dixon. 2015. “A Hybrid Recurrent Neural Network for Music Transcription.” In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2061–5. IEEE. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=7178333.

Smith, Evan C., and Michael S. Lewicki. 2006. “Efficient Auditory Coding.” Nature 439 (7079): 978–82. https://doi.org/10.1038/nature04485.

Sturm, Bob L., Oded Ben-Tal, Úna Monaghan, Nick Collins, Dorien Herremans, Elaine Chew, Gaëtan Hadjeres, Emmanuel Deruty, and François Pachet. 2018. “Machine Learning Research That Matters for Music Creation: A Case Study.” Journal of New Music Research 0 (0): 1–20. https://doi.org/10.1080/09298215.2018.1515233.

Sun, Zheng, Jiaqi Liu, Zewang Zhang, Jingwen Chen, Zhao Huo, Ching Hua Lee, and Xiao Zhang. 2016. “Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding,” November. http://arxiv.org/abs/1611.05416.

Theis, Lucas, and Matthias Bethge. 2015. “Generative Image Modeling Using Spatial LSTMs,” June. http://arxiv.org/abs/1506.03478.

Ulyanov, Dmitry, Andrea Vedaldi, and Victor Lempitsky. 2016. “Instance Normalization: The Missing Ingredient for Fast Stylization,” July. http://arxiv.org/abs/1607.08022.

———. 2017. “Improved Texture Networks: Maximizing Quality and Diversity in Feed-Forward Stylization and Texture Synthesis,” January. http://arxiv.org/abs/1701.02096.

Walder, Christian. 2016a. “Modelling Symbolic Music: Beyond the Piano Roll,” June. http://arxiv.org/abs/1606.01368.

———. 2016b. “Symbolic Music Data Version 1.0,” June. http://arxiv.org/abs/1606.02542.

Winn, John M., and Christopher M. Bishop. 2005. “Variational Message Passing.” In Journal of Machine Learning Research, 661–94. http://johnwinn.org/Publications/papers/VMP2005.pdf.

Wu, Qi, Chunhua Shen, Anton van den Hengel, Lingqiao Liu, and Anthony Dick. 2015. “What Value High Level Concepts in Vision to Language Problems?” June. http://arxiv.org/abs/1506.01144.

Wyse, L. 2017. “Audio Spectrogram Representations for Processing with Convolutional Neural Networks.” In Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May 2017. http://arxiv.org/abs/1706.09559.

Yu, D., and L. Deng. 2011. “Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP].” IEEE Signal Processing Magazine 28 (1): 145–54. https://doi.org/10.1109/MSP.2010.939038.

Yu, Haizi, and Lav R. Varshney. 2017. “Towards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music.” In Proceedings of International Conference on Learning Representations (ICLR) 2017.

Zhu, Jun-Yan, Philipp Krähenbühl, Eli Shechtman, and Alexei A. Efros. 2016. “Generative Visual Manipulation on the Natural Image Manifold.” In Proceedings of European Conference on Computer Vision. http://arxiv.org/abs/1609.03552.

Zukowski, Zack, and Cj Carr. 2017. “Generating Black Metal and Math Rock: Beyond Bach, Beethoven, and Beatles.” In 31st Conference on Neural Information Processing Systems (NIPS 2017).