
Here’s how I would do art with machine learning if I had to.

I’ve a weakness for ideas that give me plausible deniability for making generative art while doing my maths homework.

So do you.

This page is WAAAAAY more chaotic than the median, sorry. Good luck making sense of it.

See also analysis/resynthesis.

Machine learning generally

See also gesture recognition. See Google’s AMI channel.

Neural networks in particular

Many neural networks, especially the backprop ones, are generative in the sense that even if you train ‘em to classify things, they can also predict new members of the class. E.g. run the model forwards, and it recognizes melodies; run it “backwards”, and it composes melodies. Or rather, you may already have trained them to generate examples in the course of training them.
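To make the “backwards” idea concrete, here is a minimal sketch, assuming a toy logistic-regression “recognizer” with random stand-in weights rather than a trained network. The weights stay fixed, and we do gradient ascent on the input until the classifier is more convinced it is looking at a member of the class. All names (`class_probability`, `dream_input`, `w`, `b`) are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def class_probability(x, w, b):
    """Forward pass: probability that input x belongs to the class."""
    return sigmoid(x @ w + b)

def dream_input(w, b, x0, step=0.1, n_steps=100):
    """'Backward' use: gradient ascent on the input, weights held fixed."""
    x = x0.copy()
    for _ in range(n_steps):
        p = class_probability(x, w, b)
        # For a logistic model, d/dx log p(x) = (1 - p) * w.
        x += step * (1.0 - p) * w
    return x

rng = np.random.default_rng(0)
w = rng.normal(size=8)  # stand-in for trained weights (assumption)
b = 0.0
x0 = np.zeros(8)        # a blank starting "melody"
x_star = dream_input(w, b, x0)
```

Here `class_probability(x_star, w, b)` ends up higher than the 0.5 the blank input scores. With a deep network in place of the one-layer model, and some regularisation on the input, this is roughly the recipe behind the “dreaming” images.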

There are many definitional and practical wrinkles here, and this quality is not unique to artificial neural networks, but it is a great convenience, and the gods of machine learning have blessed us with much infrastructure to exploit this feature, because it is very close to actual profitable algorithms. Upshot: There is now a lot of computation and grad student labour directed at producing neural networks which as a byproduct can produce faces, chairs, film dialogue, symphonies and so on.

Misc

Some as-yet-unfiled neural-artwork links I should think about.

  • iGAN: Interactive Image Generation via Generative Adversarial Networks
  • interpolating style transfer.
  • neurogram is a cute semi-untrained neural network image synthesis-in-the-browser project
  • Adversarial generation is a cool hack if you hate boring stuff like labelling data sets, e.g. chair generation.
  • Autoencoding beyond pixels using a learned similarity metric (LSLW15) (code). The clever hack here is the use of “generative adversarial networks”.

Variational inference (Hint07, WiBi05, Giro01, MnGr14) looks exciting here, particularly in an autoencoder setting (KiWe13).

Text synthesis

Visual synthesis

@bhautikj style transfer “Drumpf”

See those classic images from Google’s tripped-out image recognition systems, or Gatys, Ecker and Bethge’s deep art. Neural networks do a passable undergraduate Monet.

Here’s Frank Liu’s implementation of style transfer in pycaffe.
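The core trick in the Gatys, Ecker and Bethge approach (GaEB15) is to summarise “style” by the correlations between feature channels (a Gram matrix), which throws away spatial layout. A minimal sketch, assuming random arrays stand in for the CNN activations a real implementation would extract, and with a normalisation chosen for convenience rather than copied from the paper. The function and variable names are hypothetical.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of one layer's activations.

    features: array of shape (channels, height, width).
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def style_loss(generated_feats, style_feats):
    """Mean squared difference between the two Gram matrices."""
    g = gram_matrix(generated_feats)
    s = gram_matrix(style_feats)
    return float(np.mean((g - s) ** 2))

rng = np.random.default_rng(0)
style = rng.normal(size=(4, 5, 5))      # stand-in for style-image features
generated = rng.normal(size=(4, 5, 5))  # stand-in for generated-image features

# Scrambling the pixel positions leaves the Gram matrix unchanged,
# which is why it captures texture rather than composition.
perm = rng.permutation(25)
shuffled = style.reshape(4, -1)[:, perm].reshape(4, 5, 5)
```

In a full style-transfer pipeline this loss would be summed over several layers, combined with a content loss, and minimised with respect to the generated image's pixels.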

Alex Graves, Generating Sequences With Recurrent Neural Networks, generates handwriting. Relatedly, sketch-rnn is reaaaally cute.

Deep dreaming approaches are entertaining.

Distill.pub has some lovely visual explanations of visual and other neural networks:

  • Experiments in Handwriting with a Neural Network
  • Deconvolution and Checkerboard Artifacts
  • How to Use t-SNE Effectively
  • Attention and Augmented Recurrent Neural Networks
  • hardmaru presents an amazing introduction to running sophisticated neural networks in the browser, targeted at artists, which goes over the handwriting post in a non-technical way.

Composing music

Seems like it should be easy, until you think about it.

Related: Arpeggiate by numbers.

Google has weighed in like a gorilla on the metallophone to do MIDI composition with TensorFlow as part of their Magenta project. Their NIPS 2016 demo won the best demo prize.

Daniel Johnson has a convolutional and recurrent architecture for taking into account multiple types of dependency in music, which he calls a biaxial neural network. Zhe LI, Composing Music With Recurrent Neural Networks.

Ji-Sung Kim’s deepjazz project is minimal, but does interesting jazz improvisations. Part of the genius here is choosing totally chaotic music to try to ape, so you can ape it chaotically. (Code)

Boulanger-Lewandowski: code and data for BoBV12’s recurrent neural network composition, using Python/Theano. Christian Walder leads a project which shares some roots with that (Wald16a, Wald16b).

A tutorial on generating music using Restricted Boltzmann Machines for the conditional random field density, and an RNN for the time dependence after BoBV12.
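A loose sketch of how the two pieces fit together, in the spirit of the RNN-RBM of BoBV12: at each time step the RNN state shifts the RBM's biases, the RBM is Gibbs-sampled to produce one binary “piano roll” column, and that column updates the RNN state. The weights here are small random numbers, so the output is noise rather than music. The parameter names and sizes are assumptions for illustration, and training is omitted entirely.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_sample(v, W, bv, bh, rng, k=10):
    """k rounds of block Gibbs sampling in a binary RBM."""
    for _ in range(k):
        h = (rng.random(bh.shape) < sigmoid(v @ W + bh)).astype(float)
        v = (rng.random(bv.shape) < sigmoid(h @ W.T + bv)).astype(float)
    return v

def generate(params, n_steps, rng):
    """Sample a (n_steps, n_visible) binary piano roll."""
    W, bv, bh, Wuv, Wuh, Wvu, Wuu, bu = params
    u = np.zeros(bu.shape)  # RNN hidden state
    roll = []
    for _ in range(n_steps):
        # The RNN state sets time-dependent RBM biases.
        bv_t = bv + u @ Wuv
        bh_t = bh + u @ Wuh
        # The RBM models which notes sound together at this step.
        v = gibbs_sample(np.zeros(bv.shape), W, bv_t, bh_t, rng)
        # The sampled notes feed back into the RNN state.
        u = np.tanh(bu + u @ Wuu + v @ Wvu)
        roll.append(v)
    return np.array(roll)

rng = np.random.default_rng(0)
nv, nh, nu = 12, 8, 6  # notes, RBM hidden units, RNN units (arbitrary)
params = (
    rng.normal(scale=0.1, size=(nv, nh)),  # W
    np.zeros(nv),                          # bv
    np.zeros(nh),                          # bh
    rng.normal(scale=0.1, size=(nu, nv)),  # Wuv
    rng.normal(scale=0.1, size=(nu, nh)),  # Wuh
    rng.normal(scale=0.1, size=(nv, nu)),  # Wvu
    rng.normal(scale=0.1, size=(nu, nu)),  # Wuu
    np.zeros(nu),                          # bu
)
roll = generate(params, 16, rng)
```

Each row of `roll` is one time step and each column one note; a trained model would replace the random `params`.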

Bob Sturm did a nice one.

TBD: Google’s latest demo in this area was popular.

Audio synthesis

See also analysis/resynthesis.

Matt Vitelli on music generation from MP3s (source)

Soundtracking audio from video.

Alex Graves on RNN predictive synthesis.

Andy Sarroff, Musical Audio Synthesis Using Autoencoding Neural Nets. (code)

Style transfer for audio is crying out to be done, but I’ve only seen more traditional techniques.

@bhautikj style transfer experiment “Drumpf”

Style transfer will be familiar to anyone who has ever taken hallucinogens or watched movies made by those who have, but you can’t usually put hallucinogens or film nights on the departmental budget so we have to make do with gigantic computing clusters.

Refs

BoBV12
Boulanger-Lewandowski, N., Bengio, Y., & Vincent, P. (2012) Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription. In 29th International Conference on Machine Learning.
BoLe06
Bown, O., & Lexer, S. (2006) Continuous-Time Recurrent Neural Networks for Generative and Interactive Musical Performance. In F. Rothlauf, J. Branke, S. Cagnoni, E. Costa, C. Cotta, R. Drechsler, … H. Takagi (Eds.), Applications of Evolutionary Computing (pp. 652–663). Springer Berlin Heidelberg
Cham16
Champandard, A. J. (2016) Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks. arXiv:1603.01768 [Cs].
DCSF15
Denton, E., Chintala, S., Szlam, A., & Fergus, R. (2015) Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks. arXiv:1506.05751 [Cs].
DiSc14
Dieleman, S., & Schrauwen, B. (2014) End-to-end learning for music audio. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6964–6968). IEEE.
DiSB16
Dinh, L., Sohl-Dickstein, J., & Bengio, S. (2016) Density estimation using Real NVP. arXiv:1605.08803 [Cs, Stat].
DSTB14
Dosovitskiy, A., Springenberg, J. T., Tatarchenko, M., & Brox, T. (2014) Learning to Generate Chairs, Tables and Cars with Convolutional Networks. arXiv:1411.5928 [Cs].
DuSK16
Dumoulin, V., Shlens, J., & Kudlur, M. (2016) A Learned Representation For Artistic Style. arXiv:1610.07629 [Cs].
GaEB15
Gatys, L. A., Ecker, A. S., & Bethge, M. (2015) A Neural Algorithm of Artistic Style. arXiv:1508.06576 [Cs, Q-Bio].
Giro01
Girolami, M. (2001) A Variational Method for Learning Sparse and Overcomplete Representations. Neural Computation, 13(11), 2517–2532.
GPMX14
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., … Bengio, Y. (2014) Generative Adversarial Networks. arXiv:1406.2661 [Cs, Stat].
GoSS14
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014) Explaining and Harnessing Adversarial Examples. arXiv:1412.6572 [Cs, Stat].
GDGR15
Gregor, K., Danihelka, I., Graves, A., Rezende, D. J., & Wierstra, D. (2015) DRAW: A Recurrent Neural Network For Image Generation. arXiv:1502.04623 [Cs].
GrLe10
Gregor, K., & LeCun, Y. (2010) Learning fast approximations of sparse coding. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 399–406).
GrLe11
Gregor, K., & LeCun, Y. (2011) Efficient Learning of Sparse Invariant Representations. arXiv:1105.5307 [Cs].
GSFT12
Grosse, R., Salakhutdinov, R. R., Freeman, W. T., & Tenenbaum, J. B. (2012) Exploiting compositionality to explore a large space of model structures. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.
HaCL06
Hadsell, R., Chopra, S., & LeCun, Y. (2006) Dimensionality Reduction by Learning an Invariant Mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Vol. 2, pp. 1735–1742).
HeWH16
He, K., Wang, Y., & Hopcroft, J. (2016) A Powerful Generative Model Using Random Weights for the Deep Image Representation. arXiv:1606.04801 [Cs].
Hint07
Hinton, G. E. (2007) Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10), 428–434.
HiSa06
Hinton, G. E., & Salakhutdinov, R. R. (2006) Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
JeBV16
Jetchev, N., Bergmann, U., & Vollgraf, R. (2016) Texture Synthesis with Spatial Generative Adversarial Networks. In Advances in Neural Information Processing Systems 29.
JoAF16
Johnson, J., Alahi, A., & Fei-Fei, L. (2016) Perceptual Losses for Real-Time Style Transfer and Super-Resolution. arXiv:1603.08155 [Cs].
KaDG15
Kalchbrenner, N., Danihelka, I., & Graves, A. (2015) Grid Long Short-Term Memory. arXiv:1507.01526 [Cs].
KiSW16
Kingma, D. P., Salimans, T., & Welling, M. (2016) Improving Variational Inference with Inverse Autoregressive Flow. arXiv:1606.04934 [Cs, Stat].
KiWe13
Kingma, D. P., & Welling, M. (2013) Auto-Encoding Variational Bayes. arXiv:1312.6114 [Cs, Stat].
LSLW15
Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2015) Autoencoding beyond pixels using a learned similarity metric. arXiv:1512.09300 [Cs, Stat].
LNBB15
Lazaridou, A., Nguyen, D. T., Bernardi, R., & Baroni, M. (2015) Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation. arXiv:1506.03500 [Cs].
LWLH17
Li, Y., Wang, N., Liu, J., & Hou, X. (2017) Demystifying Neural Style Transfer. arXiv:1701.01036 [Cs].
LCHR16
Luo, Y., Chen, Z., Hershey, J. R., Roux, J. L., & Mesgarani, N. (2016) Deep Clustering and Conventional Networks for Music Separation: Stronger Together. arXiv:1611.06265 [Cs, Stat].
MTTR16
Malmi, E., Takala, P., Toivonen, H., Raiko, T., & Gionis, A. (2016) DopeLearning: A Computational Approach to Rap Lyrics Generation. arXiv:1505.04771 [Cs], 195–204.
MnGr14
Mnih, A., & Gregor, K. (2014) Neural Variational Inference and Learning in Belief Networks. In Proceedings of The 31st International Conference on Machine Learning.
NePL16
Neil, D., Pfeiffer, M., & Liu, S.-C. (2016) Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 29 (pp. 3882–3890). Curran Associates, Inc.
OIMT15
Owens, A., Isola, P., McDermott, J., Torralba, A., Adelson, E. H., & Freeman, W. T. (2015) Visually Indicated Sounds. arXiv:1512.08512 [Cs].
SaCa14
Sarroff, A. M., & Casey, M. (2014) Musical audio synthesis using autoencoding neural nets. Ann Arbor, MI: Michigan Publishing, University of Michigan Library.
SBBW15
Sigtia, S., Benetos, E., Boulanger-Lewandowski, N., Weyde, T., Garcez, A. S. d’Avila, & Dixon, S. (2015) A hybrid recurrent neural network for music transcription. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2061–2065). IEEE
SmLe06
Smith, E. C., & Lewicki, M. S. (2006) Efficient auditory coding. Nature, 439(7079), 978–982.
SLZC16
Sun, Z., Liu, J., Zhang, Z., Chen, J., Huo, Z., Lee, C. H., & Zhang, X. (2016) Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding. arXiv:1611.05416 [Cs].
ThBe15
Theis, L., & Bethge, M. (2015) Generative Image Modeling Using Spatial LSTMs. arXiv:1506.03478 [Cs, Stat].
UlVL16
Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2016) Instance Normalization: The Missing Ingredient for Fast Stylization. arXiv:1607.08022 [Cs].
UlVL17a
Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2017a) Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis. arXiv:1701.02096 [Cs].
Oord16
van den Oord, A. (2016) Wavenet: A Generative Model for Raw Audio.
OoKK16
van den Oord, A., Kalchbrenner, N., & Kavukcuoglu, K. (2016) Pixel Recurrent Neural Networks. arXiv:1601.06759 [Cs].
OKVE16
van den Oord, A., Kalchbrenner, N., Vinyals, O., Espeholt, L., Graves, A., & Kavukcuoglu, K. (2016) Conditional Image Generation with PixelCNN Decoders. arXiv:1606.05328 [Cs].
Wald16a
Walder, C. (2016a) Modelling Symbolic Music: Beyond the Piano Roll. arXiv:1606.01368 [Cs].
Wald16b
Walder, C. (2016b) Symbolic Music Data Version 10. arXiv:1606.02542 [Cs].
WiBi05
Winn, J. M., & Bishop, C. M. (2005) Variational message passing. Journal of Machine Learning Research, 6, 661–694.
WSHL15
Wu, Q., Shen, C., Hengel, A. van den, Liu, L., & Dick, A. (2015) What value high level concepts in vision to language problems? arXiv:1506.01144 [Cs].
YuDe11
Yu, D., & Deng, L. (2011) Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP]. IEEE Signal Processing Magazine, 28(1), 145–154.
ZKSE16
Zhu, J.-Y., Krähenbühl, P., Shechtman, E., & Efros, A. A. (2016) Generative Visual Manipulation on the Natural Image Manifold. arXiv:1609.03552 [Cs].