A C++/Python neural network toolkit by Google. I am using it for solving general machine-learning problems, and frequently enough that I need notes.
The construction of graphs is more explicit than in Theano, so I find it easier to understand, although this means that you lose the near-Python syntax of Theano.
Tensorflow also claims to compile for smartphones etc., although that looks buggy at the moment.
- Keras supports tensorflow and Theano as a backend, for comfort and convenience. See below for some notes.
- tensorflowslim eases some boring bits.
- tflearn wraps the tensorflow machine in scikit-learn (although the implementation is not very enlightening, nor the syntax especially clear).
Getting data in
This is a depressingly complex topic; it is likely to be more lines of code than building your actual learning algorithm.
For example, things break differently if
- you are inputting data of variable dimensions via python (which requires a “feed”, which requires keeping references to a placeholder Op around, and ALWAYS resubmitting the data every time you run an Op that depends on that placeholder, however indirectly), or
- you are inputting a Variable via C++ (Variables may also be fed, just to mess with you, and they claim to support variable dimensions too, but that never works for me).
These interact in various different ways that seem irritating, but are probably to do with enabling very large scale data reading workflows, so that you might accidentally solve a problem for Google and they can get your solution for cheap.
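As a minimal sketch of the python “feed” route, assuming TensorFlow 2.x with the graph-mode API under `tf.compat.v1` (the API that placeholders and feeds belong to): the names `x` and `row_sums` are illustrative only. The batch dimension is left as `None` so that differently sized arrays can be fed on each run.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode (TensorFlow 1.x style) API

graph = tf.Graph()
with graph.as_default():
    # `None` leaves the first dimension unspecified, so it may differ per run.
    x = tf1.placeholder(tf.float32, shape=[None, 3], name="x")
    row_sums = tf.reduce_sum(x, axis=1)

with tf1.Session(graph=graph) as sess:
    # Keep the reference to the placeholder `x`: every run of an op that
    # depends on it needs the data supplied again through `feed_dict`.
    small = sess.run(row_sums, feed_dict={x: np.ones((2, 3), dtype=np.float32)})
    large = sess.run(row_sums, feed_dict={x: np.ones((5, 3), dtype=np.float32)})

print(small.shape, large.shape)
```

Running `row_sums` without a `feed_dict` entry for `x` raises an error, which is the "keep the placeholder around" burden described above.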
My experience is that this stuff is so horribly messy that you should just build different graphs for the estimation and deployment phases of your model and implement each according to convenience.
I’m not yet sure how to easily transmit the estimated parameters between the graphs in these two separate phases… I’ll make notes about THAT when I come to it.
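One candidate, sketched here rather than battle-tested in these notes, is to go through a checkpoint file: save the variables from the estimation graph with `tf.compat.v1.train.Saver` and restore them by name in the deployment graph. This assumes TensorFlow 2.x with the `tf.compat.v1` graph-mode API; the helper `build_weights`, the variable name `"w"`, and the hard-coded weights standing in for training are all hypothetical.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode (TensorFlow 1.x style) API


def build_weights():
    # The shared variable name "w" is what lets the checkpoint match up
    # between the two graphs.
    return tf1.get_variable("w", shape=[3, 1], initializer=tf1.zeros_initializer())


ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# Estimation graph: no placeholder needed here, only the parameters.
train_graph = tf.Graph()
with train_graph.as_default():
    w = build_weights()
    set_w = w.assign([[1.0], [2.0], [3.0]])  # stand-in for a real training step
    saver = tf1.train.Saver()
with tf1.Session(graph=train_graph) as sess:
    sess.run(tf1.global_variables_initializer())
    sess.run(set_w)
    saver.save(sess, ckpt_path)

# Deployment graph: a different input pipeline, same variable name.
deploy_graph = tf.Graph()
with deploy_graph.as_default():
    x = tf1.placeholder(tf.float32, shape=[None, 3])
    w = build_weights()
    y = tf.matmul(x, w)
    saver = tf1.train.Saver()
with tf1.Session(graph=deploy_graph) as sess:
    saver.restore(sess, ckpt_path)  # replaces initialisation of `w`
    prediction = sess.run(y, feed_dict={x: np.ones((1, 3), dtype=np.float32)})

print(prediction)
```

The design choice is that only variable names and shapes have to agree; the input side of each graph can be whatever is convenient for that phase.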
The documentation for these is abysmal.
To write: How to create standard linear filters in Tensorflow.
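Until that is written, here is a rough sketch of one reading of "standard linear filter": a finite impulse response (FIR) filter expressed as a 1-D convolution. It assumes TensorFlow 2.x in its default eager mode; the function name `fir_filter` and the example taps are made up for illustration. Note that `tf.nn.conv1d` computes a cross-correlation, so the taps are reversed to get a true convolution.

```python
import numpy as np
import tensorflow as tf


def fir_filter(signal, taps):
    """Convolve a 1-D signal with FIR taps, keeping fully overlapping outputs."""
    signal = np.asarray(signal, dtype=np.float32)
    taps = np.asarray(taps, dtype=np.float32)
    # conv1d expects input as [batch, width, channels]
    # and filters as [width, in_channels, out_channels].
    x = tf.constant(signal[None, :, None])
    # Reverse the taps: conv1d slides the kernel without flipping it.
    h = tf.constant(taps[::-1].copy()[:, None, None])
    y = tf.nn.conv1d(x, h, stride=1, padding="VALID")
    return y[0, :, 0]


signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
taps = np.array([0.5, 0.3, 0.2])

filtered = fir_filter(signal, taps).numpy()
reference = np.convolve(signal, taps, mode="valid")  # NumPy cross-check
print(filtered, reference)
```

Because the taps live in an ordinary tensor, they could equally be a trainable variable, which is presumably the point of doing this in Tensorflow at all.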
The Tensorflow RNN documentation, as bad as it is, is not even easy to find, being scattered across several non-obvious locations without consistent crosslinks.
To make it actually make sense without unwarranted time wasting and guessing, you will then need to read other stuff:
- seq2seq models with GRUs: Fun with Recurrent Neural Nets.
- Variable sequence length HOWTO.
- Where do the RNN weights come from? Magic.
- Stateful LSTM in Keras.
- Ben Bolte: Deep Language Modeling for Question Answering using Keras
Denny Britz’s blog posts
- RNNs in Tensorflow, a practical guide and undocumented features.
- He also gives a good explanation of vanishing gradients.
You probably want to start here unless your needs are extraordinarily esoteric, since it removes a lot of boilerplate, and makes even writing new boilerplate easier.
Getting models out
- For a local app: Hamed MP, Exporting trained TensorFlow models to C++ the RIGHT way!
- For serving it online, Tensorflow serving is the preferred means. See the Serving documentation.
Doing it in the cloud because you don’t have NVIDIA sponsorship
See practical cloud computing, which has a couple of sections on that.