The Living Thing / Notebooks :

Putting tensorflow on my laptop

How is deep learning awful this time?

I want to do machine learning without the cloud, which as we learn previously, is awful.

But also I’m a vagabond with nowhere safe and high-bandwidth to store a giant GPU machine (campus IT don’t return my calls about it; I think they think I’m taking the piss.)

So, let’s buy a Razer Blade 2016, a nice portable, surprisingly cheap laptop with all the latest feature and a comparable performance to the kind of single-GPU desktop machine I could afford.

I don’t want to do anything fancy here, just process a few gigabytes of MP3 data. My data is stored in the AARNET owncloud server. It’s all quite simple, but the algorithm is just too slow without a GPU and I don’t have a GPU machine I can leave running. I’ve developed it in keras v1.2.2, which depends on tensorflow 1.0.

So installing CUDA etc on the laptop is straightforward. Making it run is not.

Bumblebee

Switching the GPU on or off at the right times.

NVIDIA’s howto points out that you will need to care about Bumblebee to survive Ubuntu.

Confusing, because most sources seem to think you want to have NVIDIA graphics; but what if you merely want NVIDIA computing via CUDA, and don’t care about graphics?

In stark contrast to NVIDIA, they claim:

There is sometimes confusion about CUDA. You don’t need Bumblebee to run CUDA. Follow the How-to to get CUDA working under Ubuntu.

Lies! without Bumblebee you have no control which processes are using the GPU and they collide and explode. Specifically, Tensorflow wants the whole GPU.

Their second point is interesting though

There is however a new feature (--no-xorg option for optirun) in Bumblebee 3.2, which makes it possible to run CUDA / OpenCL applications that does not need the graphics rendering capabilities.

Like Tensorflow.

Bumblebee is its own small world. It has a debugging page, which you will need.

There is are various walkthroughs. What worked for me was a combination of all of them. There is also a distinction between optimus and primus, which I am doing my utmost to know nothing about. Try both, maybe one will work.

  1. older walkthrough for a Razer Blade.
  2. Daniel Teichmann’s Bumblebee instructions are fresher but more generic hardware and optimus instead of primus
  3. Webupd8 bumblebee howto is old and non-specific but authoritative
sudo apt-get install linux-source-4.10.0 build-essential  linux-headers-generic-hwe-16.04
sudo apt install bumblebee bumblebee-nvidia primus

Note that Teichmann doesn’t give a kernel version number, but I do because kernel version 4.4 is broken for Razer laptops, causing infinite loops in the power management subsystem if ever you close the laptop lid. So, 4.10 works, let’s use it. 4.8 might work, I can’t rememeber.

sudo update-alternatives --config i386-linux-gnu_gl_conf
sudo update-alternatives --config x86_64-linux-gnu_egl_conf
sudo update-alternatives --config x86_64-linux-gnu_gl_conf
sudo prime-select intel

sudo atom /etc/bumblebee/bumblebee.conf
sudo atom /etc/modprobe.d/bumblebee.conf
sudo atom /usr/lib/x86_64-linux-gnu/mesa/ld.so.conf
sudo atom /usr/lib/i386-linux-gnu/mesa/ld.so.conf
sudo atom /etc/bumblebee/xorg.conf.nvidia

test using virtualgl.

sudo modprobe nvidia-uvm nvidia
primusrun python -i jobs/spectrogram_normed.py
primusrun nvidia-smi
CUDA_VISIBLE_DEVICES= primusrun python jobs/spectrogram_normed.py

Turn Off Discrete nVidia Optimus Graphics Card in Ubuntu

Monitoring NVIDIA power without using the NVIDIA

Boot without graphics in Ubuntu.

https://dip4fish.blogspot.com/2016/03/using-tensorflow-07-from-python-on.html

Fixing kernel version because the NVIDIA drivers don’t always seem to exist

Fix grub default to boot the good kernel

Tensorflow compilation

following the usual tensorflow instructions, via bazel installation:

Using python library path: /home/dan/.virtualenvs/keras2/lib/python3.5/site-packages
Do you wish to build TensorFlow with MKL support? [y/N]
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N]
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] y
XLA JIT support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N]
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N]
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N]
nvcc will be used as CUDA compiler
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-8.0
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the cuDNN version you want to use. [Leave empty to use system default]: 6.0.21
Please specify the location where cuDNN 6.0.21 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-8.0]: /usr/include/x86_64-linux-gnu
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 6.1
Do you wish to build TensorFlow with MPI support? [y/N]
MPI support will not be enabled for TensorFlow
........
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.

Configuration finished

Other comforts

Install atom

Keyboard lights

Install the dorky keyboard drivers and the dorky GUI.

sudo add-apt-repository ppa:openrazer/stable
sudo add-apt-repository ppa:lah7/polychromatic
sudo apt update
sudo apt install polychromatic
sudo gpasswd -a plugdev user

You can get into various kernel-module-related difficulties..