
Getting a laptop in fit shape for deep learning with Ubuntu

How is deep learning awful this time?

I want to do machine learning without the cloud, which, as we learned previously, is awful.

But also I'm a vagabond with nowhere safe and high-bandwidth to store a giant GPU machine (campus IT don't return my calls about it; I think they think I'm taking the piss.)

So, let's buy a Razer Blade 2016, a portable, surprisingly cheap laptop with all the latest features and performance comparable to the kind of single-GPU desktop machine I could afford.

I don't want to do anything fancy here, just process a few gigabytes of MP3 data. My data is stored in the AARNET owncloud server. It's all quite simple, but the algorithm is just too slow without a GPU and I don't have a GPU machine I can leave running.

Miscellaneous Razer Blade steps

See comfy razer.

The CUDA Bit

Installing CUDA etc on the laptop is straightforward. Making it run is not.

Get the dependencies by downloading the deb, then installing it.

sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo apt-key adv --fetch-keys
sudo apt-get update
sudo apt-get install cuda-9-1
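After the install, it may be worth checking that the toolkit is actually reachable. This is a minimal sketch, assuming the cuda-9-1 package put things under /usr/local/cuda-9.1 (the usual default, but an assumption here); it only changes the search paths for the current shell session.

```shell
# Assumed default install prefix for the cuda-9-1 package; adjust if yours differs.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-9.1}"

# Put the toolkit on the search paths for this shell session only.
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report the compiler version if the toolkit is really there.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found under $CUDA_HOME/bin" >&2
fi
```

If this prints a version, the toolkit is installed; whether the GPU is awake to run anything is a separate problem.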

NVIDIA's howto points out that you will need to care about Bumblebee to survive Ubuntu.

Confusing, because most sources seem to think you want to have NVIDIA graphics; but what if you merely want NVIDIA computing via CUDA, and don't care about graphics?

In stark contrast to NVIDIA, AskUbuntu claims:

There is sometimes confusion about CUDA. You don't need Bumblebee to run CUDA. Follow the How-to to get CUDA working under Ubuntu.

Lies! Without Bumblebee the NVIDIA GPU either gets switched off, or you get various other applications claiming GPU memory and fighting with Tensorflow, causing strange crashes in the middle of complicated modelling jobs. Tensorflow wants the whole GPU. You need Bumblebee in order to selectively allow GPU access to bad GPU citizens like Tensorflow.

The second AskUbuntu point is interesting, though:

There is however a new feature (--no-xorg option for optirun) in Bumblebee 3.2, which makes it possible to run CUDA / OpenCL applications that does not [sic] need the graphics rendering capabilities.

I should try that.
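For reference, trying it might look something like the following. This is an untested sketch, assuming Bumblebee 3.2 or later so that optirun accepts --no-xorg; the wrapper name gpu_run is made up for illustration.

```shell
# Run a compute-only command on the discrete GPU without starting a
# secondary X server. Assumes Bumblebee >= 3.2 (optirun --no-xorg).
gpu_run() {
    if command -v optirun >/dev/null 2>&1; then
        optirun --no-xorg "$@"
    else
        echo "optirun not found; is Bumblebee installed?" >&2
        return 127
    fi
}

# Example (not run here): gpu_run nvidia-smi
```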

Bumblebee is its own small world. There is a walkthrough for a Razer Blade. It has a debugging page, which you will need. See also Daniel Teichmann's Bumblebee instructions.

sudo apt install nvidia-prime
sudo prime-select intel

Install Bumblebee via the PPA (NB: as of 2018-03, the stable version is too old).

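# Add the Bumblebee testing PPA and refresh the package lists
sudo add-apt-repository ppa:bumblebee/testing
sudo apt-get update
# Load the NVIDIA kernel modules (-a loads every module named, not just the first)
sudo modprobe -a nvidia nvidia-uvm
# Run jobs on the discrete GPU via primus
primusrun python -i jobs/
primusrun nvidia-smi
# An empty CUDA_VISIBLE_DEVICES hides every GPU from CUDA
CUDA_VISIBLE_DEVICES= primusrun python jobs/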
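The last command above sets CUDA_VISIBLE_DEVICES to the empty string, which hides every GPU from CUDA, so the job presumably falls back to the CPU; handy for comparing timings. A small sketch of the same idea as a reusable wrapper (the name cpu_only is made up for illustration):

```shell
# Run any command with every CUDA device hidden, so CUDA programs see no GPU.
cpu_only() {
    env CUDA_VISIBLE_DEVICES= "$@"
}

# Show that the variable is set but empty inside the child process.
cpu_only sh -c 'echo "CUDA_VISIBLE_DEVICES=[$CUDA_VISIBLE_DEVICES]"'
```

Note that "set but empty" is different from "unset": with the variable unset, CUDA sees all devices.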

Tensorflow ACPI interaction.

Turn Off Discrete nVidia Optimus Graphics Card in Ubuntu.

Webupd8 Bumblebee howto, or one with fewer manual steps from PC Suggest.

Monitoring NVIDIA power without using the NVIDIA

Boot without graphics in Ubuntu.
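On the power monitoring point: one way to see whether the discrete card is powered, without waking it by calling nvidia-smi, is to read the state file that bbswitch exposes. This is a sketch, assuming the bbswitch kernel module (which Bumblebee normally uses for power management) is loaded; the BBSWITCH_FILE override is a made-up convenience for testing.

```shell
# Print the power state (ON/OFF) of the discrete GPU as reported by bbswitch.
# Assumes the bbswitch module is loaded; BBSWITCH_FILE is a hypothetical override.
gpu_power_state() {
    f="${BBSWITCH_FILE:-/proc/acpi/bbswitch}"
    if [ -r "$f" ]; then
        # Lines look like "0000:01:00.0 ON"; keep the last field.
        awk '{ print $NF }' "$f"
    else
        echo "unknown"
    fi
}

gpu_power_state
```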

Things change in Ubuntu 18.04 Bionic Beaver.

The master bug thread tracking the configuration changes is here. There is a scruffier thread on the Bumblebee repo. AFAICT this is partially automated in the bumblebee-nvidia package.

Now you need to build tensorflow to benefit from this

So you need to install Bazel and a whole bunch of Java. This will also be useful if you wish to do discount imitations of other Google infrastructure.

bazel install:

sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python openjdk-8-jdk
echo "deb [arch=amd64] stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel

Pinning kernel version because the NVIDIA drivers don't always seem to exist

This occurred a couple of times for me; I think the compatible versions of something didn't match my manually selected versions of some other thing. I don't care; why would anyone care about this? Sidestepping the issue by pinning the GRUB default to boot the good kernel seemed to work, although I haven't needed to recently, for reasons I will make no attempt to discover.
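For the record, pinning the GRUB default looks roughly like this. It is a sketch, assuming the stock Ubuntu menu titles; the kernel version shown is hypothetical, and you can list your real entries with grep -E "menuentry |submenu " /boot/grub/grub.cfg.

```shell
# Build the GRUB_DEFAULT line that selects one specific kernel from
# Ubuntu's "Advanced options" submenu (stock menu titles assumed).
grub_default_for() {
    printf 'GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux %s"\n' "$1"
}

# Hypothetical kernel version, for illustration only.
grub_default_for "4.13.0-36-generic"
```

To apply it, put the printed line in /etc/default/grub in place of the existing GRUB_DEFAULT, then run sudo update-grub.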

Avoiding the fiddly bits with anaconda

Maybe this will work:

conda create -n tensorflow pip python=3.6
source activate tensorflow
conda install -c anaconda tensorflow-gpu
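To find out whether it did work, you can ask TensorFlow which devices it sees. This is a sketch, assuming the tensorflow environment created above is active and that python resolves to its interpreter; the function name tf_gpu_devices is made up for illustration.

```shell
# List the GPU devices TensorFlow can see, or say that there are none.
# Assumes the conda environment with tensorflow-gpu is currently active.
tf_gpu_devices() {
    python - <<'EOF'
from tensorflow.python.client import device_lib

gpus = [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]
print(gpus if gpus else "TensorFlow sees no GPU")
EOF
}

# Usage, inside the activated environment (not run here): tf_gpu_devices
```

If the card is managed by Bumblebee, the check probably needs to be run under primusrun or optirun like everything else.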

Other comforts

See comfy ubuntu.