The Living Thing / Notebooks :


a programming language whose remarkable and rare feature is working like you imagine, if not how it should

A Swiss army knife of coding tools. Good matrix library, general scientific tools, statistics tools, web servers, art tools, but, most usefully, interoperation with everything else: it wraps C, C++ and Fortran, and includes HTTP clients, parsers, API libraries, and all the other fruits of a thriving community. Fast enough, easy to debug, garbage-collected. If some bit is too slow, you compile it; otherwise, you relax. An excellent choice if you’d rather get stuff done than write code.

I typically do my stats and graphs in R, my user interface in javascript, my parallelism in java, and my linear algebra library is fortran but python is the thread that stitches this Frankensteinisch monster together.

Of course, it could be better. clojure is more elegant, scala is easier to parallelise, julia prioritises science more highly….

But in terms of using a damn-well-supported language that goes on your computer right now, and requires you to reinvent few wheels, and which is transferrable across number crunching, web development, UIs, text processing, graphics and sundry other domains, and does not require heavy licensing costs, this one is a good default choice.


ipython, the interactive python upgrade

The python-specific part of jupyter, which can also run without jupyter. Long story.

The main thing I forget here is

Pretty display of objects

ipython display protocol

Check out the Rich display protocol which allows you to render objects as arbitrary graphics.

How to use this? The display api docs explain that you should basically implement methods such as _repr_svg_.

I made a thing called latex_fragment which leverages this to display arbitrary latex inline. This is how you do it.

import matplotlib.pyplot as plt
from IPython.core.pylabtools import print_figure

class PlotThing:
    def __init__(self, data):
        self.data = data
        self._png_data = None

    def _figure_data(self, format):
        fig, ax = plt.subplots()
        ax.plot(self.data, 'o')
        data = print_figure(fig, format)
        # We MUST close the figure, otherwise IPython's display machinery
        # will pick it up and send it as output, resulting in a double display
        plt.close(fig)
        return data

    # Here we define the special repr methods that provide the IPython display protocol
    # Note that we cache the figure data once computed.
    def _repr_png_(self):
        if self._png_data is None:
            self._png_data = self._figure_data('png')
        return self._png_data

For a non-graphical non-fancy terminal, you probably simply want nice formatting of dictionaries:

from pprint import pprint, pformat
pprint(obj)  # display it
print(pformat(obj))  # get a nicely formatted representation

Reloading edited code

Sometimes it’s fiddly to work out how to reload some complicated dependency tree of stuff. There is an autoreload extension which in principle reloads everything that has changed.

%load_ext autoreload
%autoreload 2

If you don’t trust it, do it manually. Use deepreload. You can even hack traditional reload to be deep.

import builtins
from IPython.lib import deepreload
builtins.reload = deepreload.reload

That didn’t work reliably for me. If you load autoreload and deepreload at the same time, stuff gets weird. Don’t do that.

Also, this is tragically incompatible with snakeviz.


how to start the basic interactive debugger

Let’s say there is a line in your code that fails:


In vanilla python, if you want to debug the last exception (the post-mortem debugger) you do:

import pdb; pdb.pm()

and if you want to drop into a debugger from some bit of code, you write:

import pdb; pdb.set_trace()

and if you want to use a fancier debugger (ipdb is recommended):

import ipdb; ipdb.set_trace()


or for the ipdb post-mortem debugger:

import ipdb; ipdb.pm()

This doesn’t work in jupyter/ipython, which has some other fancy interaction loop going on.

Here’s the manual way to drop into the debugger from code, as noted by Christoph Martin and David Hamann:

from IPython.core.debugger import Tracer; Tracer()()      # < 5.1
from IPython.core.debugger import set_trace; set_trace()  # >= v5.1

However, that’s not how you are supposed to do it. Persons of quality invoke their debuggers via so-called magics, e.g. the %debug magic to set a breakpoint.

%debug [--breakpoint filename:line_number_for_breakpoint]

Without the argument it activates post-mortem mode. Pish posh, who thinks in line-numbers? set_trace wastes less time for humans by default.

And if you want to drop automatically into the post mortem debugger for every error:

%pdb on


Props to Josh Devlin for explaining this and some other handy tips, and to Gaël Varoquaux.

Gaël recommended some extra debuggers:

Useful debugger commands

! statement
Execute the (one-line) statement in the context of the current stack frame, even if it mirrors the name of a debugger command. This is the most useful command, because the debugger parser is horrible and will always interpret anything it conceivably can as a debugger command instead of a python command, which is confusing and misleading. So just preface everything with ! to be safe.
h(elp) [command]
Without argument, print the list of available commands; with a command as argument, print help about that command.
w(here)
Print your location in the current stack.
d(own) [count]/u(p) [count]
Move the current frame count (default one) levels down in the stack trace (to a newer frame), or up (to an older frame).
b(reak) [([filename:]lineno | function) [, condition]]
Set a breakpoint at the given line or function, optionally guarded by a condition. The one that is tedious to do manually. Without argument, list all breaks and their metadata.
tbreak [([filename:]lineno | function) [, condition]]
Temporary breakpoint, which is removed automatically when it is first hit.
cl(ear) [filename:lineno | bpnumber [bpnumber ...]]
Clear specific or all breakpoints
disable [bpnumber [bpnumber ...]]/enable [bpnumber [bpnumber ...]]
disable is like clear, except the breakpoint is kept and can be re-enabled.
ignore bpnumber [count]
Ignore a breakpoint a specified number of times.
condition bpnumber [condition]
Set a new condition for the breakpoint
commands [bpnumber]
Specify a list of commands for breakpoint number bpnumber. The commands themselves appear on the following lines. Type end to terminate the command list.
s(tep)
Execute the next line, even if that is inside an invoked function.
n(ext)
Execute the next line in this function.
unt(il) [lineno]
Continue to line lineno, or the next line with a higher number than the current one.
r(eturn)
Continue execution until the current function returns.
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.
j(ump) lineno
Set the next line that will be executed. Only available in the bottom-most frame. It is not possible to jump into weird places like the middle of a for loop.
l(ist) [first[, last]]
List source code for the current file.
ll | longlist
List all source code for the current function or frame.
a(rgs)
Print the argument list of the current function.
p expression
Evaluate the expression in the current context and print its value.
pp expression
Like the p command, except the value of the expression is pretty-printed using the pprint module.
whatis expression
Print the type of the expression.
source expression
Try to get source code for the given object and display it.
display [expression]/undisplay [expression]
Display the value of the expression if it changed, each time execution stops in the current frame.
interact
Start an interactive interpreter (using the code module) whose global namespace contains all the (global and local) names found in the current scope.
alias [name [command]]/unalias name

Create an alias called name that executes command.

As an example, here are two useful aliases from the manual, for the .pdbrc file:

# Print instance variables (usage ``pi classInst``)
alias pi for k in %1.__dict__.keys(): print("%1.",k,"=",%1.__dict__[k])
# Print instance variables in self
alias ps pi self
Pack up and go home

Memory leaks

Python 3 has tracemalloc built in. This is a powerful, if bare-bones, python memory analyser. Mike Lin walks you through it. Benoit Bernard explains various options that run on older pythons, including, most usefully IMO, objgraph, which draws you an actual diagram of where the leaking things are. More fully featured, Pympler provides GUI-backed memory profiling, including the magically handy trick of tracking referrers using its refbrowser.
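A minimal tracemalloc session looks something like this (the leaky list here is a made-up stand-in for whatever is actually hoarding your memory):

```python
import tracemalloc

tracemalloc.start()

# Allocate something blameworthy so it shows up in the snapshot
leaky = [bytes(1000) for _ in range(1000)]

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")
for stat in top[:3]:
    print(stat)  # each line shows file:line plus the total size allocated there

tracemalloc.stop()
```

The statistics are grouped by allocating source line, so the list comprehension above should dominate the top of the list.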

Code injection

pyrasite injects code into running python processes, which enables more exotic debuggery: realtime object mutation and suchlike, and of course memory and performance profiling.


Profiling

Maybe it’s not crashing, but taking too long? You want a profiler.

Easy mode: built-in profiler

Profile functions using cProfile:

import cProfile as profile
profile.runctx('print(predded.shape)', globals(), locals())
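To poke at the resulting stats programmatically, the standard library’s pstats module can at least sort and print them; a minimal sketch (the work function is invented for illustration):

```python
import cProfile
import io
import pstats

def work():
    # A made-up workload to give the profiler something to measure
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# Sort by cumulative time and print the top five entries
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf).sort_stats("cumulative")
stats.print_stats(5)
print(buf.getvalue())
```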

There are also memory allocation tools, although I’ve not used them and suspect they are no longer current.

Now visualise them using… uh… let me come back to that.

fancy/hip: py-spy


py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spending time on without restarting the program or modifying the code in any way. Py-Spy is extremely low overhead: it is written in Rust for speed and doesn’t run in the same process as the profiled Python program, nor does it interrupt the running program in any way. This means Py-Spy is safe to use against production Python code.[…]

This project aims to let you profile and debug any running Python program, even if the program is serving production traffic.[…]

Py-spy works by directly reading the memory of the python program using the process_vm_readv system call on Linux, the vm_read call on OSX or the ReadProcessMemory call on Windows.

Figuring out the call stack of the Python program is done by looking at the global PyInterpreterState variable to get all the Python threads running in the interpreter, and then iterating over each PyFrameObject in each thread to get the call stack.

Native ipython can run the profiler magically:

%%prun -D prof.stats
import glob
files = glob.glob('*.txt')
for file in files:
    with open(file) as f:
        f.read()

snakeviz includes a handy magic to automatically save stats and launch the profiler. (Gotcha: you have to have the snakeviz cli already on the path when you launch ipython.)

%load_ext snakeviz

%%snakeviz
import glob
files = glob.glob('*.txt')
for file in files:
    with open(file) as f:
        f.read()

Tragically, this is incompatible with autoreload and gives weird errors if you run them both in the same session.

Visualising profiles

String formatting things that are unecessarily hard to discern from the manual

What a nightmare, that manual for string formatting. While all the information you need is in there, it is arranged in a perverse inversion of some mixture of the frequency and the priority with which you use it. See Marcus Kazmierczak’s cookbook instead.


## float precision
>>> print("{:.2f}".format(3.1415926))
3.14
## left padding
>>> print("{:0>2d}".format(5))
05
## power tip which the manual does not make clear:
## variable formatting
>>> pi = 3.1415926
>>> precision = 4
>>> print("{:.{}f}".format(pi, precision))
3.1416

Which Foreign Function Interface am I supposed to be using now?

Want to call a function in C, C++, FORTRAN etc. from python?

If you are just talking to C, ctypes is a python library to translate python objects to c with minimal fuss, and no compiler requirement. See the ctypes tutorial.
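As a minimal sketch of ctypes in action, here is a call to libc’s strlen (loading via CDLL(None), which exposes already-linked symbols, is POSIX-specific):

```python
import ctypes

# On POSIX, CDLL(None) loads the symbols already linked into the
# interpreter, which includes libc.
libc = ctypes.CDLL(None)

# Declare strlen's signature so ctypes converts arguments correctly
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello, world"))  # → 12
```

Declaring argtypes/restype is optional for simple cases but saves you from silent int-truncation surprises on 64-bit platforms.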

And of course, if you have your compiler lying about, Python was made to talk to other languages and has a normal C API.

If you want something closer to python for your development process, Cython allows python compilation using a special syntax, and easy calling of foreign functions in one easy package. SWIG wraps function interfaces between various languages, but looks like a PITA; (See a comparison on stackoverflow).

There is also Boost.python if you want to talk to C++. Boost comes with lots of other fancy bits, like numerical libraries.

There are many other options, but in practice I’ve never needed to go further than cython, so I can’t even talk about all the options listed here knowledgeably.

Packaging and environments

Not so hard, but confusing and chaotic due to many long-running disputes only lately resolving.



Anaconda

The distribution you use if you want to teach a course in numerical python without dicking around with a 5 hour install process.

Has a slightly different packaging workflow. See, e.g. Tim Hopper’s workflow which explains this environment.yml malarkey, or the creators’ rationale.

The upshot is if you want to install something with tricky dependencies like ViTables, you do this:

conda install pytables=3.2
conda install pyqt=4

Aside: Do you use fish shell? You need to do some extra setup. Specifically, add the line

source (conda info --root)/etc/fish/conf.d/

into ~/.config/fish/

NB Conda will fill up your hard disk if not regularly disciplined. Discipline it via conda clean to stop your disk filling up with obsolete versions of obscure dependencies for that package you tried out that one time.

conda clean -pt

Python environment/version management

venv is now a builtin virtual python environment system in python 3. It doesn’t support python 2, but fixes various problems, e.g. it supports framework python on OSX, which is very important for GUIs, and is covered by the python docs in the python virtual environment introduction.

# Create venv
python3 -m venv ~/.virtualenvs/learning_gamelan_keras_2
# Use venv from fish
source ~/.virtualenvs/learning_gamelan_keras_2/bin/activate.fish
# Use venv from bash
source ~/.virtualenvs/learning_gamelan_keras_2/bin/activate

Python environment management management

One suggestion I’ve had is to use pyenv, which eases and automates switching between all the other weird python environments created by virtualenv, venv, os python, anaconda python etc.



Testing

Too many bike sheds.

There are a lot of frameworks. The most common seem to be unittest, py.test and nose.
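For a zero-dependency default, the builtin unittest looks like this (add is a toy function; running the suite programmatically here stands in for the usual unittest.main() in a script):

```python
import unittest

def add(a, b):
    return a + b

class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)

# In a test script you would call unittest.main(); here we run the
# suite programmatically instead.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
result = unittest.TextTestRunner().run(suite)
print(result.wasSuccessful())  # → True
```

py.test runs the same tests with less boilerplate (plain functions and bare assert statements), which is why many people prefer it.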

Asynchrony in python

See asynchronous python

Watching files for changes

Does this inotify solution work for non-linux? Because OSX uses FSEvents and windows uses I-don’t-even-know.

watchdog asserts that it is cross-platform. (source)

Python 2 vs 3

TODO: six versus future.

TLDR: I am no employee of giant enterprise-type business with a gigantic legacy code base, and so I don’t use python 2. My code is not python 2 compatible. Python 3 is more productive, and no-one is paying me to be less productive right now. Python 2 code is usually easy to port to python 3. It _is_ possible to write code which is compatible with python 2 and 3, but then I would miss out on lots of the work that has gone into making python 3 easier and better, and waste time porting elegant easy python 3 things to hard boring python 2 things.


Python includes type hint syntax (since 3.5, with variable annotations from 3.6), and projects such as mypy support static analysis using type hints. There are not yet many tutorials on the details of this. Here’s one.

Short version: you go from this:

def fib(n):
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b

to this:

from typing import Iterator

def fib(n: int) -> Iterator[int]:
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b

However, if you are going to this trouble, why not just use julia, which takes type-hinting a lot further and will actually use it to JIT-compile optimized code?

Which command line parser is the good one?

Another bike-shedding danger zone is command-line parsing, leading to the need to spend too much time parsing command line parsers rather than parsing command lines.
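If in doubt, the standard library’s argparse is the zero-dependency default; a minimal sketch, with argument names invented for illustration:

```python
import argparse

# A toy CLI: every name here (infile, --count) is made up for the example.
parser = argparse.ArgumentParser(description="example tool")
parser.add_argument("infile", help="input file path")
parser.add_argument("-n", "--count", type=int, default=1,
                    help="how many times to process")

# Parse a canned argv rather than sys.argv so this runs anywhere
args = parser.parse_args(["data.txt", "-n", "3"])
print(args.infile, args.count)  # → data.txt 3
```

You also get --help generation for free, which is most of what the fancier frameworks are selling.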

Miscellaneous stuff I always need to look up

Misc recommendations