
Python

a programming language whose remarkable and rare feature is working like you imagine, if not how it should

A Swiss army knife of coding tools. Good matrix library, general scientific tools, statistics tools, web server, art tools, but, most usefully, interoperation with everything else: it wraps C, C++ and Fortran, and includes HTTP clients, parsers, API libraries, and all the other fruits of a thriving community. Fast enough, easy to debug, garbage-collected. If some bit is too slow, you compile it; otherwise, you relax. An excellent choice if you’d rather get stuff done than write code.

I typically do my stats and graphs in R, my user interface in JavaScript, my parallelism in Java, and my linear algebra in Fortran, but Python is the thread that stitches this Frankensteinisch monster together.

Of course, it could be better. Clojure is more elegant, Scala is easier to parallelise, Julia prioritises scientific work more highly… But in terms of using a damn-well-supported language that goes on your computer right now, and requires you to reinvent few wheels, and which is transferable across number crunching, web development, UIs, text processing, graphics and sundry other domains, and does not require heavy licensing costs… this one is a good default choice.

Python version management for weird sciency distributions

One suggestion I’ve had is to use pyenv.

Apparently I should also use virtualenv, which can create separate per-project environments on top of a global Python version.

In addition, anaconda reckons their conda command is the best.

Humph.

I’m using virtualenv for now; it is the most common one and works fine.
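
With several Pythons and environments about, it is easy to lose track of which one is actually running. Here is a minimal sketch for checking from inside the interpreter, assuming an environment created by the standard-library venv module or a recent virtualenv (older virtualenv releases signalled this differently). The function name in_virtual_environment is my own, not part of any tool:

```python
import sys


def in_virtual_environment() -> bool:
    """Return True if this interpreter appears to run inside a virtual environment."""
    # In a venv, sys.prefix points at the environment, while sys.base_prefix
    # points at the Python installation the environment was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("in a virtual environment:", in_virtual_environment())
```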

ipython, the interactive python upgrade

The python-specific part of jupyter, which can also run without jupyter. Long story.

The main problem I forget here is

how to start the debugger

Let’s say there is a line in your code that fails:

1/0

In vanilla python if you want to debug the last exception (the post-mortem debugger) you do:

import pdb; pdb.pm()

and if you want to drop into a debugger from some bit of code, you write:

import pdb; pdb.set_trace()

and if you want to use a fancier debugger (ipdb is recommended):

import ipdb; ipdb.set_trace()

or:

import ipdb; ipdb.pm()

This doesn’t work in jupyter, which has some other fancy interaction loop going on.

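Here’s one manual way to drop into the debugger from code, noticed by Christoph Martin. (Tracer has been deprecated since IPython 5.1; newer versions provide a set_trace function in the same module, called as set_trace().)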

from IPython.core.debugger import Tracer; Tracer()()
1/0

However, that’s not how you are supposed to do it. Persons of quality invoke their debuggers via so-called magics, e.g. the %debug magic to set a breakpoint.

%debug [--breakpoint filename:line_number_for_breakpoint]

Without the argument it activates post-mortem mode. Seriously though, who thinks in line-numbers? Tracer realistically wastes less time.

And if you want to drop automatically into the post mortem debugger for every error:

%pdb on

1/0

Props to Josh Devlin for explaining this and some other handy tips, and Gaël Varoquaux.
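
Outside IPython there is no %pdb magic, but something similar can be approximated with the standard library. This is a sketch, not the IPython mechanism: it assumes a plain script run from a terminal, and the names post_mortem_excepthook and install_post_mortem_hook are hypothetical.

```python
import pdb
import sys
import traceback


def post_mortem_excepthook(exc_type, exc_value, exc_tb):
    """Print the traceback, then open pdb at the frame that raised."""
    traceback.print_exception(exc_type, exc_value, exc_tb)
    # Only start an interactive debugger when someone can type at it.
    if sys.stdin is not None and sys.stdin.isatty():
        pdb.post_mortem(exc_tb)


def install_post_mortem_hook():
    """Route every uncaught exception through post_mortem_excepthook."""
    sys.excepthook = post_mortem_excepthook
```

Call install_post_mortem_hook() near the top of a script; after that, an uncaught 1/0 should land you in the post-mortem debugger instead of only printing a traceback.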

Gaël recommended some extra debuggers:

Useful debug commands

h(elp) [command]
Guess
w(here)
Print your location in current stack
d(own) [count]/u(p) [count]
Move the current frame count (default one) levels down/up in the stack trace (to a newer/older frame).
b(reak) [([filename:]lineno | function) [, condition]]
The one that is tedious to do manually. Without argument, list all breaks and their metadata.
tbreak [([filename:]lineno | function) [, condition]]
Temporary breakpoint, which is removed automatically when it is first hit.
cl(ear) [filename:lineno | bpnumber [bpnumber ...]]
Clear specific or all breakpoints
disable [bpnumber [bpnumber ...]]/enable [bpnumber [bpnumber ...]]
disable is the same as clear, but you can re-enable
ignore bpnumber [count]
ignore a breakpoint a specified number of times
condition bpnumber [condition]
Set a new condition for the breakpoint
commands [bpnumber]
Specify a list of commands for breakpoint number bpnumber. The commands themselves appear on the following lines. Type end to terminate the command list.
s(tep)
Execute the next line, even if that is inside an invoked function.
n(ext)
Execute the next line in this function.
unt(il) [lineno]
Continue to line lineno, or to the next line with a higher number than the current one.
r(eturn)
Continue execution until the current function returns.
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.
j(ump) lineno
Set the next line that will be executed. Only available in the bottom-most frame. It is not possible to jump into weird places like the middle of a for loop.
l(ist) [first[, last]]
List source code for the current file.
ll | longlist
List all source code for the current function or frame.
a(rgs)
Print the argument list of the current function.
p expression
Evaluate the expression in the current context and print its value.
pp expression
Like the p command, except the value of the expression is pretty-printed using the pprint module.
whatis expression
Print the type of the expression.
source expression
Try to get source code for the given object and display it.
display [expression]/undisplay [expression]
Display the value of the expression if it changed, each time execution stops in the current frame.
interact
Start an interactive interpreter (using the code module) whose global namespace contains all the (global and local) names found in the current scope.
alias [name [command]]/unalias name
Create an alias called name that executes command.

As an example, here are two useful aliases from the manual, for the .pdbrc file:

# Print instance variables (usage ``pi classInst``)
alias pi for k in %1.__dict__.keys(): print("%1.",k,"=",%1.__dict__[k])
# Print instance variables in self
alias ps pi self
! statement
Execute the (one-line) statement in the context of the current stack frame, even if it mirrors the name of a debugger command
q(uit)
Pack up and go home
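
To see a few of these commands in action without typing, the debugger can be fed a script. The following is a rough sketch using only the standard library; double_plus_one is a made-up function to step through, and the commands p, n and c are the ones listed above.

```python
import io
import pdb


def double_plus_one(x):
    y = x * 2
    return y + 1


# Commands the debugger will read instead of keyboard input:
# print x, step to the next line, print y, then continue to the end.
script = io.StringIO("p x\nn\np y\nc\n")
transcript = io.StringIO()

# readrc=False keeps a personal .pdbrc from changing the session.
debugger = pdb.Pdb(stdin=script, stdout=transcript, readrc=False)
result = debugger.runcall(double_plus_one, 3)

print(transcript.getvalue())
print("result:", result)
```

The transcript shows the (Pdb) prompts interleaved with the printed values, which is roughly what an interactive session would look like.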

Pretty display of objects

Check out the ipython display protocol which allows you to render objects as arbitrary graphics:

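import matplotlib.pyplot as plt
from IPython.core.pylabtools import print_figure


class Scatter:
    # Hypothetical wrapper class: the original fragment shows only
    # _figure_data and _repr_png_, and assumes the rest exists.

    def __init__(self, data):
        self.data = data
        self._png_data = None  # cache for the rendered PNG bytes

    def _repr_latex_(self):
        return r"$n = %d$" % len(self.data)

    def _figure_data(self, format):
        fig, ax = plt.subplots()
        ax.plot(self.data, 'o')
        ax.set_title(self._repr_latex_())
        data = print_figure(fig, format)
        # We MUST close the figure, otherwise IPython's display machinery
        # will pick it up and send it as output, resulting in a double display
        plt.close(fig)
        return data

    # Here we define the special repr method that provides the IPython display protocol
    # Note that we cache the figure data once computed.

    def _repr_png_(self):
        if self._png_data is None:
            self._png_data = self._figure_data('png')
        return self._png_data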
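
The same protocol works without any plotting library: an object only needs a method such as _repr_html_ returning a string. A minimal sketch, where the class KeyValueTable is a hypothetical example and not anything shipped with IPython:

```python
import html


class KeyValueTable:
    """Hypothetical wrapper that renders a dict as an HTML table in Jupyter."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def _repr_html_(self):
        # Jupyter calls this method, if present, and displays the returned HTML.
        rows = "".join(
            "<tr><td>{}</td><td>{}</td></tr>".format(
                html.escape(str(key)), html.escape(str(value))
            )
            for key, value in self.mapping.items()
        )
        return "<table>{}</table>".format(rows)

    def __repr__(self):
        # Plain terminals fall back to the ordinary repr.
        return "KeyValueTable({!r})".format(self.mapping)


table = KeyValueTable({"mean": 0.0, "std": 1.0})
print(table._repr_html_())
```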

For a non-graphical non-fancy terminal, you probably simply want nice formatting of dictionaries:

from pprint import pprint, pformat
pprint(obj)  # display it
print(pformat(obj))  # get a nicely formatted representation

Profiling

Profile functions using cProfile.
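
A minimal sketch of what that looks like, using only cProfile and pstats from the standard library. The function slow_sum is an invented workload, there purely to give the profiler something to measure:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive workload to have something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```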

Now visualise them using… uh…

Visualising profiles

Foreign functions in python

Want to call a function in C, C++, Fortran etc. from Python? Possibly to go faster?

If you are just talking to C, ctypes is a Python library that translates Python objects to C types with minimal fuss and no compiler requirement. See the ctypes tutorial.
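
As a rough illustration, here is ctypes calling strlen from the C standard library. This sketch assumes a POSIX system (Linux or macOS), where ctypes.CDLL(None) exposes the symbols already loaded into the Python process; on Windows the library would have to be loaded differently.

```python
import ctypes

# Assumption: a POSIX system. CDLL(None) gives access to the symbols already
# loaded into the Python process, which include the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# bytes objects are passed as char*.
length = libc.strlen(b"hello")
print(length)
```

Declaring argtypes and restype is optional but worth the two lines: without them ctypes guesses, and it guesses int for everything.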

And of course, if you have your compiler lying about, Python was made to talk to other languages and has (has always had) a normal C API.

If you want something closer to Python for your development process, Cython allows some Python compilation and easy calling of foreign functions. SWIG wraps function interfaces between various languages, but looks like a PITA. (See a comparison on Stack Overflow.)

There is also Boost.python if you want to talk to C++.

Miscellaneous stuff I always need to look up

Packaging

Not so hard, but confusing and chaotic due to many long-running disputes only lately resolving.

General

Anaconda

The distribution you use if you want to teach a course in numerical Python without dicking around with a 5 hour install process.

Has a slightly different packaging workflow. See, e.g. Tim Hopper’s workflow, which explains this environment.yml malarkey, or the creators’ rationale.

The upshot is if you want to install something with tricky dependencies like ViTables, you do this:

conda install pytables=3.2
conda install pyqt=4

Testing

Too many bike sheds.

There are a lot of frameworks. The most common seem to be unittest, py.test and nose.
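
For reference, the standard-library option looks roughly like this. It is a sketch of unittest only; the function under test, fib_below, and the helper run_tests are invented for the example.

```python
import unittest


def fib_below(n):
    """Hypothetical function under test: Fibonacci numbers smaller than n."""
    a, b = 0, 1
    values = []
    while a < n:
        values.append(a)
        a, b = b, a + b
    return values


class FibBelowTest(unittest.TestCase):
    def test_small_values(self):
        self.assertEqual(fib_below(10), [0, 1, 1, 2, 3, 5, 8])

    def test_empty_for_zero(self):
        self.assertEqual(fib_below(0), [])


def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FibBelowTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)


outcome = run_tests()
```

In a real project the tests would live in their own file and be run with python -m unittest rather than through a helper like run_tests.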

Python 2 vs 3

TODO: six versus future.

Typing

Python 3.5 introduced type hints (PEP 484), Python 3.6 extended them to variable annotations, and projects such as mypy support static analysis using type hints. There are not yet many tutorials on the details of this, but for once tutsplus has one of the better ones.

Short version: you go from this:

def fib(n):
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b

to this:

from typing import Iterator

def fib(n: int) -> Iterator[int]:
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b

which looks like a great idea from where I’m sitting.
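
One thing worth knowing is that the interpreter itself does nothing with these hints; they are metadata for tools. A small sketch to make that concrete, using typing.get_type_hints from the standard library:

```python
from typing import Iterator, get_type_hints


def fib(n: int) -> Iterator[int]:
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a + b


# The annotations are stored on the function, where tools can read them.
hints = get_type_hints(fib)
print(hints)

# Nothing checks them at runtime: a float is accepted without complaint.
# A static checker such as mypy is what would flag this call.
values = list(fib(10.5))
print(values)
```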

Misc recommendations