A Swiss army knife of coding tools. Good matrix library, general scientific tools, statistics tools, web server, art tools, but, most usefully, interoperation with everything else: it wraps C, C++ and Fortran, and includes HTTP clients, parsers, API libraries, and all the other fruits of a thriving community. Fast enough, easy to debug, garbage-collected. If some bit is too slow, you compile it; otherwise, you relax. A good default if you’d rather get stuff done than write code.
Of course, it could be better. clojure is more elegant, scala is easier to parallelise, julia prioritises science more highly…
But in terms of using a well-supported language that goes on your computer right now, and requires you to reinvent few wheels, and which is transferable across number crunching, web development, UIs, text processing, graphics and sundry other domains, and does not require heavy licensing costs, this one is a good starting point.
What a pitch! Now, let’s look closer and see all the horrid things that are wrong with it.
Debugging, profiling and testing
See Python debugging, profiling and testing.
ipython, the interactive python upgrade
The python-specific part of jupyter, which can also run without jupyter. Long story.
The main thing I forget here is
Pretty display of objects
ipython display protocol
Check out the Rich display protocol which allows you to render objects as arbitrary graphics.
How to use this? The display api docs explain that you should basically implement methods such as
I made a thing called latex_fragment which leverages this to display arbitrary latex inline. This is how you use it:
    def _figure_data(self, format):
        fig, ax = plt.subplots()
        ax.plot(self.data, 'o')
        ax.set_title(self._repr_latex_())
        data = print_figure(fig, format)
        # We MUST close the figure, otherwise IPython’s display machinery
        # will pick it up and send it as output, resulting in a double display
        plt.close(fig)
        return data

    # Here we define the special repr methods that provide the IPython display protocol
    # Note that for the two figures, we cache the figure data once computed.
    def _repr_png_(self):
        if self._png_data is None:
            self._png_data = self._figure_data('png')
        return self._png_data
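For something smaller than a figure, the same protocol works with an HTML representation. A minimal sketch, assuming a made-up `Temperature` class (the class and its field are hypothetical; only the `_repr_html_` hook name comes from IPython):

```python
class Temperature:
    """Toy value object; the class and its field are hypothetical."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # Plain-text fallback, used by ordinary terminals.
        return f"Temperature({self.celsius!r})"

    def _repr_html_(self):
        # Front ends that can render HTML (e.g. a jupyter notebook)
        # look for this method and display its return value.
        return f"<b>{self.celsius:.1f}</b> &deg;C"


t = Temperature(21.5)
print(repr(t))
print(t._repr_html_())
```

In a notebook, leaving `t` as the last expression in a cell should show the bold HTML version; in a plain terminal you get the `__repr__` text.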
For a non-graphical non-fancy terminal, you probably simply want nice formatting of dictionaries etc:
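One stdlib option is `pprint`. A minimal sketch, with a made-up `config` dictionary standing in for whatever you actually need to inspect:

```python
from pprint import pformat, pprint

# Hypothetical nested data, for illustration only.
config = {
    "name": "experiment-1",
    "params": {"learning_rate": 0.01, "layers": [64, 64, 10]},
    "tags": ["baseline", "cpu"],
}

# Print directly, wrapping lines at about 40 columns.
pprint(config, width=40)

# Or get the formatted text as a string, e.g. for logging.
text = pformat(config, width=40)
```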
Wait, you want to write your own pretty-printer, with correct indentation? Use tiles.
Pro tip: dotenv
dotenv allows easy configuration through OS environment variables or text files in the parent directory. You should probably use this. PRO-TIP: there are lots of packages with similar names. Make sure you install using this one
String formatting things that are unnecessarily hard to discern from the manual
What a nightmare is that manual for the string formatting. While all the information you need is in there, it is arranged in perverse inversion of some mixture of the frequency and the priority with which you use it. See Marcus Kazmierczak’s cookbook instead.
f-strings make things somewhat easier for python 3.6+ because they don’t need to mess around with naming things for the
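A rough comparison of the two styles, plus the format specifiers I look up most often. The variable names are arbitrary:

```python
name = "pi"
value = 3.14159265
count = 1234567

# str.format: every field has to be passed in explicitly.
old_style = "{name} is roughly {value:.3f}".format(name=name, value=value)

# f-string: fields are looked up in the surrounding scope.
new_style = f"{name} is roughly {value:.3f}"

padded = f"[{name:>6}]"    # right-align in 6 columns
grouped = f"{count:,}"     # thousands separator
percent = f"{0.256:.1%}"   # percentage with one decimal place

print(new_style, padded, grouped, percent)
```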
Why is a now timestamp in UTC not the first line in every academic research workbook/paper/data analysis? Because it’s tedious to look up the different bits.
Here you go:
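A minimal stdlib sketch, assuming an ISO 8601 string is the format you want:

```python
from datetime import datetime, timezone

# Timezone-aware "now" in UTC, rather than a naive local time.
now_utc = datetime.now(timezone.utc)

# ISO 8601 string, truncated to whole seconds.
stamp = now_utc.isoformat(timespec="seconds")
print(stamp)
```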
Rendering HTML output
You have a quick and dirty chunk o’ HTML you need to output? You aren’t writing some damnable webapp with a nested hierarchy of template rendering and CSS integration into some design framework? You just want to pump out some markup?
I recommend yattag which is fast, simple, good and has a 1-page manual. It works real good.
Rendering markdown as HTML
Which Foreign Function Interface am I supposed to be using now?
Want to call a function in C, C++, Fortran etc from python?
If you are just talking to C, ctypes is a python library to translate python objects to c with minimal fuss, and no compiler requirement. See the ctypes tutorial.
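As a rough illustration, here is `strlen` from the C standard library called through ctypes. This sketch assumes a POSIX system (Linux or macOS); on Windows the library lookup would differ.

```python
import ctypes
import ctypes.util

# Assumption: POSIX. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded into the current process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# bytes objects are passed as char*.
length = libc.strlen(b"hello, world")
print(length)
```

Declaring `argtypes` and `restype` is optional for simple cases, but without them ctypes assumes `int` everywhere, which is a reliable source of confusing bugs.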
And of course, if you have your compiler lying about, Python was made to talk to other languages and has a normal C API.
If you want something closer to python for your development process, Cython allows python compilation using a special syntax, and easy calling of foreign functions in one easy package. SWIG wraps function interfaces between various languages, but looks like a PITA (see a comparison on stackoverflow).
There is also Boost.python if you want to talk to C++. Boost comes with lots of other fancy bits, like numerical libraries.
There are many other options, but in practice I’ve never needed to go further than cython, so I can’t even talk about all the options listed here knowledgeably.
There are too many options for interfacing with external libraries and/or compiling python code.
ctypes, Cython, Boost-Python, numba, SWIG…
Lowish-friction, well tested, well documented; works everywhere that CPython extensions can be compiled. Compiles most python code (apart from generators and inner functions). Optimises python code using type defs and extended syntax. Here, read Max Burstein’s intro.
Highlights: It works seamlessly with numpy. It makes calling C code easy.
Problems: No generic dispatch. Debugging is nasty, like debugging C with extra crap in your way.
More specialised than cython, uses LLVM instead of the generic C compiler. Numba makes optimising inner numeric loops easy.
Highlights: jit-compiles plain python, so it’s easy to use normal debuggers then switch on the compiler for performance improvements using the @jit decorator. Generic dispatch using the @generated_jit decorator. Compiles to multi-core vectorisations as well as CUDA. In principle this means you can do your calculations on the GPU.
Problems: LLVM is a shifty beast and sensitive version dependencies are annoying. Documentation is a bit crap, or at least unfriendly to outsiders. Practically, getting performance out of a GPU is trickier than working out you can optimise away one tedious matrix op, and doing it at this level is hard. There is too much messing with details of how many processors to allocate what to.
You might find it easier to use julia if a well-maintained and documented LLVM infrastructure is a real selling point for you.
Packaging and environments
Not so hard, but confusing and chaotic due to many long-running disputes only lately resolving.
Least nerdview guide ever: Vicki Boykis, Alice in Python projectland.
Simplest readable guide is python-packaging
Kenneth Reitz shows rather than tells with a heavily documented setup.py
Try Zed Shaw’s signature aggressively cynical and reasonably practical explanation of project structure, with bonus explication of how you should expect much time wasting yak shaving like this if you want to do software.
The distribution you use if you want to teach a course in numerical python without dicking around with a 5 hour install process.
Has a slightly different packaging workflow. See, e.g. Tim Hopper’s workflow which explains this environment.yml malarkey, or the creators’ rationale.
The upshot for the end user is that if you want to install something with tricky dependencies like ViTables, you do this:
Aside: Do you use fish shell? You need to do some extra setup. Specifically, add the line
source (conda info --root)/etc/fish/conf.d/conda.fish
NB Conda will fill up your hard disk if not regularly disciplined via conda clean:
conda clean -pt
You might also want to not have the gigantic MKL library installed, of which I am in any case not a fan. You can usually disable it per request:
conda create -n pynomkl python nomkl
Note that the packagers claim MKL is a 100MB library, which is an error so large that it should raise concerns about their competence to distribute a numerical calculations package. 100MB? Poppycock. The package alone is 800MB, and it’s even bigger installed. That’s not counting the fact that conda will keep many many versions around if you are not assiduous in your conda clean habits. MKL alone was using about 10GB total on my machine when I last checked, which is two orders of magnitude off what the unwary might assume.
Reducing the harm of this kind of nonsense is one reason you should only ever install the minimalist miniconda as your base anaconda distribution, cautiously adding things as you need them.
Python environment/version management
venv is now a built-in python virtual environment system in python 3. It doesn’t support python 2 but fixes various problems, e.g. it supports framework python on macOS which is very important for GUIs, and is covered by the python docs in the python virtual environment introduction. It has a higher-level, er, …wrapper (?) called pipenv.
    # Create venv
    python3 -m venv ~/.virtualenvs/learning_gamelan_keras_2
    # Use venv from fish
    source ~/.virtualenvs/learning_gamelan_keras_2/bin/activate.fish
    # Use venv from bash
    source ~/.virtualenvs/learning_gamelan_keras_2/bin/activate
Python environment management management
One suggestion I’ve had is to use pyenv, which eases and automates switching between all the other weird python environments created by virtualenv, python.org python, os python, anaconda python etc.
BUT WHO MANAGES THE VIRTUALENV MANAGER MANAGER?
Asynchrony in python
See asynchronous python.
Watching files for changes
Does this inotify solution work for non-Linux? Because macOS uses FSEvents and Windows uses I-don’t-even-know.
watchdog asserts that it is cross-platform. (source)
Python 2 vs 3
Are you old? New to python 3?
Sebastian Raschka, The key differences between Python 2.7.x and Python 3.x with examples
Neat python 3 stuff
Alex Rogozhnikov, Python 3 with pleasure highlights some new tricks which landed recently.
Useful for us is a friendlier python struct-like thing, the data class, which Geir Arne Hjelle explains.
Override module accessors.
Asynchrony is less awful.
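To make the data class item above concrete, here is a minimal sketch for python 3.7+. The `Measurement` class and its fields are made up for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Measurement:
    """Hypothetical record type, for illustration only."""

    sensor: str
    value: float
    unit: str = "V"
    # Mutable defaults need a factory, as with ordinary default arguments.
    tags: list = field(default_factory=list)


m = Measurement("probe-a", 1.25)
print(m)

# __init__, __repr__ and __eq__ are generated from the annotations.
same = m == Measurement("probe-a", 1.25)
```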
Python 2 and 3 compatibility
TL;DR: I am no employee of a giant enterprise-type business with a gigantic legacy code base, and so I don’t use python 2. My code is not python 2 compatible. Python 3 is more productive, and no-one is paying me to be less productive right now. Python 2 code is usually easy to port to python 3. It is possible to write code which is compatible with python 2 and 3, but then I would miss out on some of the work that has gone into making python 3 easier and better, and waste time porting elegant easy python 3 things to hard boring python 2 things.
Writing python 2 and 3 compatible code.
Python 3.6+ includes type hinting, and projects such as mypy support static analysis using type hints. There are not yet many tutorials on the details of this. Here’s one.
Short version: you go from this:
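A hedged sketch of the kind of change meant here. The function names are hypothetical, and the annotations are only checked if you run a static checker such as mypy over the file; python itself ignores them at runtime.

```python
from typing import List


# Untyped: nothing tells the reader or a checker what `values` should be.
def mean_untyped(values):
    return sum(values) / len(values)


# Typed: same behaviour at runtime, but a static checker can now
# complain about a call such as mean("not a list").
def mean(values: List[float]) -> float:
    return sum(values) / len(values)


result = mean([1.0, 2.0, 3.0])
print(result)
```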
However, if I am going to that much trouble, I would in fact rather use julia, which takes type-hinting further, using it to JIT-compile optimised code.
Command line parsers
Another bike-shedding danger zone is command-line parsing, leading to the need to spend too much time parsing command line parsers rather than parsing command lines.
argparse is built into the python stdlib and is OK, so why not just use that and avoid other dependencies? Answer: a dependency you might already have is likely to have introduced another CLI parsing library.
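For reference, a minimal argparse sketch. The tool and its options are made up; the parser is fed an explicit argument list instead of `sys.argv` so that the example runs anywhere.

```python
import argparse


def build_parser():
    # Hypothetical command-line tool, for illustration only.
    parser = argparse.ArgumentParser(description="Hypothetical number-crunching tool.")
    parser.add_argument("infile", help="path to the input file")
    parser.add_argument("-n", "--iterations", type=int, default=10,
                        help="number of iterations to run")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print progress information")
    return parser


# In a real script this would be build_parser().parse_args().
args = build_parser().parse_args(["data.csv", "-n", "3", "--verbose"])
print(args.infile, args.iterations, args.verbose)
```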
“Hydra is a framework for elegantly configuring complex applications”
Builds CLIs with autocomplete and other fun stuff.
is a Python package for creating beautiful command line interfaces in a composable way with as little code as necessary. It’s the “Command Line Interface Creation Kit”. It’s highly configurable but comes with sensible defaults out of the box.[…]
- arbitrary nesting of commands
- automatic help page generation
- supports lazy loading of subcommands at runtime
Aims to offer an alternative to the built-in argparse, which they regard as excessively magical. Its special feature is setuptools integration enabling installation of command-line tools from your current ipython virtualenv.
provides a clean, high level API for running shell commands and defining/organizing task functions from a tasks.py file[…] it offers advanced features as well – namespacing, task aliasing, before/after hooks, parallel execution and more.
argh was/is a popular extension to argparse
Argh is fully compatible with argparse. You can mix Argh-agnostic and Argh-aware code. Just keep in mind that the dispatcher does some extra work that a custom dispatcher may not do.
clip.py comes with a passive-aggressive app name, (+1) is all about wrapping generic python commands in command-line applications easily, much like
Miscellaneous stuff I always need to look up
tqdm seems to be a de facto standard.
Also works in jupyter
    > pip install ipywidgets
    > jupyter nbextension enable --py widgetsnbextension
    > jupyter labextension install @jupyter-widgets/jupyterlab-manager
- Pext is a python extension to script handy things in a handy GUI.