The Living Thing / Notebooks :


The least excruciating compromise between 1) irreproducible science, and 2) spooking your colleagues with something too new-fangled

jupyter notebook in action

The python-derived entrant in the scientific workbook field is called jupyter.

Works with python/julia/r/various. Jupyter allows easy(ish) online-publication-friendly worksheets, which are both interactive and easy to export for static online use. This is handy. So handy that it’s sometimes worth the many agonising broken things that you encounter while trying to benefit from this.

To install it, see the jupyter homepage.



Jupyter is a whole ecology of different language backend kernels talking to various frontend executors

Notebook classic

First kill its parenthesis molestation function with fire. Unless you like having to fight with your IDE’s asinine faith in its ability to read your mind. The setting is tricky to find, because it is not called “put syntax errors in my code without me asking Y/N”, but instead cm_config.autoCloseBrackets. According to a support ticket this should work.

# Run this in Python once, it should take effect permanently
from notebook.services.config import ConfigManager
c = ConfigManager()
c.update('notebook', {"CodeCell": {"cm_config": {"autoCloseBrackets": False}}})

or add the following to your custom.js:

define([
    'base/js/namespace'
], function(Jupyter) {
    Jupyter.CodeCell.options_default.cm_config.autoCloseBrackets = false;
});

(That doesn’t work with jupyterlab, which would instead like you to go fuck yourself. Just wait for the syntax errors.)
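For the curious, the ConfigManager call above does nothing more magical than merge a nested dict into a JSON file, typically ~/.jupyter/nbconfig/notebook.json. Here is a rough stdlib-only sketch of that merge, assuming that is approximately what happens under the hood. `recursive_update` and `update_section` are hypothetical helper names rather than the notebook API, and a temporary directory stands in for the real config directory.

```python
import json
import tempfile
from pathlib import Path


def recursive_update(target, new):
    """Merge `new` into `target` in place, descending into nested dicts."""
    for key, value in new.items():
        if isinstance(value, dict):
            # Make sure there is a dict to descend into before recursing.
            if not isinstance(target.get(key), dict):
                target[key] = {}
            recursive_update(target[key], value)
        else:
            target[key] = value
    return target


def update_section(config_dir, section, new_settings):
    """Merge settings into <config_dir>/<section>.json and return the result."""
    path = Path(config_dir) / f"{section}.json"
    current = json.loads(path.read_text()) if path.exists() else {}
    recursive_update(current, new_settings)
    path.write_text(json.dumps(current, indent=2))
    return current


# A temporary directory stands in for the real nbconfig directory.
demo_dir = tempfile.mkdtemp()
update_section(demo_dir, "notebook", {"CodeCell": {"cm_config": {"lineNumbers": True}}})
merged = update_section(
    demo_dir, "notebook", {"CodeCell": {"cm_config": {"autoCloseBrackets": False}}}
)
print(json.dumps(merged, indent=2))
```

The point of the nested merge is that switching off bracket closing leaves any other CodeMirror settings you already had in place.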

So, onto other stuff.

  • Julius Schulz’s ultimate setup guide is also the ultimate pro tip compilation.

  • Jupyter classic is more usable if you install the notebook extensions, which include, e.g., drag-and-drop image support.

    $ pip install --upgrade jupyter_contrib_nbextensions
    $ jupyter contrib nbextension install --user

    For example, if you run nbconvert to generate an HTML file, an image added this way will remain outside of the HTML file. You can embed all images by calling nbconvert with the EmbedPostProcessor.

    $ jupyter nbconvert --post=embed.EmbedPostProcessor

    Update - broken in Jupyter 5.0

  • Wait, that was still pretty confusing; I need the notebook configurator whatsit.

    $ pip install --upgrade jupyter_nbextensions_configurator
    $ jupyter nbextensions_configurator enable --user
  • the location of theming, widgets CSS etc has moved of late; check your version number. The current location is ~/.jupyter/custom/custom.css, not the former location ~/.ipython/profile_default/static/custom/custom.css

Jupyter lab

jupyter lab is the current cutting edge, and reputedly is much nicer to develop plugins for than the notebook interface. From the user perspective it’s more or less the same thing, but the annoyances are different. It does not strictly dominate notebook in terms of user experience.

I’m not a huge fan of how jupyter lab reinvents so many wheels. The mysterious curse of javascript development is that once you have tasted it, you cannot resist the urge to reimplement something that already worked as a crappier javascript version. These folks attempt to reinvent copy, paste, search/replace, browser tabs and the command line, all of which were invented before and work fine. Moreover, because I am used to how they work, it would have to be an astonishingly gigantic improvement in each of those to be worth my learning their new system. Needless to say, gigantic improvements are not delivered, but rather some unintuitive tradeoffs like a search function which non-deterministically sometimes does regexp matching. I have a suspicion they want to re-invent text editors too, which would depress me if so.

You should probably live with this, because the developer API is supposed to be cleaner and easier to work with, so that’s probably good long term in terms of actual improvements the thing might deliver.

  • Personal peeve: As presaged, jupyter lab loves bracket molesting, and has made that particular form of syntax error introduction compulsory as a test of your commitment.

  • jupyterlab-toc allows you to navigate your lab notebook by markdown section headings.

    jupyter labextension install @jupyterlab/toc
  • integrated diagram editor? Someone integrated drawio as jupyterlab-drawio to prove a point.

    jupyter labextension install jupyterlab-drawio
  • latex editor? Note, I think this is a terrible idea. There are better editors than jupyter, better means of scientific communication than latex, and better specific latex tooling, but I will concede there is some kind of situation where this sweet spot of mediocrity might be useful, if only as a plot point in the script of the kind of highly contrived techno-thriller written by cloistered nerds.

    jupyter labextension install @jupyterlab/latex

Rich display

IPython supports rich display of various python objects, e.g. via IPython.display.Image

from IPython.display import Image
Image(filename="figure.png")  # hypothetical local file name

or you can use markdown for local image display

![alt text](figure.png)

If you want to make your own objects display, uh, richly, you can implement the appropriate magical methods

class Shout(object):
    def __init__(self, text):
        self.text = text

    def _repr_html_(self):
        return "<h1>" + self.text + "</h1>"

I leverage this to make a latex renderer called latex_fragment which you should totally check out for rendering inline algorithms, or for emitting SVG equations.
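The same trick extends to other output formats: as far as I understand the display protocol, the frontend asks an object for every `_repr_*_` method it defines and then picks the richest representation it can show. A minimal sketch, where `Ratio` is a made-up class for illustration only:

```python
import html


class Ratio:
    """Toy object that can render itself as plain text, HTML or LaTeX."""

    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __repr__(self):
        # Plain-text fallback for terminals and non-rich frontends.
        return f"Ratio({self.numerator!r}, {self.denominator!r})"

    def _repr_html_(self):
        # Escape the values so they cannot inject markup into the page.
        num = html.escape(str(self.numerator))
        den = html.escape(str(self.denominator))
        return f"<sup>{num}</sup>&frasl;<sub>{den}</sub>"

    def _repr_latex_(self):
        # Picked up by frontends and exporters that prefer LaTeX.
        return rf"$\frac{{{self.numerator}}}{{{self.denominator}}}$"


half = Ratio(1, 2)
print(half._repr_html_())
print(half._repr_latex_())
```

In a notebook you would simply leave `half` as the last expression in a cell rather than calling the methods by hand.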

Custom kernels

jupyter looks for kernel specs in a kernel spec directory, depending on your platform.

Say your kernel is dan:

See the manual.
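As a rough sketch of what such a spec contains: a kernel spec is a directory named after the kernel, holding a kernel.json that tells jupyter how to launch it. The snippet below writes a minimal one for a Python kernel called dan. The helper `write_kernel_spec` and the display name are made up for illustration, and the spec is written to a temporary directory rather than a real kernel spec directory (`jupyter kernelspec list` shows where the real ones live on your platform).

```python
import json
import sys
import tempfile
from pathlib import Path


def write_kernel_spec(kernels_dir, name, display_name):
    """Write a minimal kernel.json for a Python kernel under kernels_dir/name."""
    spec = {
        # Command used to start the kernel; jupyter substitutes
        # {connection_file} at launch time.
        "argv": [sys.executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    spec_dir = Path(kernels_dir) / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec_path = spec_dir / "kernel.json"
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec_path


# Temporary directory standing in for a real kernel spec directory.
demo_dir = tempfile.mkdtemp()
path = write_kernel_spec(demo_dir, "dan", "Dan's Python")
print(path.read_text())
```

Pointing `argv` at a different interpreter (say, one inside a virtualenv) is the usual reason to bother with a custom spec at all.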

There is even a MATLAB bridge


Set up inline plots:

%matplotlib inline

inline svg:

%config InlineBackend.figure_format = 'svg'

Graph sizes are controlled by matplotlib. Here’s how to make big graphs:

import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (10.0, 8.0)

Interesting-looking other extensions:

Jupyter lab includes such nifty features as a diagram editor which you can install using jupyter labextension install jupyterlab-drawio

Exporting notebooks

Presentations using Jupyter

Citations and other academic writing in Jupyter

tl;dr I did this for

  • my blog — using simple Zotero markdown citation export, which is not great for inline citations but fine for bibliographies, and very easy and robust.
  • my papers — using the ipypublish option, which works ok, but is annoying for citations
  • my papers — using the Pweave option, which works amazingly for everything if you use pandoc tricks for your citations.

I couldn’t find a unified approach for these two different use cases which didn’t sound like more work than it was worth. At least, many academics seem to have way more tedious and error-prone workflows than this, so I’m just being a fancy pants if I try to tweak it further.

  • Pweave by Matti Pastell is a clone of knitr:

    Pweave is a scientific report generator and a literate programming tool for Python. It can capture the results and plots from data analysis and works well with numpy, scipy and matplotlib.

    Documented by Max Masnick. Whee, executable markdown pages.

  • Chris Sewell has produced a script called ipypublish that eases some of the pain points in producing articles. It’s an impressive piece of work. (See the comments for some additional pro-tips for this.)

  • My own latex_fragment allows you to insert 1-off latex fragments into jupyter and pweave (e.g. algorithmic environments or some weird tikz thing)

  • Jean-François Bercher’s jupyter_latex_envs reimplements various latex markup as native jupyter including \cite.

  • Sylvain Deville recommends treating jupyter as a glorified markdown editor and then using pandoc, which is an OK workflow if you are producing a once-off paper, but not for a repeatedly updated blog.

  • nbconvert has built-in citation support but only for LaTeX output. Citations look like this:

    <cite data-cite="granger2013">(Granger, 2013)</cite>

    or even

    <strong data-cite="granger2013">(Granger, 2013)</strong>

    The template defines the bibliography source and looks like:

    ((*- extends 'article.tplx' -*))
    ((* block bibliography *))
    ((( super() )))
    ((* endblock bibliography *))

    And building looks like:

    jupyter nbconvert --to latex --template=print.tplx mynotebook.ipynb

    As above, it helps to know how the document templates work.

    Note that even in the best case you don’t have access to natbib-style citation, so auto-year citation styles will look funky.

  • Speaking of custom templates, the nbconvert setup is customisable for more than latex.

    {% extends 'full.tpl'%}
    {% block any_cell %}
        <div style="border:thin solid red">
            {{ super() }}
        </div>
    {% endblock any_cell %}
  • but how about for online? cite2c seems to do this by live inserting citations from zotero, including author-year stuff. (Requires Jupyter notebook 4.2 or better which might require a pip install --upgrade notebook)

    Julius Schulz gives a comprehensive config for this and everything else.

    This workflow is smooth for directed citing, but note that there is no way to include a bibliography except by citation, so you have to namecheck every article; and the citation keys it uses are zotero citation keys which are nothing like your bibtex keys so can’t really be manually edited.

  • if you are customising the output of jupyter’s nbconvert, you should be aware that the {% block output_prompt %} override doesn’t actually do anything in the templates I use. (Slides, HTML, LaTeX). Instead you need to use a config option:

    $ jupyter nbconvert --to slides some_notebook.ipynb \
       --TemplateExporter.exclude_output_prompt=True \
       --post serve

    I had to use the source to discover this.

  • ipyBibtex.ipynb? Looks like this:

    Lorem ipsum dolor sit amet
    consectetuer adipiscing elit,
    sed diam nonummy nibh euismod tincidunt
    ut laoreet dolore magna aliquam erat volutpat.

    So it supports natbib-style author-year citations! But it’s a small, unmaintained package so is risky.

  • work out how Mark Masden got citations working?
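To make the nbconvert citation markup above less mysterious: on LaTeX export, a tag carrying a data-cite attribute ends up as a \cite{key} command. Here is a regex approximation of that rewrite, assuming well-formed tags. It is not nbconvert's own filter, and `cite_tags_to_latex` is a hypothetical helper name.

```python
import re

# Matches any tag carrying a data-cite attribute,
# e.g. <cite data-cite="key">...</cite> or <strong data-cite="key">...</strong>
CITE_TAG = re.compile(
    r'<(?P<tag>\w+)\s+data-cite="(?P<key>[^"]+)"\s*>.*?</(?P=tag)>',
    re.DOTALL,
)


def cite_tags_to_latex(markdown_text):
    """Replace data-cite tags with a LaTeX cite command (regex approximation)."""
    # A function replacement sidesteps backslash escaping in the template.
    return CITE_TAG.sub(lambda m: r"\cite{" + m.group("key") + "}", markdown_text)


cell = 'Notebooks are great <cite data-cite="granger2013">(Granger, 2013)</cite>.'
print(cite_tags_to_latex(cell))
```

Note that the human-readable text inside the tag is thrown away, which is why the rendered citation style is entirely up to your LaTeX bibliography setup.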

Interactive visualisations/simulations etc

Jupyter allows interactions! This is by far the easiest python UI system I have seen, for all that it is basic.

Official Manual: ipywidgets.

pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension

See also the announcement: Jupyter supports interactive JS widgets, where they discuss the data binding module in terms of javascript UI thingies.

Pro tip: If you want a list of widgets

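from ipywidgets import widget
widget.Widget.widget_types  # registry of widget classes (ipywidgets 7 and earlier; later versions may differ)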

External event loops

External event loops are now easy and documented. What they don’t say outright is that if you want to use the tornado event loop, relax because both the jupyter server and the ipython kernel already use the pyzmq event loop which subclasses the tornado one.

If you want to make this work smoothly without messing around with passing ioloops everywhere, you should make zmq install itself as the default loop:

from zmq.eventloop import ioloop
ioloop.install()

Now, your asynchronous python should just work using tornado coroutines.

NB with the release of the latest asyncio and tornado and various major version incompatibilities, I’m curious how smoothly this all still works.
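For comparison, here is what the same idea looks like with plain asyncio coroutines, on the assumption that you are on a recent stack where tornado runs on top of the asyncio loop. `fetch_value` is a hypothetical stand-in for real asynchronous work such as a network request.

```python
import asyncio


async def fetch_value(delay, value):
    """Stand-in for real asynchronous work, e.g. a network request."""
    await asyncio.sleep(delay)
    return value


async def main():
    # Run both coroutines concurrently; results come back in call order,
    # not in completion order.
    return await asyncio.gather(fetch_value(0.02, "slow"), fetch_value(0.01, "fast"))


# In a plain script the loop has to be started explicitly. In a recent
# notebook kernel a loop is already running, so you would write
# `await main()` in a cell instead.
results = asyncio.run(main())
print(results)
```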

Hosting live jupyter notebooks on the internet

Jupyter can host online notebooks, even multi-user notebook servers - if you are brave enough to let people execute weird code on your machine.

Commercial notebook hosts


This section is outdated. TBD; I should probably mention the ill-explained Kaggle kernels and google cloud ML execution of same, etc.

  • Here’s an example of how you would get live (dynamic) ones running on Amazon for free or cheap
  • sagemath runs notebooks online, with fancy features starting at $7/month. Messy design but tidy open-source ideals.
  • appears to be a python package development service, but they also have a sideline in hosting notebooks ($7/month). It requires you to use their anaconda python distribution tools, which is… a plus and a minus. The anaconda python distro is simple for scientific computing, but if your hard disk is as full of python distros as mine is, you tend not to want yet another one adding confusion and wasting disk space.

Miscellaneous tips and gotchas


This is all built on ipython so you invoke the debugger ipython-style, specifically:

from IPython.core.debugger import Tracer; Tracer()()      # < 5.1
from IPython.core.debugger import set_trace; set_trace()  # >= v5.1

IOPub data rate exceeded.

You got this error and you weren’t doing anything that bandwidth intensive? Say, you were just viewing a big image, not a zillion images? It’s jupyter being conservative in version 5.0.

jupyter notebook --generate-config
atom ~/.jupyter/

update the c.NotebookApp.iopub_data_rate_limit to be big, e.g. c.NotebookApp.iopub_data_rate_limit = 10000000.

This is fixed after 5.0.


jupyter diffing and merging is painful. Workaround: nbdime provides diffing and merging for notebooks. It has git integration:

nbdime config-git --enable --global
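If you are wondering why plain text diffs are so painful here: a notebook is a JSON file that stores outputs and execution counters next to your code, so re-running a cell changes the file even when the code is untouched. A small stdlib illustration, where `make_notebook` builds a hand-rolled, simplified notebook dict rather than using nbformat:

```python
import difflib
import json


def make_notebook(execution_count, output_text):
    """Build a minimal one-cell notebook as a plain dict (simplified layout)."""
    return {
        "cells": [{
            "cell_type": "code",
            "execution_count": execution_count,
            "metadata": {},
            "outputs": [{"name": "stdout", "output_type": "stream", "text": [output_text]}],
            "source": ["print(random.random())"],
        }],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def text_diff(nb_a, nb_b):
    """Changed lines of the serialised JSON, which is what plain git sees."""
    a = json.dumps(nb_a, indent=1, sort_keys=True).splitlines()
    b = json.dumps(nb_b, indent=1, sort_keys=True).splitlines()
    return [line for line in difflib.unified_diff(a, b, lineterm="", n=0)
            if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]


# Same source code, re-run once: only the counter and the output changed.
changes = text_diff(make_notebook(1, "0.42\n"), make_notebook(2, "0.17\n"))
print("\n".join(changes))
```

Every changed line is noise from the point of view of code review, and that noise is what a notebook-aware diff tool is there to sort out.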

Offline mathjax in jupyter

python -m IPython.external.mathjax /path/to/source/