The Living Thing / Notebooks : Jupyter, the least excruciating compromise between irreproducible science and horrifying your colleagues

jupyter notebook in action

The python-derived entrant in the scientific workbook field is jupyter.

It works with Python, Julia, R and various other languages. Jupyter allows easy(ish) online-publication-friendly worksheets, which are both interactive and easy to export for static online use. Handy.

To install it, see the jupyter homepage.

UIs

The default jupyter ships as a browser-based coding environment. You can also access it using

Pro tips

Graphs

Set up inline plots:

%matplotlib inline

inline svg:

%config InlineBackend.figure_format = 'svg'

Graph sizes are controlled by matplotlib. Here’s how to make big graphs:

import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (10.0, 8.0)

Interesting-looking other extensions:

Graphviz: it renders in jupyter. (see also other jupyter options)

IOPub data rate exceeded.

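And you were just viewing a big image, not a zillion images? It’s jupyter being conservative: version 5.0 of the notebook introduced a default cap on the output data rate. Generate a config file and raise NotebookApp.iopub_data_rate_limit in it: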

jupyter notebook --generate-config
atom ~/.jupyter/jupyter_notebook_config.py
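
If you would rather not edit the file by hand, here is a minimal sketch that appends the setting for you. The helper name raise_iopub_limit and the value 1.0e10 are my own assumptions (any sufficiently large number of bytes/sec should do); the option name iopub_data_rate_limit is the one the error message mentions.

```python
from pathlib import Path

# The line that ends up in jupyter_notebook_config.py.
# Jupyter supplies the `c` object when it loads the config file.
# 1.0e10 bytes/sec is an arbitrary "big enough" value, not a recommendation.
SETTING = "c.NotebookApp.iopub_data_rate_limit = 1.0e10\n"


def raise_iopub_limit(config_path):
    """Append SETTING to the config file unless an active setting exists.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    # Ignore commented-out defaults; only an uncommented line counts as set.
    already_set = any(
        line.strip().startswith("c.NotebookApp.iopub_data_rate_limit")
        for line in text.splitlines()
    )
    if already_set:
        return False
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + SETTING)
    return True
```

Usage would be something like raise_iopub_limit("~/.jupyter/jupyter_notebook_config.py"), followed by restarting the notebook server. Pasting the SETTING line into the file yourself does the same job.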

Interactive visualisations/simulations etc

You’re looking for ipywidgets.

See also the announcement: Jupyter supports interactive JS widgets, which is by far the easiest Python UI system I have seen, basic though it is.

Pro tip: if you want a list of the available widgets:

from ipywidgets import widget
widget.Widget.widget_types
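
To give a flavour of how little code a basic interactive control needs, here is a minimal sketch using interact from ipywidgets. The functions damped_wave, summarise and show_controls are hypothetical names made up for illustration, and the slider ranges are arbitrary. The ipywidgets import is done lazily inside show_controls so the rest of the code still loads where ipywidgets is not installed.

```python
import math


def damped_wave(t, freq=1.0, decay=0.5):
    """Value at time t of a sine wave whose amplitude decays exponentially."""
    return math.exp(-decay * t) * math.sin(2 * math.pi * freq * t)


def summarise(freq=1.0, decay=0.5):
    """Sample the wave on [0, 1] and return a one-line summary string."""
    samples = [damped_wave(i / 100, freq, decay) for i in range(101)]
    return f"freq={freq:.1f} decay={decay:.1f} peak={max(samples):.3f}"


def show_controls():
    """Build one slider per argument of summarise. Call from a notebook cell."""
    from ipywidgets import interact  # lazy import: only needed in the notebook

    # A (min, max, step) tuple of floats is turned into a float slider.
    return interact(summarise, freq=(0.5, 5.0, 0.5), decay=(0.0, 2.0, 0.1))
```

Calling show_controls() in a notebook cell should give two sliders, with summarise re-run and its return value redisplayed each time one moves. Swapping the summary string for a matplotlib plot is the obvious next step.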

Presentations using Jupyter

Exporting notebooks

Citations in jupyter

tl;dr I did this for

  • my blog using simple markdown citation export from zotero, which is not great for inline citations but fine for bibliographies, and very easy and robust.
  • my papers using the nbconvert option, which works fine.
  • Sylvain Deville recommends treating jupyter as a glorified markdown editor and then using pandoc, which is an OK workflow if you are producing a once-off paper, but not for a repeatedly updated blog.

  • nbconvert has built-in citation support but only for LaTeX output. Citations look like this:

    <cite data-cite="granger2013">(Granger, 2013)</cite>
    

    or even

    <strong data-cite="granger2013">(Granger, 2013)</strong>
    

    The template defines the bibliography source and looks like:

    ((*- extends 'article.tplx' -*))
    
    ((* block bibliography *))
    ((( super() )))
    \bibliographystyle{unsrt}
    \bibliography{refs}
    ((* endblock bibliography *))
    

    And building looks like this (assuming the template above is saved as print.tplx):

    jupyter nbconvert --to latex --template=print.tplx mynotebook.ipynb
    

    As above, it helps to know how the document templates work.

    Note that even in the best case you don’t have access to natbib-style citation, so auto-year citation styles will look funky.

  • Speaking of custom templates, the nbconvert setup is customisable for more than LaTeX. For example, an HTML template:

    {% extends 'full.tpl'%}
    {% block any_cell %}
        <div style="border:thin solid red">
            {{ super() }}
        </div>
    {% endblock any_cell %}
    
  • But how about for online? cite2c seems to do this by live-inserting citations from zotero, including author-year stuff. (Requires Jupyter notebook 4.2 or better, which might require a pip install --upgrade notebook)

    Julius Schulz gives a comprehensive config for this and everything else.

    This workflow is smooth for directed citing, but note that there is no way to include a bibliography except by citation, so you have to namecheck every article; and the citation keys it uses are zotero citation keys which are nothing like your bibtex keys so can’t really be manually edited.

  • ipyBibtex.ipynb? Looks like this:

    %%cite
    Lorem ipsum dolor sit amet
    __\citep{hansen1982,crealkoopmanlucas2013}__,
    consectetuer adipiscing elit,
    sed diam nonummy nibh euismod tincidunt
    ut laoreet dolore magna aliquam erat volutpat.
    

    So it supports natbib-style author-year citations! But it’s a small, unmaintained package, so it is risky.

  • work out how Mark Masden got citations working?
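
To make the nbconvert citation markup above concrete, here is a minimal sketch of the idea: find the data-cite tags in a markdown cell and swap each one for a LaTeX \cite command. This is an illustration only, not nbconvert's actual implementation; the names CITE_TAG, cite_tags_to_latex and cited_keys are made up, and the regex assumes the tags are written exactly as in the examples above.

```python
import re

# Matches <cite data-cite="key">...</cite> and the <strong ...> variant.
# Group 1 is the tag name, group 2 is the citation key.
CITE_TAG = re.compile(
    r'<(cite|strong)\s+data-cite="([^"]+)"\s*>.*?</\1>',
    re.DOTALL,
)


def cite_tags_to_latex(markdown):
    """Replace each data-cite tag with \\cite{key}, dropping the display text."""
    return CITE_TAG.sub(lambda m: r"\cite{" + m.group(2) + "}", markdown)


def cited_keys(markdown):
    """Return the citation keys used, in order of first appearance."""
    keys = []
    for match in CITE_TAG.finditer(markdown):
        key = match.group(2)
        if key not in keys:
            keys.append(key)
    return keys


cell = 'As shown by <cite data-cite="granger2013">(Granger, 2013)</cite>.'
print(cite_tags_to_latex(cell))
print(cited_keys(cell))
```

The key list is handy for checking that every key in a notebook actually exists in your refs.bib before you run the LaTeX build.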

Offline mathjax in jupyter

Hmmmm. Try:

python -m IPython.external.mathjax /path/to/source/mathjax.zip

Hosting live jupyter notebooks

The full version gives online notebooks, even multi-user notebook servers.

  • Here’s an example of how you would get live (dynamic) ones running on Amazon for free or cheap
  • sagemath runs notebooks online, with fancy features starting at $7/month. Messy design but tidy open-source ideals.
  • wakari still hosts notebooks online, and I think you can still give them $10/month to keep a notebook running for you. However, their purchase page is hard to find now that they are in the middle of being bought out. Here it is. Slick, but are they sticking around? They’ve recently removed all their fancy cluster plans so I don’t trust ’em to keep on existing.
  • Anaconda.org appears to be a python package development service, but they also have a sideline in hosting notebooks. ($7/month) Requires you to use their anaconda python distribution tools to work, which is… a plus and a minus. The anaconda python distro is simple for scientific computing, but if your hard disk is as full of python distros as mine is, you tend not to want yet another one confusing things and wasting disk space.