Plotting stuff in Julia

Usefulness: 🔧 🔧
Novelty: 💡
Uncertainty: 🤪 🤪 🤪
Incompleteness: 🚧 🚧 🚧

Julia is a hip language for computational types; therefore it has cut-throat gangs of competing hip graphing technologies.

The Julia wiki includes worked examples using various engines.

A curious feature of many Julia toolkits is that they produce SVG graphics by default. SVG is beautiful for printing, or for small data, but the files easily become gigantic for even medium-large data sets and start flogging your CPU/RAM real hard. For exploratory data analysis you will probably want to disable SVG in favour of some rasterised format like PNG, unless you are careful. Same goes for “interactive” JS stuff: it’s only interactive if your browser can render it without crashing. Work that out by rendering to PNG before you try the fancy SVG stuff.
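To get a feel for why vector output balloons, here is a rough sketch that hand-builds an SVG scatter plot with one `<circle>` element per point and reports how big the result is. The helper `svg_scatter` is hypothetical and is not how any of the packages below actually emit SVG; it only illustrates that SVG size grows with the number of points, while a raster image of fixed dimensions has a bounded size.

```julia
# Hypothetical helper: a bare-bones SVG scatter plot, one <circle> per point.
function svg_scatter(xs, ys; width=600, height=400)
    io = IOBuffer()
    print(io, "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"$width\" height=\"$height\">")
    for (x, y) in zip(xs, ys)
        # Coordinates are assumed to lie in [0, 1]; scale them to pixels.
        print(io, "<circle cx=\"", round(x * width, digits=2),
                  "\" cy=\"", round(y * height, digits=2), "\" r=\"1\"/>")
    end
    print(io, "</svg>")
    return String(take!(io))
end

for n in (1_000, 10_000, 100_000)
    svg = svg_scatter(rand(n), rand(n))
    println(n, " points: ", round(sizeof(svg) / 1024, digits=1), " KiB of SVG")
end

# An uncompressed 600x400 RGB raster is the same size however many points it shows.
println("raster upper bound: ", round(600 * 400 * 3 / 1024, digits=1), " KiB")
```

Real plotting backends add axes, styling and grouping on top of this, so their SVG output will typically be larger still.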

Gadfly

The aspirational ggplot clone is Gadfly. It’s elegant, pretty, well integrated into the statistics DataFrame ecosystem, but missing some stuff, and has some weird gotchas.

I had to switch Gadfly from SVG to PNG or a similar rasterised format, as presaged, to avoid bloated Jupyter notebooks. First I needed to install the Fontconfig and Cairo bonus libraries:

using Pkg
Pkg.add("Gadfly")
Pkg.add("Cairo")
Pkg.add("Fontconfig")

Now to force PNG output:

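using Gadfly
import Cairo, Fontconfig  # loading these enables Gadfly's PNG backend

# `data` is a DataFrame with columns :gamma and :p
draw(
    PNG(150, 150),
    plot(data, x=:gamma, y=:p, Geom.point, Geom.line)
)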

Another weird characteristic is that Gadfly seemed slow on initial startup. This is survivable. But even after startup it does seem slow on some basic stuff. If I histogram a million data points it takes 30 seconds for each plot of that histogram. It saps the utility of an elegant graph grammar if it’s not responsive to your adjustments. I wonder if this can be improved?
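One possible workaround, which is an assumption and not something benchmarked here, is to bin the data yourself and hand Gadfly only the bin counts, so that it draws 50 bars instead of chewing through a million points. The helper `bincounts` below is hypothetical and written in plain Julia; the Gadfly call is shown only as a comment.

```julia
# Hypothetical helper: equal-width histogram counts, computed before plotting.
function bincounts(x::AbstractVector{<:Real}, nbins::Integer)
    lo, hi = extrema(x)
    hi > lo || throw(ArgumentError("need at least two distinct values"))
    width = (hi - lo) / nbins
    counts = zeros(Int, nbins)
    for v in x
        # Clamp so that the maximum value falls in the last bin.
        i = clamp(floor(Int, (v - lo) / width) + 1, 1, nbins)
        counts[i] += 1
    end
    centres = [lo + (k - 0.5) * width for k in 1:nbins]
    return centres, counts
end

x = randn(1_000_000)
centres, counts = bincounts(x, 50)

# Assumption: with Gadfly, Cairo and Fontconfig loaded, the 50 bars could be drawn with
#   draw(PNG("hist.png", 6inch, 4inch), plot(x=centres, y=counts, Geom.bar))
```

Whether this actually rescues interactivity depends on where Gadfly spends its time, which this sketch does not measure.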

Gadfly is based on a clever declarative vector graphics system called Compose.jl, which might be independently useful.
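For a taste of Compose on its own, here is a minimal sketch, assuming the Compose package is installed. It builds a small declarative picture and writes it to an SVG file; the file name and colour are arbitrary choices for illustration.

```julia
using Compose

# A filled circle inside the default context; coordinates are relative to the context.
picture = compose(context(), circle(0.5, 0.5, 0.3), fill("tomato"))

# Write it out as a 4cm by 4cm SVG (the file name is arbitrary).
draw(SVG("circle.svg", 4cm, 4cm), picture)
```

The picture is a tree of contexts, shapes and properties, which is what lets Gadfly assemble plots out of nested pieces.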

Plots.jl

Plots.jl wraps many other plotting packages as backends, although notably not Gadfly. The author explains this is because Gadfly does not support 3d or interactivity, although since I want neither of those things in general, especially for publication, this is a contraindication for the compatibility of our mutual priorities. I have had enough grief trying to persuade various mathematics departments that this whole “internet” thing is anything other than a PDF download engine; I don’t need to compromise my print output for animation, of all the useless fripperies. We tell you, this “multimedia” thing is a fad, and the punters will be going back to quill and vellum soon enough.

Anyway, one can totally shoehorn Plots into print-quality CMYK plots as well as web stuff, so disregard my grumping.
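As a minimal sketch of getting both screen and print output from the same plot, assuming Plots with the GR backend is installed: the file names and the toy data are made up for illustration, and this does not address CMYK conversion, which would need a separate step.

```julia
using Plots
gr()

# Toy data: a random walk.
p = plot(cumsum(randn(500)), label="random walk", xlabel="step", ylabel="value")

savefig(p, "walk.png")   # raster, for notebooks and quick looks
savefig(p, "walk.pdf")   # vector, for print
```

The output format is chosen from the file extension, so the plotting code itself does not change between the two.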

The Plots documentation is not fantastic, since it’s notionally simply wrapping some other plot libraries, and should defer to them. Except of course they all have their own terminology and APIs and what-have-you, so the whole system is confusing. You can more-or-less work it out by perusing the attributes documentation and checking specific backend examples, e.g. GR.
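Attributes can also be looked up from the REPL. A small sketch, assuming a reasonably recent Plots release in which the `plotattr` helper is available:

```julia
using Plots

plotattr(:Series)        # list the attributes that apply to a series
plotattr("markersize")   # describe one attribute, including its aliases
```

Which attributes a given backend actually honours is a separate question that these listings do not answer.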

Plots has a rich extensions ecosystem. PlotRecipes and StatPlots use the “Recipes” system (defined in RecipesBase) to provide macro(?)-based, data-specific plotting tools. For example, StatPlots causes Plots to have sensible DataFrame support.

Table-like data structures, […] are supported thanks to the macro @df which allows passing columns as symbols.

using StatsPlots  # called StatPlots in older releases
using DataFrames, IndexedTables
gr(size=(400,300))
df = DataFrame(a = 1:10, b = 10 .* rand(10), c = 10 .* rand(10))
@df df plot(:a, [:b :c], colour = [:red :blue])
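The Recipes system mentioned above can also be used directly for your own types. Here is a minimal sketch, assuming RecipesBase is installed; the `Trajectory` type and its fields are hypothetical, invented for illustration.

```julia
using RecipesBase

# Hypothetical user type: a time series with its sample times.
struct Trajectory
    t::Vector{Float64}
    x::Vector{Float64}
end

# The recipe tells Plots how to turn a Trajectory into plottable data.
# `-->` sets a default that the caller can still override.
@recipe function f(tr::Trajectory)
    seriestype --> :path
    xguide --> "time"
    yguide --> "position"
    tr.t, tr.x
end

# With Plots loaded, this would then work:
#   using Plots
#   plot(Trajectory(collect(0:0.1:1), rand(11)))
```

The point of the design is that a package defining `Trajectory` only needs the lightweight RecipesBase dependency, not Plots itself.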

Now, some backends.

GR

My default Plots.jl backend is GR. This wraps GR.jl which in turn wraps GR, a cross-platform visualisation framework:

GR is a universal framework for cross-platform visualization applications. It offers developers a compact, portable and consistent graphics library for their programs. Applications range from publication quality 2D graphs to the representation of complex 3D scenes.[…]

GR is essentially based on an implementation of a Graphical Kernel System (GKS) and OpenGL. […] The GR framework is especially suitable for real-time environments.

Anyway, here is one important tip: if you aren’t rendering graphs for publication output, but for say exploratory data analysis, switch to PNG from SVG because SVG is very large for images with lots of details.

ENV["GKS_ENCODING"] = "UTF-8"
using Plots: gr, default
gr()
default(fmt=:png)

Note that while I was there I also set the character encoding, because otherwise GR has weak character support. There are other workarounds such as inbuilt Greek characters and LaTeX:

using LaTeXStrings, Plots
plot(rand(20), lab=L"a = \beta = c", title=L"before\;f(\theta)\;and\;after")

If you want to suppress LaTeX rendering, ensure that your label string does not both start and end with $. I do this by padding with a trailing space.
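The padding trick can be wrapped up in a tiny helper. This is a sketch in plain Julia; the name `plainlabel` is hypothetical and not part of Plots or GR.

```julia
# Hypothetical helper: pad strings that would otherwise be treated as LaTeX.
function plainlabel(s::AbstractString)
    if startswith(s, "\$") && endswith(s, "\$")
        return s * " "   # trailing space: the string no longer ends with a dollar sign
    end
    return s
end

plainlabel("\$100 to \$200")   # unchanged, it does not end with a dollar sign
plainlabel("\$cost\$")         # gains a trailing space
```

You would then pass the result as a label, e.g. `lab=plainlabel(name)`, where `name` is whatever string your data supplies.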

If anything is flaky on first execution, you might need to check the GR installation instructions, which include such steps (on Ubuntu) as:

apt install libxt6 libxrender1 libxext6 libgl1-mesa-glx libqt5widgets5

The observant will notice that this requires root access on the machine. I’m sure there must be a workaround but I can’t be arsed discovering it right now.

Also, every time the version of GR/GR.jl increments there is a loooong recompilation process, which seems to be single-threaded and takes many minutes on my fancy 8-core machine. So be aware that it is not fast in every sense.

For all its smoothness and speed when it is up and running, GR Plots are not IMO all that beautiful and it is not clear how to make them beautiful, since beauty is hidden down at the toolkit layer. There is some deep metaphor here.

Plotly

Plots also targets JavaScript online back ends via Plotly, which is neat although I have no use for it at the moment. As mentioned previously, in my department this “online” nonsense is about as popular as communicating data through modulated flatulence, in that the two are approximately equivalent in terms of their contributions toward our professional performance metrics.

InspectDR

InspectDR does interaction-focussed plots that lean towards signal processing, simulation and time series analysis. Which is my jam.

Makie

Makie is an OpenGL-backed visualisation library so probably does great on screen-quality 3d and badly on print-quality 2d. Haven’t tried it, since learning the Plots.jl and Gadfly.jl APIs has filled up my brain.

Vega

Vega.jl binds to a JavaScript ggplot-alike, Vega. There is also a “higher level” binding called VegaLite.jl.

It’s unclear to me how either of these works with print graphics, and they are both lagging behind the latest version of the underlying Vega library, so I’m wary of them.

Others

Winston.jl has a brutalist, simple approach but seems to work.