The Living Thing / Notebooks : Numerical libraries

There are too many numerical libraries.

Here are some I am sifting through, skewed towards python-compatible ones.

General numerics

To mention: the LAPACK/Fortran ecology. (Scott Locklin’s summary will do, actually)

Most dense linear algebra work presently happens in something called LAPACK. If you want to calculate eigenvectors in Matlab, Incanter/Clojure, Python, R, Julia and what have you: it’s almost certainly getting piped through LAPACK. If you’re working in C, C++ or Fortran, you’re still probably doing in via LAPACK. If you’re doing linear algebra in Java, you are a lost soul, and you should seriously consider killing yourself, but the serious people I know who do this, they allocate blobs of memory and run LAPACK on the blobs. The same thing was true 25 years ago. In fact, people have basically been running their numerics code on LAPACK for over 40 years when it was called EISPACK and LINPACK. Quite a lot of it conforms with… Fortran 66; a language invented 50 years ago which is specifically formatted for punched cards. That is an astounding fact.

LAPACK is a sort of layercake on top of something called the BLAS (“Basic Linear Algebra Subroutines”). I think BLAS were originally designed as a useful abstraction so your Eigendecomposition functions have more elementary operations in common with your QR decomposition functions. This made LAPACK better than its ancestors from a code management and complexity point of view. It also made LAPACK faster, as it allowed one to use machine optimized BLAS when they are available, and BLAS are the type of thing a Fortran compiler could tune pretty well by itself, especially on old-timey architectures without complex cache latency. While most of these functions seem boring, they are in fact extremely interesting. Just to give a hint: there is a function in there for doing matrix-multiply via the Winograd version of the Strassen algorithm, which multiplies matrices in O(N^2.8) instead of O(N^3). It’s the GEMMS family of functions if you want to go look at it. Very few people end up using GEMMS, because you have to be a numerical linear algebraist to understand when it is useful enough to justify the O(N^0.2) speedup. For example, there are exactly no Lapack functions which call GEMMS. But GEMMS is there, implemented for you, just waiting for the moment when you might need to solve … I dunno, a generalized Sylvester equation where Strassen’s algorithm applies.

To mention: GSL.

Elemental is one entrant, supporting the newer things that modern matrix manipulation necessitates, including built-in sparse/compressive support, NNMF etc.

Elemental is an open-source library for distributed-memory dense and sparse-direct linear algebra and optimization which builds on top of BLAS, LAPACK, and MPI using modern C++ and additionally exposes interfaces to C and Python (with a Julia interface beginning development).

Armadillo is a popular C++ wrapper around the LAPACK tools, providing lazy automagic graph optimisation of operations, or at least, that’s what some earnest guy told me at a seminar. Super popular for less horrible matrix stuff in R.

Automatic differentiation

See automatic differentiation


Solvers” for large systems of equations are not really distinct from optimisation. But there are differences in general assumptions to finding an optimal vector solution to an equation which is presumably smaller than your data, and finite element methods and PDE wrangling, where the solution might be much larger than your data.

See solving all my problems.

GPU computation

Tensorflow again, numba, theano again… See GPU computation.