Automatic differentiation

Getting your computer to tell you the gradient of a function, without resorting to finite difference approximation.

There seems to be a lot of stuff to know here: infinitesimal/Taylor-series formulations, computational complexity, reverse mode (a.k.a. backpropagation) versus forward mode, and so on. But for special cases you can ignore most of this.
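
To make the infinitesimal view concrete, forward-mode AD can be implemented by pushing dual numbers, i.e. (value, derivative) pairs, through a computation, applying the chain rule one operation at a time. A toy sketch (only +, * and sin; purely illustrative, not a production implementation):

    import math

    class Dual:
        """A (value, derivative) pair; arithmetic applies the chain rule as it goes."""
        def __init__(self, val, dot=0.0):
            self.val, self.dot = val, dot

        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val + other.val, self.dot + other.dot)
        __radd__ = __add__

        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            # product rule
            return Dual(self.val * other.val,
                        self.dot * other.val + self.val * other.dot)
        __rmul__ = __mul__

    def sin(x):
        # chain rule for sin
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

    # d/dx [x**2 * sin(x)] at x = 1.5, exact up to floating point
    x = Dual(1.5, 1.0)   # seed the input's derivative with 1
    y = x * x * sin(x)
    print(y.val, y.dot)  # y.dot equals 2*x*sin(x) + x**2*cos(x) at x = 1.5

Reverse mode does the same derivative bookkeeping in the opposite direction, which is the cheaper choice when there are many inputs and few outputs, as in neural network training.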

There is a beautiful explanation of the basics by Sanjeev Arora and Tengyu Ma.
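
For concreteness, here is a toy sketch of the reverse-mode (backpropagation) bookkeeping: record the computation graph while evaluating, then sweep backwards accumulating adjoints via the chain rule. The names (Var, backward) are made up for illustration and are not any particular library's API.

    import math

    class Var:
        """A node in the computation graph: value, parents with local partials, adjoint."""
        def __init__(self, val, parents=()):
            self.val = val
            self.parents = parents   # pairs of (parent Var, d(self)/d(parent))
            self.grad = 0.0

        def __add__(self, other):
            return Var(self.val + other.val, [(self, 1.0), (other, 1.0)])

        def __mul__(self, other):
            return Var(self.val * other.val, [(self, other.val), (other, self.val)])

    def sin(x):
        return Var(math.sin(x.val), [(x, math.cos(x.val))])

    def backward(out):
        # order the graph by a depth-first sweep, then accumulate adjoints in reverse
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for parent, _ in v.parents:
                    visit(parent)
                order.append(v)
        visit(out)
        out.grad = 1.0
        for v in reversed(order):
            for parent, local in v.parents:
                parent.grad += v.grad * local   # chain rule, accumulated backwards

    x = Var(1.5)
    y = x * x * sin(x)
    backward(y)
    print(y.val, x.grad)   # same derivative as the forward-mode sketch above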

You might want to do this for optimisation, whether batch or SGD, especially in neural networks, matrix factorisations, variational approximations, etc. This is not news these days, but it took a stunningly long time to become common; see, e.g., Justin Domke, Automatic Differentiation: The most criminally underused tool in the potential machine learning toolbox?
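
As a sketch of that use, an optimisation loop can ask an AD routine for the gradient instead of relying on a hand-derived formula or a finite-difference approximation. A self-contained toy example (the Dual/grad helpers and the data are made up for illustration; forward mode over two parameters):

    class Dual:
        """A (value, derivative) pair for forward-mode AD."""
        def __init__(self, val, dot=0.0):
            self.val, self.dot = val, dot
        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val + other.val, self.dot + other.dot)
        __radd__ = __add__
        def __sub__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val - other.val, self.dot - other.dot)
        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val * other.val,
                        self.dot * other.val + self.val * other.dot)
        __rmul__ = __mul__

    def grad(f, params):
        # forward-mode gradient: one pass per parameter, seeding each in turn
        g = []
        for i in range(len(params)):
            duals = [Dual(p, 1.0 if j == i else 0.0) for j, p in enumerate(params)]
            g.append(f(*duals).dot)
        return g

    # toy least-squares problem: fit y = a*x + b to four points on the line y = 2x + 1
    xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
    def loss(a, b):
        return sum((a * x + b - y) * (a * x + b - y) for x, y in zip(xs, ys))

    a, b, lr = 0.0, 0.0, 0.02
    for _ in range(500):
        da, db = grad(loss, [a, b])
        a, b = a - lr * da, b - lr * db
    print(a, b)   # approaches a ≈ 2, b ≈ 1

Forward mode costs one pass per parameter, so for models with millions of parameters you would use reverse mode, which returns the whole gradient in a single backward sweep.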

See also symbolic mathematical calculators.

Software
