
Optimisation, Higher order

Newton-type optimisation uses second-order derivative information (i.e. the Hessian matrix) to solve optimisation problems. Higher-order optimisation uses third-order derivatives and beyond.
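To make the jump concrete, here is a minimal sketch (assuming JAX is installed) of a plain, undamped Newton step for minimisation: it solves a linear system with the Hessian, which is the second-order information in question. The function names and toy objective are illustrative only.

```python
import jax
import jax.numpy as jnp

def newton_step(f, x):
    """One undamped Newton step: solve H(x) d = -grad f(x) and move by d."""
    g = jax.grad(f)(x)
    H = jax.hessian(f)(x)
    return x + jnp.linalg.solve(H, -g)

# Toy objective: the 2D Rosenbrock function, minimised at (1, 1).
def rosenbrock(x):
    return jnp.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

x = jnp.zeros(2)
for _ in range(5):
    x = newton_step(rosenbrock, x)
print(x)  # close to [1. 1.]
```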

This is rarely done in problems that I face because

  1. Third-order derivatives of multivariate objectives are usually too costly in time and space to be tractable; for an n-dimensional problem the third derivative has on the order of n³ entries.
  2. They are not (simply) expressible as matrices, so handling them would benefit from a little tensor theory (see the sketch after this list).
  3. Other reasons I don’t know about.
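The size-and-shape problem in points 1 and 2 is easy to see with automatic differentiation. The following sketch (again assuming JAX; the toy objective is arbitrary) just prints the shapes of the successive derivatives of a scalar objective.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Arbitrary smooth, non-quadratic objective so the third derivative is nonzero.
    return jnp.sum(x ** 4) + jnp.dot(x, jnp.sin(x))

n = 5
x = jnp.linspace(0.1, 1.0, n)

print(jax.grad(f)(x).shape)                 # (5,)      -- a vector
print(jax.hessian(f)(x).shape)              # (5, 5)    -- a matrix
print(jax.jacfwd(jax.hessian(f))(x).shape)  # (5, 5, 5) -- a 3-way tensor
```

The last object has n³ entries and no longer fits the usual matrix toolkit, which is the point of item 2.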

I have nothing to say about this now, but for my own reference, a starting keyword is Halley-Chebyshev methods.
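Here is a one-dimensional sketch of what such a method does (assuming JAX; this illustrates the classic Halley update applied to the stationarity condition f′(x) = 0, with no safeguards, and is not a production implementation). The update uses the first, second and third derivatives of the objective.

```python
import jax
import jax.numpy as jnp

def halley_minimise(f, x, steps=10):
    g = jax.grad(f)   # f'
    h = jax.grad(g)   # f''
    t = jax.grad(h)   # f'''
    for _ in range(steps):
        gx, hx, tx = g(x), h(x), t(x)
        # Halley's update for a root of f':
        #   x <- x - 2 f' f'' / (2 (f'')^2 - f' f''')
        x = x - 2.0 * gx * hx / (2.0 * hx ** 2 - gx * tx)
    return x

# Toy objective with its minimum at x = 2.
f = lambda x: jnp.cosh(x - 2.0)
print(halley_minimise(f, jnp.array(0.5)))  # ~2.0
```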