Newton-type optimisation uses second-order gradient information (i.e. a Hessian matrix) to solve optimisation problems. Higher-order optimisation uses third-order gradients, and so on.
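As a reminder to myself of what "second-order information" buys you, here is a minimal sketch of a single Newton step using JAX autodiff; the objective `f` is just a hypothetical stand-in, not anything from a real problem.

```python
import jax
import jax.numpy as jnp

def f(x):
    # hypothetical smooth objective
    return jnp.sum(x**4) + jnp.dot(x, x)

def newton_step(x):
    g = jax.grad(f)(x)        # 1st-order: gradient, shape (n,)
    H = jax.hessian(f)(x)     # 2nd-order: Hessian, shape (n, n)
    return x - jnp.linalg.solve(H, g)

x = jnp.array([1.0, -2.0, 0.5])
print(newton_step(x))
```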
This is rarely done in the problems I face, because:
- Third-order derivatives of a multivariate objective form an $n \times n \times n$ tensor, so they are usually too expensive in time and space to be tractable (see the sketch after this list).
- They are not (simply) expressible as matrices, so handling them benefits from a little tensor theory.
- Other reasons I don’t know about.
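To make the scaling concrete, here is a small sketch of how the third derivative blows up: for an $n$-variable function it is an $n \times n \times n$ array, i.e. $O(n^3)$ entries. The objective `g` is again a hypothetical example.

```python
import jax
import jax.numpy as jnp

def g(x):
    # hypothetical n-variable objective
    return jnp.sum(jnp.exp(x) * x**2)

n = 50
x = jnp.ones(n)

hess = jax.hessian(g)(x)            # 2nd-order: shape (n, n)
third = jax.jacfwd(jax.hessian(g))(x)  # 3rd-order: shape (n, n, n)
print(hess.shape, third.shape)      # (50, 50) vs (50, 50, 50) -> n**3 entries
```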
I have nothing to say about this now, but for my own reference, a starting keyword is Halley-Chebyshev methods.
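For future reference, a rough one-dimensional sketch of a Halley step (a member of the Halley-Chebyshev family), applied to root-finding on $f(x) = 0$, with the derivatives taken by JAX. The test function and starting point are arbitrary.

```python
import jax

def f(x):
    # classic cubic test example (arbitrary choice)
    return x**3 - 2.0 * x - 5.0

df  = jax.grad(f)
d2f = jax.grad(df)

def halley_step(x):
    fx, gx, hx = f(x), df(x), d2f(x)
    # Halley update: uses f, f', and f'' at every iteration
    return x - (2.0 * fx * gx) / (2.0 * gx**2 - fx * hx)

x = 2.0
for _ in range(5):
    x = halley_step(x)
print(x)  # converges toward the real root near 2.0946
```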