General number crunching/data analysis packages. (Specialist software is dealt with elsewhere;
see machine vision, machine listening, gesture recognition, scientific workflow etc.)
- R is a galaxy of statistical packages.
- Python is likewise a whole world of its own, and so has its own page, plus a sub-notebook just for statistical issues.
- Sundry deep learning packages are on their own page.
- Scala’s ecosystem is growing here.
- So is Julia’s.
- Shogun (C++) is a giant umbrella ML library with lots of algorithms and much muscular l33t ML attitude.
- Weka is a Java equivalent to Shogun, I suppose: lots of algorithms in a huge package. It has a stream extension called MOA which makes it into something more like Vowpal Wabbit. Regarding which…
- Vowpal Wabbit, despite its abstruse project description, seems to be a good library for out-of-core linear learning (i.e. regression or classification on a non-stupendously large machine using a stupendously large data set). Approaches include various online optimisations (updating on one example at a time, so the data never has to fit in memory), with L1/L2 regularisation. Linear or logistic models (i.e. linear models). Squared, hinge, logistic, or quantile losses. (Has a Python binding btw; doesn’t everything?)
- bash, for data science at the command line.
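
To make the out-of-core linear learning idea from the list above concrete, here is a minimal sketch in plain Python. It is not the API of any of the libraries mentioned, just an illustration under stated assumptions: `stream_examples` is a hypothetical generator standing in for a file too large for RAM, the data is synthetic, and the learner is ordinary stochastic gradient descent on logistic loss with an L2 penalty.

```python
import math
import random


def stream_examples(n, seed=0):
    """Yield (features, label) one at a time; a stand-in for a huge file on disk."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        # Hypothetical ground truth for illustration: label is 1 when x0 + x1 > 0.
        y = 1 if x[0] + x[1] > 0 else 0
        yield x, y


def sigmoid(z):
    # Clamp the score so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def online_logistic(examples, n_features, lr=0.1, l2=1e-4):
    """One pass of SGD on logistic loss with L2 penalty.

    Only the weight vector is kept in memory, so the size of the
    data set does not matter, only the number of features.
    """
    w = [0.0] * n_features
    b = 0.0
    for x, y in examples:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        err = p - y  # gradient of the logistic loss with respect to the score
        w = [wi - lr * (err * xi + l2 * wi) for wi, xi in zip(w, x)]
        b -= lr * err
    return w, b


def accuracy(w, b, examples):
    """Fraction of streamed examples classified correctly."""
    hits = total = 0
    for x, y in examples:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        hits += pred == y
        total += 1
    return hits / total


# Train on one stream, evaluate on a fresh one drawn with a different seed.
w, b = online_logistic(stream_examples(20000, seed=0), n_features=2)
acc = accuracy(w, b, stream_examples(2000, seed=1))
```

Swapping the logistic loss for squared, hinge, or quantile loss only changes the `err` line; the one-example-at-a-time loop stays the same, which is presumably why such libraries can offer all of them cheaply.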