
Manifold learning

Also topological data analysis; other hip names to follow

Figure: Manifolds for genome search. Source: BeDY16.

As in: handling your high-dimensional, or graphical, data by trying to discover a low(er)-dimensional manifold that contains it. That is, inferring a hidden variable that happens to have the form of a smooth surface of some dimension. There are a million different versions of this; multidimensional scaling seems to be the oldest.
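
For concreteness, here is a minimal numpy sketch of the classical (Torgerson) scaling flavour: double-centre the squared distance matrix, eigendecompose, keep the top coordinates. (Function names are my own.)

    import numpy as np

    def classical_mds(D, k=2):
        # Classical (Torgerson) multidimensional scaling.
        # D: (n, n) matrix of pairwise Euclidean distances.
        # Returns an (n, k) configuration whose distances approximate D.
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
        B = -0.5 * J @ (D ** 2) @ J           # double-centred Gram matrix
        w, V = np.linalg.eigh(B)              # eigenvalues, ascending
        idx = np.argsort(w)[::-1][:k]         # top-k eigenpairs
        return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

    # Toy usage: a noisy circle, recovered from its distance matrix alone.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0, 2 * np.pi, 50)
    X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(50, 2))
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    Y = classical_mds(D)                      # X up to rotation/reflection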

Tangential aside: in dynamical systems we talk about constructing very high-dimensional Takens embeddings for state-space reconstruction of arbitrary nonlinear dynamics. I imagine there are some connections between learning the lower-dimensional manifold on which your data lies, and the higher-dimensional manifold in which your data's state space is naturally expressed. But I would not be the first person to notice this, so hopefully it's done for me somewhere?
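
The delay-coordinate construction itself is tiny; a sketch (the lag and dimension here are arbitrary, and in practice both need tuning):

    import numpy as np

    def takens_embed(x, dim=3, lag=1):
        # Delay-coordinate (Takens) embedding of a scalar series:
        # row t is [x(t), x(t + lag), ..., x(t + (dim - 1) * lag)].
        n = len(x) - (dim - 1) * lag
        return np.column_stack([x[i * lag: i * lag + n] for i in range(dim)])

    # A scalar observation of circular dynamics unfolds back into a loop.
    t = np.linspace(0, 20 * np.pi, 2000)
    Y = takens_embed(np.sin(t), dim=2, lag=25)  # rows trace out an ellipse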

See also wacky regression, kernel methods, which do regression on an implicit manifold, and functional regression, where the manifold isn't even necessarily low-dimensional, although it is typically still smooth in some sense.

See also information geometry, which doesn’t give you a manifold for your data, but a manifold in which the parametric model itself is embedded.

To look at: Isomap, locally linear embedding, spectral embeddings, multidimensional scaling…

Bioinformatics is leading to some weird uses of data manifolds; see for example BeDY16 on the performance implications of knowing the manifold shape for *omics search, using compressive manifold storage based on both fractal-dimension and metric-entropy concepts. There is also a suggestive connection with fitness landscapes in evolution.
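
As a taste of the fractal-dimension ingredient, here is a rough Grassberger-Procaccia correlation-dimension estimate (a toy of my own, not BeDY16's machinery; such estimates are notoriously sensitive to the choice of radius range):

    import numpy as np

    def correlation_dimension(X, radii):
        # Grassberger-Procaccia: C(r) = fraction of point pairs within r.
        # On a d-dimensional manifold, C(r) ~ r^d for small r, so the
        # slope of log C(r) against log r estimates d.
        D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
        pairs = D[np.triu_indices_from(D, k=1)]   # distinct pairs only
        C = np.array([np.mean(pairs < r) for r in radii])
        slope, _ = np.polyfit(np.log(radii), np.log(C), 1)
        return slope

    # A helix is a 1-dimensional manifold sitting in R^3.
    t = np.random.default_rng(1).uniform(0, 4 * np.pi, 1500)
    X = np.c_[np.cos(t), np.sin(t), 0.1 * t]
    print(correlation_dimension(X, np.logspace(-1.2, -0.3, 8)))  # ~ 1.0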

Neural networks do this implicitly too; see Christopher Olah's visual explanation of how, whose diagrams should be stolen by someone trying to explain VC dimension.

MoSF13 argue:

Manifold learning algorithms have recently played a crucial role in unsupervised learning tasks such as clustering and nonlinear dimensionality reduction[…] Many such algorithms have been shown to be equivalent to Kernel PCA (KPCA) with data dependent kernels, itself equivalent to performing classical multidimensional scaling (cMDS) in a high dimensional feature space (Schölkopf et al., 1998; Williams, 2002; Bengio et al., 2004).[…] Recently, it has been observed that the majority of manifold learning algorithms can be expressed as a regularized loss minimization of a reconstruction matrix, followed by a singular value truncation (Neufeld et al., 2012)
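
The cMDS/KPCA equivalence in that quote is easy to verify numerically in the linear-kernel case: double-centring the squared-distance matrix yields exactly the centred Gram matrix that kernel PCA eigendecomposes, so the two embeddings coincide. A quick check:

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(40, 5))
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix

    # Kernel PCA (linear kernel) eigendecomposes the centred Gram matrix:
    K = J @ (X @ X.T) @ J

    # Classical MDS double-centres the squared Euclidean distance matrix:
    D2 = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    B = -0.5 * J @ D2 @ J

    print(np.allclose(K, B))                  # True: same eigenproblem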

Implementations

TTK

The Topology ToolKit (TTK) is an open-source library and software collection for topological data analysis in scientific visualization.

TTK can handle scalar data defined either on regular grids or triangulations, either in 2D or in 3D. It provides a substantial collection of generic, efficient and robust implementations of key algorithms in topological data analysis. It includes:

  • For scalar data: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse-Smale complexes, topological simplification;
  • For bivariate scalar data: fibers, fiber surfaces, continuous scatterplots, Jacobi sets, Reeb spaces;
  • For uncertain scalar data: mandatory critical points;

scikit-learn

scikit-learn implements a grab-bag of manifold-learning algorithms in its sklearn.manifold module, including Isomap, locally linear embedding variants, spectral embedding, MDS and t-SNE.
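
A minimal sketch of that module on the classic swiss roll (parameter defaults drift between scikit-learn versions, so treat these settings as indicative):

    from sklearn.datasets import make_swiss_roll
    from sklearn.manifold import (
        Isomap, LocallyLinearEmbedding, SpectralEmbedding, MDS)

    X, color = make_swiss_roll(n_samples=1000, random_state=0)

    estimators = {
        "Isomap": Isomap(n_neighbors=10, n_components=2),
        "LLE": LocallyLinearEmbedding(n_neighbors=10, n_components=2),
        "Spectral": SpectralEmbedding(n_neighbors=10, n_components=2),
        "MDS": MDS(n_components=2),
    }
    for name, est in estimators.items():
        Y = est.fit_transform(X)  # (1000, 2) coordinates on the learned manifold
        print(name, Y.shape)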

tapkee

C++: Tapkee. Pro-tip: even without coding, tapkee performs a long list of nice dimensionality reductions from the CLI, some of which are explicitly manifold learners (and the rest are matrix factorisations, which is not so different):

  • Locally Linear Embedding and Kernel Locally Linear Embedding (LLE/KLLE)
  • Neighborhood Preserving Embedding (NPE)
  • Local Tangent Space Alignment (LTSA)
  • Linear Local Tangent Space Alignment (LLTSA)
  • Hessian Locally Linear Embedding (HLLE)
  • Laplacian eigenmaps
  • Locality Preserving Projections
  • Diffusion map
  • Isomap and landmark Isomap
  • Multidimensional scaling and landmark Multidimensional scaling (MDS/lMDS)
  • Stochastic Proximity Embedding (SPE)
  • PCA and randomized PCA
  • Kernel PCA (kPCA)
  • t-SNE
  • Barnes-Hut-SNE

To read

AsGD12
Aste, T., Gramatica, R., & Di Matteo, T. (2012) Exploring complex networks via topological embedding on surfaces. Physical Review E, 86(3), 036109. DOI.
AsBT11
Aswani, A., Bickel, P., & Tomlin, C. (2011) Regression on manifolds: Estimation of the exterior derivative. The Annals of Statistics, 39(1), 48–81. DOI.
BeNi03
Belkin, M., & Niyogi, P. (2003) Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Computation, 15(6), 1373–1396. DOI.
BeDY16
Berger, B., Daniels, N. M., & Yu, Y. W. (2016) Computational biology in the 21st century: scaling with compressive algorithms. Communications of the ACM, 59(8), 72–80. DOI.
BhBh08
Bhattacharya, A., & Bhattacharya, R. (2008) Nonparametric statistics on manifolds with applications to shape spaces. ProQuest.
CISZ08
Carlsson, G., Ishkhanov, T., Silva, V. de, & Zomorodian, A. (2008) On the Local Behavior of Spaces of Natural Images. International Journal of Computer Vision, 76(1), 1–12. DOI.
CSPW10
Chen, M., Silva, J., Paisley, J., Wang, C., Dunson, D., & Carin, L. (2010) Compressive Sensing on Manifolds Using a Nonparametric Mixture of Factor Analyzers: Algorithm and Performance Bounds. IEEE Transactions on Signal Processing, 58(12), 6140–6155. DOI.
Devo98
DeVore, R. A. (1998) Nonlinear approximation. Acta Numerica, 7, 51–150. DOI.
DiFr84
Diaconis, P., & Freedman, D. (1984) Asymptotics of Graphical Projection Pursuit. The Annals of Statistics, 12(3), 793–815.
DiFr86
Diaconis, P., & Freedman, D. (1986) On the Consistency of Bayes Estimates. The Annals of Statistics, 14(1), 1–26.
DoGr03
Donoho, D. L., & Grimes, C. (2003) Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10), 5591–5596. DOI.
FDKV07
Freund, Y., Dasgupta, S., Kabra, M., & Verma, N. (2007) Learning the structure of manifolds using random projections. In Advances in Neural Information Processing Systems (pp. 473–480).
GaMa12
Gashler, M., & Martinez, T. (2012) Robust manifold learning with CycleCut. Connection Science, 24(1), 57–69. DOI.
Gash12
Gashler, M. S. (2012) Advancing the Effectiveness of Non-linear Dimensionality Reduction Techniques. Brigham Young University, Provo, UT, USA.
HaLi93
Hall, P., & Li, K.-C. (1993) On almost Linearity of Low Dimensional Projections from High Dimensional Data. The Annals of Statistics, 21(2), 867–889.
HaKD13
Hawe, S., Kleinsteuber, M., & Diepold, K. (2013) Analysis operator learning and its application to image reconstruction. IEEE Transactions on Image Processing, 22(6), 2138–2150. DOI.
HeNi04
He, X., & Niyogi, P. (2004) Locality preserving projections. In Advances in neural information processing systems 16: proceedings of the 2003 conference (Vol. 16, p. 153).
HKKM10
Huckemann, S. F., Kim, P. T., Koo, J.-Y., & Munk, A. (2010) Möbius deconvolution on the hyperbolic plane with application to impedance density estimation. The Annals of Statistics, 38(4), 2465–2498. DOI.
KeTe08
Kemp, C., & Tenenbaum, J. B. (2008) The discovery of structural form. Proceedings of the National Academy of Sciences, 105(31), 10687–10692. DOI.
MoSF13
Abou-Moustafa, K. T., Schuurmans, D., & Ferrie, F. (2013) Learning a Metric Space for Neighbourhood Topology Estimation: Application to Manifold Learning. In Proceedings of the Asian Conference on Machine Learning (pp. 341–356).
MuWZ10
Mukherjee, S., Wu, Q., & Zhou, D.-X. (2010) Learning gradients on manifolds. Bernoulli, 16(1), 181–207. DOI.
RoSa00
Roweis, S. T., & Saul, L. K. (2000) Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326. DOI.
SaRo03
Saul, L. K., & Roweis, S. T. (2003) Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds. The Journal of Machine Learning Research, 4, 119–155. DOI.
ScSM97
Schölkopf, B., Smola, A., & Müller, K.-R. (1997) Kernel principal component analysis. In W. Gerstner, A. Germond, M. Hasler, & J.-D. Nicoud (Eds.), Artificial Neural Networks — ICANN’97 (pp. 583–588). Springer Berlin Heidelberg DOI.
ScSM98
Schölkopf, B., Smola, A., & Müller, K.-R. (1998) Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation, 10(5), 1299–1319. DOI.
ShJe09
Shaw, B., & Jebara, T. (2009) Structure Preserving Embedding. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 937–944). New York, NY, USA: ACM DOI.
ShHA11
Shieh, A. D., Hashimoto, T. B., & Airoldi, E. M. (2011) Tree preserving embedding. Proceedings of the National Academy of Sciences, 108(41), 16916–16921. DOI.
SWMS99
Smola, A. J., Williamson, R. C., Mika, S., & Schölkopf, B. (1999) Regularized Principal Manifolds. In P. Fischer & H. U. Simon (Eds.), Computational Learning Theory (pp. 214–229). Springer Berlin Heidelberg
SoTa10
Song, D., & Tao, D. (2010) Biologically inspired feature manifold for scene classification. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society, 19(1), 174–184. DOI.
StHe09
Steinke, F., & Hein, M. (2009) Non-parametric regression between manifolds. In Advances in Neural Information Processing Systems 21 (pp. 1561–1568). Curran Associates, Inc.
TeSL00
Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000) A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science, 290(5500), 2319–2323.
WeSS04
Weinberger, K. Q., Sha, F., & Saul, L. K. (2004) Learning a Kernel Matrix for Nonlinear Dimensionality Reduction. In Proceedings of the Twenty-first International Conference on Machine Learning (p. 106). New York, NY, USA: ACM DOI.
Will01
Williams, C. K. I. (2001) On a Connection between Kernel PCA and Metric Multidimensional Scaling. In T. K. Leen, T. G. Dietterich, & V. Tresp (Eds.), Advances in Neural Information Processing Systems 13 (Vol. 46, pp. 675–681). MIT Press DOI.
WGMM10
Wu, Q., Guinney, J., Maggioni, M., & Mukherjee, S. (2010) Learning gradients: predictive models that infer geometry and statistical dependence. The Journal of Machine Learning Research, 11, 2175–2198.
YNKZ12
Yu, Y., Neufeld, J., Kiros, R., Zhang, X., & Schuurmans, D. (2012) Regularizers versus Losses for Nonlinear Dimensionality Reduction: A Factored View with New Convex Relaxations. In ICML 2012.
ZhTW11
Zhou, T., Tao, D., & Wu, X. (2011) Manifold elastic net: a unified framework for sparse dimension reduction. Data Mining and Knowledge Discovery, 22(3), 340–371.