The Living Thing / Notebooks :

Matrix factorisation, approximate

Forget QR and LU decompositions; there are now so many ways of factorising matrices that there are not enough acronyms in the alphabet to hold them, especially if you suspect your matrix is sparse, or could be made sparse because of some underlying constraint, or could, if squinted at in the right fashion, be seen as a graph transition matrix, or a Laplacian, or a noisy transform of some smooth object, or at least something very close to sparse, or…

Your big matrix is the product (or sum, or…) of two matrices that are in some way simple (small-rank, small dimension, sparse), possibly with additional constraints. Can you find these simple matrices?
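To make that concrete, here is a minimal numpy sketch of the canonical case: the "simple" matrices are two skinny factors, and the recipe is truncated SVD, which gives the best rank-\(k\) approximation in Frobenius norm. (The sizes here are arbitrary toy choices.)

```python
import numpy as np

rng = np.random.default_rng(0)
# A big(ish) matrix that is secretly a product of two skinny factors, plus noise.
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))
A += 0.01 * rng.standard_normal(A.shape)

# Truncated SVD gives the best rank-k approximation in Frobenius norm.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 5
L = (U[:, :k] * s[:k]) @ Vt[:k, :]   # the "simple" factors: (100, 5) and (5, 80)

rel_err = np.linalg.norm(A - L) / np.linalg.norm(A)
```

Everything else on this page is a variation on that theme: change the loss, the constraints, or the budget of matrix-vector products you are willing to pay.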

Here’s an example: GoDec, a decomposition into low-rank and sparse components, which looks like combined multidimensional factorisation and outlier detection. Implementations exist for MATLAB and R.

A hip one from the previous decade is Non-negative Matrix Factorisation (NMF), which I think is a classic. I should explain why.
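As a taster, scikit-learn ships an NMF solver (assuming you have scikit-learn installed; the parameters below are just one plausible configuration, on toy data built to have an exact non-negative low-rank factorisation):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Toy non-negative data with an exact rank-4 non-negative factorisation.
V = rng.random((200, 4)) @ rng.random((4, 50))

model = NMF(n_components=4, init="nndsvd", max_iter=500, random_state=0)
W = model.fit_transform(V)   # (200, 4) non-negative activations
H = model.components_        # (4, 50) non-negative "parts"

rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The non-negativity constraint is what makes the factors interpretable as additive parts, which is much of why NMF became a classic.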

There are so many of these things, depending on your preferred choice of loss function, free variables, constraints and so on.

Keywords: matrix sketching, low-rank approximation; a generalised version of traditional dimensionality reduction. There seems to be little contact with the related area of \(\mathcal{H}\)-matrix methods, as seen in, e.g., covariance matrices. Why is that?

(See one lab’s backgrounder and their implementation, h2lib; hlibpro is a black-box closed-source alternative.)

Matrix concentration inequalities turn out to be crucial in making this work.

I would like to learn more about

Igor Carron’s Matrix Factorization Jungle classifies the following problems as matrix-factorisation type:

  • Kernel Factorizations
  • Spectral clustering: \(A = DX\) with unknown \(D\) and \(X\); solve for sparse \(X\) with \(X_i \in \{0, 1\}\)
  • K-Means / K-Median clustering: \(A = DX\) with unknown \(D\) and \(X\); solve for \(XX^T = I\) with \(X_i \in \{0, 1\}\)
  • Subspace clustering: \(A = AX\) with unknown \(X\); solve for sparse/other conditions on \(X\)
  • Graph Matching: \(A = XBX^T\) with unknown \(X\) and \(B\); solve for \(B\) and a permutation \(X\)
  • Non-Negative Matrix Factorisation (NMF): \(A = DX\) with unknown \(D\) and \(X\); solve for the elements of \(D\) and \(X\) positive
  • Generalized Matrix Factorization: \(W \odot L - W \odot UV'\) with \(W\) a known mask and \(U, V\) unknown; solve for \(U, V\) and \(L\) of lowest possible rank
  • Matrix Completion: \(A = H \odot L\) with \(H\) a known mask and \(L\) unknown; solve for \(L\) of lowest possible rank
  • Stable Principal Component Pursuit (SPCP) / Noisy Robust PCA: \(A = L + S + N\) with \(L, S, N\) unknown; solve for \(L\) low-rank, \(S\) sparse, \(N\) noise
  • Robust PCA: \(A = L + S\) with \(L, S\) unknown; solve for \(L\) low-rank, \(S\) sparse
  • Sparse PCA: \(A = DX\) with unknown \(D\) and \(X\); solve for sparse \(D\)
  • Dictionary Learning: \(A = DX\) with unknown \(D\) and \(X\); solve for sparse \(X\)
  • Archetypal Analysis: \(A = DX\) with unknown \(D\) and \(X\); solve for \(D = AB\) with \(D, B\) positive
  • Matrix Compressive Sensing (MCS): find a rank-\(r\) matrix \(L\) such that \(\mathcal{A}(L) \approx b\), or \(\mathcal{A}(L + S) = b\)
  • Multiple Measurement Vector (MMV): \(Y = AX\) with unknown \(X\) whose rows are sparse
  • Compressed sensing: \(Y = AX\) with unknown \(X\) whose rows are sparse, and \(X\) a single column
  • Blind Source Separation (BSS): \(Y = AX\) with unknown \(A\) and \(X\), with statistical independence between columns of \(X\) or subspaces of columns of \(X\)
  • Partial and Online SVD/PCA
  • Tensor Decomposition
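To make one entry concrete, here is a toy alternating-least-squares sketch of matrix completion in numpy. This is an illustration of the masked low-rank problem, not one of the convex nuclear-norm solvers from the literature; the sizes, mask density, and iteration count are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 60, 40, 3
L_true = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
H = rng.random((m, n)) < 0.6          # the known mask: observe 60% of entries
A = np.where(H, L_true, 0.0)

# Alternate least-squares fits of the two factors on observed entries only.
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
for _ in range(100):
    for i in range(m):                # refit each row of U against V
        U[i] = np.linalg.lstsq(V[H[i]], A[i, H[i]], rcond=None)[0]
    for j in range(n):                # refit each row of V against U
        V[j] = np.linalg.lstsq(U[H[:, j]], A[H[:, j], j], rcond=None)[0]

rel_err = np.linalg.norm(U @ V.T - L_true) / np.linalg.norm(L_true)
```

The objective is non-convex, but in this well-sampled low-rank regime ALS typically lands on the planted solution.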

Truncated classic PCA is clearly also an example of this, but it is excluded from the list for some reason. Boringness? The fact that it’s a special case of Sparse PCA?

See also compressed sensing, deconvolution, optimisation, random features and clustering, sparse regression, generic solvers



“Sketching” is what I am going to use to describe the subset of factorisations which reduce the dimensionality of the matrices in question in a way I will make clear shortly.
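For a concrete instance, here is a minimal numpy sketch of Liberty’s Frequent Directions (Liberty 2013; Ghashami et al. 2015, below): stream the rows of \(A\) into a small buffer \(B\), and whenever it fills, shrink the singular values so that \(B^T B\) stays close to \(A^T A\), with a worst-case spectral-norm guarantee of \(\|A\|_F^2 / \ell\).

```python
import numpy as np

def frequent_directions(A, ell):
    """Deterministic sketch B with ||A'A - B'B||_2 <= ||A||_F^2 / ell."""
    n, d = A.shape
    B = np.zeros((2 * ell, d))
    nxt = 0
    for row in A:
        if nxt == 2 * ell:
            # Buffer full: SVD, then shrink all squared singular values by
            # the ell-th one, which zeroes out at least half the rows.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            shrunk = np.sqrt(np.maximum(s**2 - s[ell - 1] ** 2, 0.0))
            B = shrunk[:, None] * Vt
            nxt = ell - 1
        B[nxt] = row
        nxt += 1
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 50))
B = frequent_directions(A, ell=20)
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
bound = np.linalg.norm(A, "fro") ** 2 / 20
```

The sketch has 40 rows regardless of how many rows of \(A\) stream past, which is the whole point.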

Mart16 mentions CUR and interpolative decompositions. Does preconditioning fit here?

Randomized methods

Most of these objectives have multiple optima, and the algorithms use greedy search to find one; that is, they are deterministic up to the choice of starting parameters.

There are also randomised versions, which trade deterministic guarantees for speed and scale.
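For example, the Halko–Martinsson–Tropp randomised range finder (Halko et al. 2009, below): probe the range of \(A\) with a random Gaussian test matrix, orthonormalise, and run the expensive SVD only on a small projected matrix. A minimal numpy sketch:

```python
import numpy as np

def randomized_svd(A, k, p=10, q=2, seed=0):
    """Randomised range finder (Halko, Martinsson & Tropp) + small SVD."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, k + p))   # sample the range of A
    for _ in range(q):                        # power iterations sharpen decay
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis, m x (k+p)
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 30)) @ rng.standard_normal((30, 150))
U, s, Vt = randomized_svd(A, k=30)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

The oversampling \(p\) and power-iteration count \(q\) are the usual tuning knobs; the values above are just common defaults.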


“Enough theory! Plug the hip new toy into my algorithm!”


Vowpal Wabbit does this:

HPC for MATLAB, R, Python, C++: libpmf:

LIBPMF implements the CCD++ algorithm, which aims to solve large-scale matrix factorization problems such as the low-rank factorization problems for recommender systems.


Julia: laplacians.jl:

Laplacians is a package containing graph algorithms, with an emphasis on tasks related to spectral and algebraic graph theory. It contains (and will contain more) code for solving systems of linear equations in graph Laplacians, low stretch spanning trees, sparsification, clustering, local clustering, and optimization on graphs.

All graphs are represented by sparse adjacency matrices. This is both for speed, and because our main concerns are algebraic tasks. It does not handle dynamic graphs. It would be very slow to implement dynamic graphs this way.

Laplacians.jl was started by Daniel A. Spielman. Other contributors include Rasmus Kyng, Xiao Shi, Sushant Sachdeva, Serban Stan and Jackson Thea.

Matlab: Chih-Jen Lin’s nmf.m - “This tool solves NMF by alternative non-negative least squares using projected gradients. It converges faster than the popular multiplicative update approach. ”
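For comparison, the multiplicative updates that the projected-gradient approach outcompetes fit in a few lines. This is a toy numpy sketch of Lee and Seung’s Frobenius-norm update rules, not Lin’s code:

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((100, 4)) @ rng.random((4, 30))   # non-negative, rank 4

r, eps = 4, 1e-12
W = rng.random((100, r))
H = rng.random((r, 30))
for _ in range(500):
    # Lee-Seung multiplicative updates for the Frobenius objective;
    # non-negativity is preserved because every factor in the update is.
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each update is a ratio of non-negative matrices, so the iterates never need projection; the price is slow convergence, which is exactly Lin’s complaint.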

c++: distributed nmf: This repository offers both MPI and OpenMP implementations of MU, HALS and ANLS/BPP-based NMF algorithms. They run off the shelf and are easy to integrate into other source code. These are highly tuned NMF algorithms built for supercomputers, tested on NERSC and OLCF clusters; the OpenMP implementation has been tested on many Linux variants with Intel processors. The library works well for both sparse and dense matrices. (KaBP16, Kann16, FKPB15)

C++/MATLAB/python: Spams includes some matrix factorisations in its sparse approx toolbox. (see optimisation)

python: scikit-learn does a few matrix factorisations of course, in its inimitable batteries-in-the-kitchen-sink way.

python: nimfa - “Nimfa is a Python library for nonnegative matrix factorization. It includes implementations of several factorization methods, initialization approaches, and quality scoring. Both dense and sparse matrix representation are supported.”

C++: Tapkee. Pro-tip: even without coding, Tapkee does a long list of nice dimensionality reductions from the CLI, some of which are explicitly manifold learners (and the rest are manifold learning in all but name):

  • PCA and randomized PCA
  • Kernel PCA (kPCA)
  • Random projection
  • Factor analysis
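Random projection, the last of those, is barely even an algorithm. Here is a numpy sketch using Achlioptas’s database-friendly \(\{-1, 0, +1\}\) entries (Achlioptas 2003, below), with a check that pairwise distances survive, per Johnson–Lindenstrauss:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 2000, 400
X = rng.standard_normal((n, d))

# Achlioptas' sparse sign matrix: entries sqrt(3/k) * {-1, 0, +1} with
# probabilities {1/6, 2/3, 1/6}; preserves squared norms in expectation.
R = rng.choice([-1.0, 0.0, 1.0], size=(d, k), p=[1 / 6, 2 / 3, 1 / 6])
Y = X @ (np.sqrt(3.0 / k) * R)

# Johnson-Lindenstrauss: pairwise distances survive, with small distortion.
ratios = [
    np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
    for i in range(10)
    for j in range(i + 1, 10)
]
```

Two thirds of the projection matrix is exactly zero, which is what makes it database-friendly: the projection is mostly additions and subtractions.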

Grassmann formulation

Since the factors are constrained to matrices with particular structure, these problems can boil down to clever differential-geometry formulations, e.g. optimisation over Grassmann manifolds. I don’t know how that works yet.


Abdallah, S. A., & Plumbley, M. D.(2004) Polyphonic Music Transcription by Non-Negative Sparse Coding of Power Spectra. Presented at ISMIR.
Achlioptas, D. (2003) Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4), 671–687. DOI.
Aghasi, A., Nguyen, N., & Romberg, J. (2016) Net-Trim: A Layer-wise Convex Pruning of Deep Neural Networks. ArXiv:1611.05162 [Cs, Stat].
Arora, S., Ge, R., Halpern, Y., Mimno, D., Moitra, A., Sontag, D., … Zhu, M. (2012) A Practical Algorithm for Topic Modeling with Provable Guarantees. ArXiv:1212.4777 [Cs, Stat].
Bach, F. (2013a) Convex relaxations of structured matrix factorizations. ArXiv:1309.3117 [Cs, Math].
Bach, F., Jenatton, R., Mairal, J., & Obozinski, G. (2012) Optimization with Sparsity-Inducing Penalties. Foundations and Trends® in Machine Learning, 4(1), 1–106. DOI.
Bach, F. R.(2013b) Sharp analysis of low-rank kernel matrix approximations. In COLT (Vol. 30, pp. 185–209).
Batson, J., Spielman, D. A., & Srivastava, N. (2008) Twice-Ramanujan Sparsifiers. ArXiv:0808.0163 [Cs].
Bauckhage, C. (2015) k-Means Clustering Is Matrix Factorization. ArXiv:1512.07548 [Stat].
Berry, M. W., Browne, M., Langville, A. N., Pauca, V. P., & Plemmons, R. J.(2007) Algorithms and applications for approximate nonnegative matrix factorization. Computational Statistics & Data Analysis, 52(1), 155–173. DOI.
Bertin, N., Badeau, R., & Vincent, E. (2010) Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription. IEEE Transactions on Audio, Speech, and Language Processing, 18(3), 538–549. DOI.
Blei, D. M., Cook, P. R., & Hoffman, M. (2010) Bayesian nonparametric matrix factorization for recorded music. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 439–446).
Bruckstein, A. M., Elad, M., & Zibulevsky, M. (2008a) On the Uniqueness of Nonnegative Sparse Solutions to Underdetermined Systems of Equations. IEEE Transactions on Information Theory, 54(11), 4813–4820. DOI.
Bruckstein, A. M., Elad, M., & Zibulevsky, M. (2008b) Sparse non-negative solution of a linear system of equations is unique. In 3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008 (pp. 762–767). DOI.
Cao, B., Shen, D., Sun, J.-T., Wang, X., Yang, Q., & Chen, Z. (n.d.) Detect and Track Latent Factors with Online Nonnegative Matrix Factorization.
Carabias-Orti, J. J., Virtanen, T., Vera-Candeas, P., Ruiz-Reyes, N., & Canadas-Quesada, F. J.(2011) Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization. IEEE Journal of Selected Topics in Signal Processing, 5(6), 1144–1158. DOI.
Cichocki, A., Lee, N., Oseledets, I. V., Phan, A.-H., Zhao, Q., & Mandic, D. (2016) Low-Rank Tensor Networks for Dimensionality Reduction and Large-Scale Optimization Problems: Perspectives and Challenges PART 1. ArXiv:1609.00893 [Cs].
Cichocki, A., Zdunek, R., & Amari, S. (2006) New Algorithms for Non-Negative Matrix Factorization in Applications to Blind Source Separation. In 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings (Vol. 5, pp. V–V). DOI.
Cohen, A., Daubechies, I., & Feauveau, J.-C. (1992) Biorthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 45(5), 485–560.
Combettes, P. L., & Pesquet, J.-C. (2008) A proximal decomposition method for solving convex variational. Inverse Problems, 24(6), 065014. DOI.
Dasarathy, G., Shah, P., Bhaskar, B. N., & Nowak, R. (2013) Sketching Sparse Matrices. ArXiv:1303.6544 [Cs, Math].
Dasgupta, S., & Gupta, A. (2003) An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms, 22(1), 60–65. DOI.
Desai, A., Ghashami, M., & Phillips, J. M.(2016) Improved Practical Matrix Sketching with Guarantees. IEEE Transactions on Knowledge and Data Engineering, 28(7), 1678–1690. DOI.
Devarajan, K. (2008) Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology. PLoS Comput Biol, 4(7), e1000029. DOI.
Ding, C. (n.d.) Principal Component Analysis (PCA) and Matrix Factorizations for Learning, ICML 2005 Tutorial.
Ding, C., He, X., & Simon, H. (2005) On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering. In Proceedings of the 2005 SIAM International Conference on Data Mining (pp. 606–610). Society for Industrial and Applied Mathematics
Ding, C., Li, T., & Jordan, M. I.(2010) Convex and Semi-Nonnegative Matrix Factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1), 45–55. DOI.
Dokmanić, I., & Gribonval, R. (2017) Beyond Moore-Penrose Part II: The Sparse Pseudoinverse. ArXiv:1706.08701 [Cs, Math].
Drineas, P., & Mahoney, M. W.(2005) On the Nyström method for approximating a Gram matrix for improved kernel-based learning. Journal of Machine Learning Research, 6, 2153–2175.
Dueck, D., Morris, Q. D., & Frey, B. J.(2005) Multi-way clustering of microarray data using probabilistic sparse matrix factorization. Bioinformatics, 21(suppl 1), i144–i151. DOI.
Fairbanks, J. P., Kannan, R., Park, H., & Bader, D. A.(2015) Behavioral clusters in dynamic graphs. Parallel Computing, 47, 38–50. DOI.
Févotte, C., Bertin, N., & Durrieu, J.-L. (2008) Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis. Neural Computation, 21(3), 793–830. DOI.
Flammia, S. T., Gross, D., Liu, Y.-K., & Eisert, J. (2012) Quantum Tomography via Compressed Sensing: Error Bounds, Sample Complexity, and Efficient Estimators. New Journal of Physics, 14(9), 095022. DOI.
Fung, W. S., Hariharan, R., Harvey, N. J. A., & Panigrahi, D. (2011) A General Framework for Graph Sparsification. In Proceedings of the Forty-third Annual ACM Symposium on Theory of Computing (pp. 71–80). New York, NY, USA: ACM DOI.
Gemulla, R., Nijkamp, E., Haas, P. J., & Sismanis, Y. (2011) Large-scale Matrix Factorization with Distributed Stochastic Gradient Descent. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 69–77). New York, NY, USA: ACM DOI.
Ghashami, M., Liberty, E., Phillips, J. M., & Woodruff, D. P.(2015) Frequent Directions : Simple and Deterministic Matrix Sketching. ArXiv:1501.01711 [Cs].
Gross, D. (2011) Recovering Low-Rank Matrices From Few Coefficients in Any Basis. IEEE Transactions on Information Theory, 57(3), 1548–1566. DOI.
Gross, D., Liu, Y.-K., Flammia, S. T., Becker, S., & Eisert, J. (2010) Quantum state tomography via compressed sensing. Physical Review Letters, 105(15). DOI.
Guan, N., Tao, D., Luo, Z., & Yuan, B. (2012a) NeNMF: an optimal gradient method for nonnegative matrix factorization. IEEE Transactions on Signal Processing, 60(6), 2882–2898.
Guan, N., Tao, D., Luo, Z., & Yuan, B. (2012b) Online Nonnegative Matrix Factorization With Robust Stochastic Approximation. IEEE Transactions on Neural Networks and Learning Systems, 23(7), 1087–1099. DOI.
Hackbusch, W. (2015) Hierarchical Matrices: Algorithms and Analysis. (1st ed.). Heidelberg New York Dordrecht London: Springer Publishing Company, Incorporated
Halko, N., Martinsson, P.-G., & Tropp, J. A.(2009) Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. ArXiv:0909.4061 [Math].
Hassanieh, H., Indyk, P., Katabi, D., & Price, E. (2012a) Nearly Optimal Sparse Fourier Transform. In Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing (pp. 563–578). New York, NY, USA: ACM DOI.
Hassanieh, H., Indyk, P., Katabi, D., & Price, E. (2012b) Simple and Practical Algorithm for Sparse Fourier Transform. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1183–1194). Kyoto, Japan: Society for Industrial and Applied Mathematics
Hoffman, M., Bach, F. R., & Blei, D. M.(2010) Online learning for latent dirichlet allocation. In advances in neural information processing systems (pp. 856–864).
Hoyer, P. O.(2002) Non-negative sparse coding. In Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing, 2002 (pp. 557–565). DOI.
Hsieh, C.-J., & Dhillon, I. S.(2011) Fast Coordinate Descent Methods with Variable Selection for Non-negative Matrix Factorization. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1064–1072). New York, NY, USA: ACM DOI.
Hu, T., Pehlevan, C., & Chklovskii, D. B.(2014) A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization. In 2014 48th Asilomar Conference on Signals, Systems and Computers. DOI.
Huang, G., Kaess, M., & Leonard, J. J.(2013) Consistent sparsification for graph optimization. In 2013 European Conference on Mobile Robots (ECMR) (pp. 150–157). DOI.
Kannan, R. (2016) Scalable and distributed constrained low rank approximations.
Kannan, R., Ballard, G., & Park, H. (2016) A High-performance Parallel Algorithm for Nonnegative Matrix Factorization. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (p. 9:1–9:11). New York, NY, USA: ACM DOI.
Keriven, N., Bourrier, A., Gribonval, R., & Pérez, P. (2016) Sketching for Large-Scale Learning of Mixture Models. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6190–6194). DOI.
Keshava, N. (2003) A survey of spectral unmixing algorithms. Lincoln Laboratory Journal, 14(1), 55–78.
Khoromskij, B. N., Litvinenko, A., & Matthies, H. G.(2009) Application of hierarchical matrices for computing the Karhunen–Loève expansion. Computing, 84(1–2), 49–67. DOI.
Kim, H., & Park, H. (2008) Nonnegative Matrix Factorization Based on Alternating Nonnegativity Constrained Least Squares and Active Set Method. SIAM Journal on Matrix Analysis and Applications, 30(2), 713–730. DOI.
Koren, Y., Bell, R., & Volinsky, C. (2009) Matrix Factorization Techniques for Recommender Systems. Computer, 42(8), 30–37. DOI.
Koutis, I., Miller, G. L., & Peng, R. (2012) A fast solver for a class of linear systems. Communications of the ACM, 55(10), 99–107. DOI.
Kruskal, J. B.(1964) Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29(2), 115–129. DOI.
Kumar, N. K., & Schneider, J. (2016) Literature survey on low rank approximation of matrices. ArXiv:1606.06511 [Cs, Math].
Lahiri, S., Gao, P., & Ganguli, S. (2016) Random projections of random manifolds. ArXiv:1607.04331 [Cs, q-Bio, Stat].
Lawrence, N. D., & Urtasun, R. (2009) Non-linear Matrix Factorization with Gaussian Processes. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 601–608). New York, NY, USA: ACM DOI.
Lee, D. D., & Seung, H. S.(1999) Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788–791. DOI.
Lee, D. D., & Seung, H. S.(2001) Algorithms for Non-negative Matrix Factorization. In T. K. Leen, T. G. Dietterich, & V. Tresp (Eds.), Advances in Neural Information Processing Systems 13 (pp. 556–562). MIT Press
Li, C.-K., & Poon, E. (2002) Additive Decomposition of Real Matrices. Linear and Multilinear Algebra, 50(4), 321–326. DOI.
Li, S. Z., Hou, X., Zhang, H., & Cheng, Q. (2001) Learning spatially localized, parts-based representation. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001 (Vol. 1, p. I-207-I-212 vol.1). DOI.
Liberty, E. (2013) Simple and Deterministic Matrix Sketching. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 581–588). New York, NY, USA: ACM DOI.
Liberty, E., Woolfe, F., Martinsson, P.-G., Rokhlin, V., & Tygert, M. (2007) Randomized algorithms for the low-rank approximation of matrices. Proceedings of the National Academy of Sciences, 104(51), 20167–20172. DOI.
Lin, C.-J. (2007) Projected Gradient Methods for Nonnegative Matrix Factorization. Neural Computation, 19(10), 2756–2779. DOI.
Lin, Z. (n.d.) A Review on Low-Rank Models in Signal and Data Analysis.
Liu, T., & Tao, D. (2015) On the Performance of Manhattan Nonnegative Matrix Factorization. IEEE Transactions on Neural Networks and Learning Systems, PP(99), 1–1. DOI.
Liu, T., Tao, D., & Xu, D. (2016) Dimensionality-Dependent Generalization Bounds for $k$-Dimensional Coding Schemes. ArXiv:1601.00238 [Cs, Stat].
Mailhé, B., Gribonval, R., Vandergheynst, P., & Bimbot, F. (2011) Fast orthogonal sparse approximation algorithms over local dictionaries. Signal Processing, 91(12), 2822–2835. DOI.
Mairal, J., Bach, F., & Ponce, J. (2014) Sparse Modeling for Image and Vision Processing. Foundations and Trends® in Comput Graph. Vis., 8(2–3), 85–283. DOI.
Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2009) Online Dictionary Learning for Sparse Coding. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 689–696). New York, NY, USA: ACM DOI.
Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010) Online learning for matrix factorization and sparse coding. The Journal of Machine Learning Research, 11, 19–60.
Martinsson, P.-G. (2016) Randomized methods for matrix computations and analysis of high dimensional data. ArXiv:1607.01649 [Math].
Martinsson, P.-G., Rokhlin, V., & Tygert, M. (2006) A randomized algorithm for the approximation of matrices. DTIC Document.
Mensch, A., Mairal, J., Thirion, B., & Varoquaux, G. (2017) Stochastic Subsampling for Factorizing Huge Matrices. ArXiv:1701.05363 [Math, q-Bio, Stat].
Needell, D., & Vershynin, R. (2009) Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit. Foundations of Computational Mathematics, 9(3), 317–334. DOI.
Nowak, W., & Litvinenko, A. (2013) Kriging and Spatial Design Accelerated by Orders of Magnitude: Combining Low-Rank Covariance Approximations with FFT-Techniques. Mathematical Geosciences, 45(4), 411–435. DOI.
Oymak, S., & Tropp, J. A.(2015) Universality laws for randomized dimension reduction, with applications. ArXiv:1511.09433 [Cs, Math, Stat].
Paatero, P., & Tapper, U. (1994) Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2), 111–126. DOI.
Pan, G., Zhang, W., Wu, Z., & Li, S. (2014) Online Community Detection for Large Complex Networks. PLoS ONE, 9(7), e102799. DOI.
Rokhlin, V., Szlam, A., & Tygert, M. (2009) A Randomized Algorithm for Principal Component Analysis. SIAM J. Matrix Anal. Appl., 31(3), 1100–1124. DOI.
Rokhlin, V., & Tygert, M. (2008) A fast randomized algorithm for overdetermined linear least-squares regression. Proceedings of the National Academy of Sciences, 105(36), 13212–13217. DOI.
Ryabko, D., & Ryabko, B. (2010) Nonparametric Statistical Inference for Ergodic Processes. IEEE Transactions on Information Theory, 56(3), 1430–1435. DOI.
Schmidt, M. N., Larsen, J., & Hsiao, F.-T. (2007) Wind Noise Reduction using Non-Negative Sparse Coding. In 2007 IEEE Workshop on Machine Learning for Signal Processing (pp. 431–436). DOI.
Singh, A. P., & Gordon, G. J.(2008) A unified view of matrix factorization models. In Machine Learning and Knowledge Discovery in Databases (pp. 358–373). Springer
Soh, Y. S., & Chandrasekaran, V. (2017) A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers. ArXiv:1701.01207 [Cs, Math, Stat].
Sorzano, C. O. S., Vargas, J., & Montano, A. P.(2014) A survey of dimensionality reduction techniques. ArXiv:1403.2877 [Cs, q-Bio, Stat].
Spielman, D. A., & Teng, S.-H. (2004) Nearly-linear Time Algorithms for Graph Partitioning, Graph Sparsification, and Solving Linear Systems. In Proceedings of the Thirty-sixth Annual ACM Symposium on Theory of Computing (pp. 81–90). New York, NY, USA: ACM DOI.
Spielman, D. A., & Teng, S.-H. (2006) Nearly-Linear Time Algorithms for Preconditioning and Solving Symmetric, Diagonally Dominant Linear Systems. ArXiv:Cs/0607105.
Spielman, D. A., & Teng, S.-H. (2008a) A Local Clustering Algorithm for Massive Graphs and its Application to Nearly-Linear Time Graph Partitioning. ArXiv:0809.3232 [Cs].
Spielman, D. A., & Teng, S.-H. (2008b) Spectral Sparsification of Graphs. ArXiv:0808.4134 [Cs].
Spielman, D., & Srivastava, N. (2011) Graph Sparsification by Effective Resistances. SIAM Journal on Computing, 40(6), 1913–1926. DOI.
Sra, S., & Dhillon, I. S.(2006) Generalized Nonnegative Matrix Approximations with Bregman Divergences. In Y. Weiss, B. Schölkopf, & J. C. Platt (Eds.), Advances in Neural Information Processing Systems 18 (pp. 283–290). MIT Press
Sun, Y., & Stein, M. L.(2016) Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets. Journal of Computational and Graphical Statistics, 25(1), 187–208. DOI.
Tropp, J. A., Yurtsever, A., Udell, M., & Cevher, V. (2016) Randomized single-view algorithms for low-rank matrix approximation. ArXiv:1609.00048 [Cs, Math, Stat].
Tufts, D. W., & Kumaresan, R. (1982) Estimation of frequencies of multiple sinusoids: Making linear prediction perform like maximum likelihood. Proceedings of the IEEE, 70(9), 975–989. DOI.
Tung, F., & Little, J. J.(n.d.) Factorized Binary Codes for Large-Scale Nearest Neighbor Search.
Türkmen, A. C.(2015) A Review of Nonnegative Matrix Factorization Methods for Clustering. ArXiv:1507.03194 [Cs, Stat].
Vaz, C., Toutios, A., & Narayanan, S. S.(2016) Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data. (pp. 963–967). DOI.
Vincent, E., Bertin, N., & Badeau, R. (2008) Harmonic and inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch transcription. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 109–112). DOI.
Virtanen, T. (2007) Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria. IEEE Transactions on Audio, Speech, and Language Processing, 15(3), 1066–1074. DOI.
Vishnoi, N. K.(2013) Lx = b. Foundations and Trends® in Theoretical Computer Science, 8(1–2), 1–141. DOI.
Wang, B., Hu, Y., Gao, J., Sun, Y., Chen, H., & Yin, B. (2017) Locality Preserving Projections for Grassmann manifold. In PRoceedings of IJCAI, 2017.
Wang, S., Gittens, A., & Mahoney, M. W.(2017) Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging. ArXiv:1702.04837 [Cs, Stat].
Wang, Y., & Jia, Y. (2004) Fisher non-negative matrix factorization for learning local features. In In Proc. Asian Conf. on Comp. Vision (pp. 27–30).
Wang, Y. X., & Zhang, Y. J.(2013) Nonnegative Matrix Factorization: A Comprehensive Review. IEEE Transactions on Knowledge and Data Engineering, 25(6), 1336–1353. DOI.
Woodruff, D. P.(2014) Sketching as a Tool for Numerical Linear Algebra. Foundations and Trends® in Theoretical Computer Science, 10(1–2), 1–157. DOI.
Woolfe, F., Liberty, E., Rokhlin, V., & Tygert, M. (2008) A fast randomized algorithm for the approximation of matrices. Applied and Computational Harmonic Analysis, 25(3), 335–366. DOI.
Yang, J., Meng, X., & Mahoney, M. W.(2015) Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments. ArXiv:1502.03032 [Cs, Math, Stat].
Yang, W., & Xu, H. (2015) Streaming Sparse Principal Component Analysis. In Journal of Machine Learning Research (pp. 494–503).
Yin, M., Gao, J., & Lin, Z. (2016) Laplacian Regularized Low-Rank Representation and Its Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(3), 504–517. DOI.
Yu, C. D., March, W. B., & Biros, G. (2017) An $N log N$ Parallel Fast Direct Solver for Kernel Matrices. In arXiv:1701.02324 [cs].
Yu, H.-F., Hsieh, C.-J., Si, S., & Dhillon, I. S.(2012) Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems. In IEEE International Conference of Data Mining (pp. 765–774). DOI.
Yu, H.-F., Hsieh, C.-J., Si, S., & Dhillon, I. S.(2014) Parallel matrix factorization for recommender systems. Knowledge and Information Systems, 41(3), 793–819. DOI.
Zass, R., & Shashua, A. (2005) A Unifying Approach to Hard and Probabilistic Clustering. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05) Volume 1 - Volume 01 (pp. 294–301). Washington, DC, USA: IEEE Computer Society DOI.
Zhang, K., Liu, C., Zhang, J., Xiong, H., Xing, E., & Ye, J. (2017) Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 615–623). New York, NY, USA: ACM DOI.
Zhang, X., Wang, L., & Gu, Q. (2017) Stochastic Variance-reduced Gradient Descent for Low-rank Matrix Recovery from Linear Measurements. ArXiv:1701.00481 [Stat].
Zhang, Z., Ding, C., Li, T., & Zhang, X. (2007) Binary matrix factorization with applications. In Seventh IEEE International Conference on Data Mining, 2007. ICDM 2007 (pp. 391–400). IEEE DOI.
Zhou, T., & Tao, D. (2011) GoDec: Randomized low-rank & sparse matrix decomposition in noisy case. In Proceedings of the 28th International Conference on Machine Learning (ICML-11).
Zhou, T., & Tao, D. (2012) Multi-label Subspace Ensemble. Journal of Machine Learning Research.