Stand by for higgledy-piggledy notes on the theme of exploiting sparsity to recover signals from few measurements: sparse regression in a deterministic setting, i.e. where we aren’t doing a regression from noisy data (although the problem might still be random, if we take random measurements).
See also matrix factorisations, random projections, optimisation, model selection, multiple testing, random linear algebra, concentration inequalities.
TBD: the Restricted Isometry Property.
Intros
Terry Tao’s exposition is great as usual.

[…] we can at least give an informal geometric argument as to why \(\ell^1\) minimisation is more likely to recover a sparse solution than \(\ell^2\) minimisation. The set of all \(f\) whose Fourier coefficients match the observed data \(c_\xi\) forms an affine subspace of the space of all functions. The \(\ell^2\) minimiser can then be viewed geometrically by taking \(\ell^2\) balls (i.e. Euclidean balls) centred at the origin, and gradually increasing the radius of the ball until the first point of contact with the affine subspace. In general, there is no reason to expect this point of contact to be sparse (i.e. to lie on a high-codimension coordinate subspace). If however we replace \(\ell^2\) with \(\ell^1\), then the Euclidean balls are replaced by octahedra, which are much “pointier” (especially in high dimensions) and whose corners lie on coordinate subspaces. So the point of first contact is now much more likely to be sparse. The idea of using \(\ell^1\) as a “convex relaxation” of \(\ell^0\) is a powerful one in applied mathematics; see for instance Trop06 on the topic.
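That geometric picture is easy to check numerically. Below is a minimal sketch (assuming numpy and scipy are available, and using a random Gaussian measurement matrix rather than Fourier samples): the minimum-\(\ell^2\)-norm solution of the underdetermined system is dense and wrong, while \(\ell^1\) minimisation (basis pursuit, posed here as a linear program) recovers the sparse vector.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, k = 50, 20, 3                      # ambient dim, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x                                # m linear measurements of x

# l2: the minimum-norm solution via the pseudoinverse -- generically dense.
x_l2 = np.linalg.pinv(A) @ y

# l1: min ||x||_1  s.t.  A x = y, as an LP via the split x = u - v, u, v >= 0.
res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=[(0, None)] * (2 * n))
x_l1 = res.x[:n] - res.x[n:]

print(np.linalg.norm(x_l1 - x))   # tiny: exact recovery, for most draws
print(np.linalg.norm(x_l2 - x))   # large: the contact point is not sparse
```

With \(m = 20\) random measurements of a 3-sparse vector in dimension 50, basis pursuit recovers \(x\) exactly with overwhelming probability — the “pointy octahedron” story in coordinates.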
Hegde, Baraniuk, Davenport and Duarte have an open-source textbook.
Gabriel Peyré’s Compressed Sensing of Images
Igor Carron’s Sunday Morning Insight: A Quick Panorama of Sensing from Direct Imaging to Machine Learning
Random projections

Locality-Sensitive Hashing (LSH) is an algorithm for solving the approximate or exact Near Neighbor Search problem in high-dimensional spaces.
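A toy instance of one of the simplest LSH families, random-hyperplane hashing (SimHash) — a sketch assuming numpy, not any particular library’s API. Each bit of the signature is the sign of a projection onto a random direction, so vectors at a small angle agree on most bits.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_bits = 64, 32
H = rng.standard_normal((n_bits, d))      # one random hyperplane per bit

def simhash(x):
    """Boolean signature: which side of each hyperplane x falls on."""
    return H @ x > 0

x = rng.standard_normal(d)
near = x + 0.05 * rng.standard_normal(d)  # small perturbation of x
far = rng.standard_normal(d)              # unrelated vector

diff_near = int((simhash(x) != simhash(near)).sum())
diff_far = int((simhash(x) != simhash(far)).sum())
print(diff_near, diff_far)   # few bits differ vs. roughly half the bits
```

The per-bit collision probability for vectors at angle \(\theta\) is \(1 - \theta/\pi\), which is what makes these signatures usable as hash buckets for approximate near-neighbour search.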
John Myles White explains why the Johnson-Lindenstrauss lemma is worth knowing
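The lemma is also easy to poke at empirically; a sketch (numpy assumed), projecting 100 points from dimension 10,000 down to 1,000 and checking how much pairwise distances are distorted:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 100, 10_000, 1_000
X = rng.standard_normal((n, d))
P = rng.standard_normal((d, k)) / np.sqrt(k)   # scaled so E||xP||^2 = ||x||^2
Y = X @ P

def pairwise(Z):
    """All pairwise Euclidean distances between the rows of Z."""
    sq = (Z ** 2).sum(axis=1)
    return np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0))

mask = ~np.eye(n, dtype=bool)                  # ignore zero self-distances
ratios = pairwise(Y)[mask] / pairwise(X)[mask]
print(ratios.min(), ratios.max())   # all distances preserved to within a few percent
```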
Bob Durrant: The Unreasonable Effectiveness of Random Projections in Computer Science
Deterministic projection strategies
Surely this is close to quasi Monte Carlo?
Dustin G. Mixon, Achieving the Welch bound with difference sets:
I blogged about constructing harmonic frames using difference sets. We proved that such harmonic frames are equiangular tight frames, thereby having minimal coherence between columns. I concluded the entry by conjecturing that incoherent harmonic frames are as good for compressed sensing as harmonic frames whose rows were randomly drawn from the discrete Fourier transform (DFT) matrix.
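For reference, the Welch bound in question: an \(m \times n\) matrix with unit-norm columns has worst-case coherence \(\mu \geq \sqrt{(n-m)/(m(n-1))}\), with equality exactly for equiangular tight frames. A quick numerical check (numpy assumed) on a harmonic frame whose rows are drawn at random from the DFT matrix — which generically sits above the bound, unlike the difference-set constructions Mixon describes:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 31, 6
# n x n DFT matrix; keep m randomly chosen rows, normalise the columns.
F = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)
A = F[rng.choice(n, m, replace=False)] / np.sqrt(m)

G = np.abs(A.conj().T @ A)        # |inner products| between columns
np.fill_diagonal(G, 0.0)
mu = G.max()                      # worst-case coherence
welch = np.sqrt((n - m) / (m * (n - 1)))
print(mu, welch)                  # mu >= welch always holds
```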
A variant on the compressed sensing of Yves Meyer
Recent work of Yves Meyer might be relevant:
 A variant on the compressed sensing of Emmanuel Candes, Basarab Matei and Yves Meyer
 Simple quasicrystals are sets of stable sampling, Basarab Matei and Yves Meyer
These papers are interesting because their approach to compressed sensing is very different. Specifically, their sparse vectors are actually functions of compact support with sufficiently small Lebesgue measure. As such, concepts like conditioning are replaced with that of stable sampling, and the results must be interpreted in the context of functional analysis. The papers demonstrate that sampling frequencies according to a (deterministic) simple quasicrystal will uniquely determine sufficiently sparse functions, and furthermore, that the sparsest function in the preimage can be recovered by \(\ell^1\)-minimization provided it is nonnegative.
That phase transition
How well can you recover a sparse vector (or low-rank matrix) from a given number of measurements? In the obvious metrics, performance does not degrade gracefully: for a given sparsity (or rank), there is a sudden jump in how well you do as the number of measurements increases past a threshold.
See Igor Carron’s “phase diagram” list, and work like Weng, Maleki and Zheng, “Overcoming The Limitations of Phase Transition by Higher Order Analysis of Regularization Techniques” (WeMZ16).
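A rough sketch (numpy and scipy assumed) of how such a phase transition shows up in the vector case: fix the number of measurements, sweep the sparsity, and record how often basis pursuit succeeds. The success rate drops off a cliff near a critical sparsity rather than degrading gracefully.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
n, m, trials = 40, 20, 10

def recovers(k):
    """One trial: does basis pursuit exactly recover a k-sparse vector?"""
    x = np.zeros(n)
    x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    # min ||x||_1 s.t. Ax = y, as an LP in the split x = u - v, u, v >= 0.
    res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=A @ x,
                  bounds=[(0, None)] * (2 * n))
    return np.linalg.norm(res.x[:n] - res.x[n:] - x) < 1e-5

rates = {k: np.mean([recovers(k) for _ in range(trials)])
         for k in (2, 5, 8, 12, 16)}
print(rates)   # near 1 at small k, collapsing as k crosses the transition
```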
Refs
 Achl03
 Achlioptas, D. (2003) Database-friendly random projections: Johnson-Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4), 671–687. DOI.
 AzKS15
 Azizyan, M., Krishnamurthy, A., & Singh, A. (2015) Extreme Compressive Sampling for Covariance Estimation. arXiv:1506.00898 [Cs, Math, Stat].
 BJMO12
 Bach, F., Jenatton, R., Mairal, J., & Obozinski, G. (2012) Optimization with Sparsity-Inducing Penalties. Foundations and Trends® in Machine Learning, 4(1), 1–106. DOI.
 Bara07
 Baraniuk, R. G.(2007) Compressive sensing. IEEE Signal Processing Magazine, 24(4).
 Bara08
 Baraniuk, R. G.(2008) Single-pixel imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2), 83–91. DOI.
 BCDH10
 Baraniuk, R. G., Cevher, V., Duarte, M. F., & Hegde, C. (2010) Model-based compressive sensing. IEEE Transactions on Information Theory, 56(4), 1982–2001. DOI.
 BaSB10
 Baron, D., Sarvotham, S., & Baraniuk, R. G.(2010) Bayesian compressive sensing via belief propagation. IEEE Transactions on Signal Processing, 58(1), 269–280. DOI.
 BaMo11
 Bayati, M., & Montanari, A. (2011) The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Transactions on Information Theory, 57(2), 764–785. DOI.
 BeDY16
 Berger, B., Daniels, N. M., & Yu, Y. W.(2016) Computational biology in the 21st century: scaling with compressive algorithms. Communications of the ACM, 59(8), 72–80. DOI.
 BiCh13
 Bian, W., & Chen, X. (2013) Worst-Case Complexity of Smoothing Quadratic Regularization Methods for Non-Lipschitzian Optimization. SIAM Journal on Optimization, 23(3), 1718–1741. DOI.
 BiMa01
 Bingham, E., & Mannila, H. (2001) Random Projection in Dimensionality Reduction: Applications to Image and Text Data. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 245–250). New York, NY, USA: ACM DOI.
 Blan13
 Blanchard, J. D.(2013) Toward deterministic compressed sensing. Proceedings of the National Academy of Sciences, 110(4), 1146–1147. DOI.
 BoSc16
 Borgerding, M., & Schniter, P. (2016) Onsager-Corrected Deep Networks for Sparse Linear Inverse Problems. arXiv:1612.01183 [Cs, Math].
 BrEZ08
 Bruckstein, A. M., Elad, M., & Zibulevsky, M. (2008) Sparse nonnegative solution of a linear system of equations is unique. In 3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008 (pp. 762–767). DOI.
 CaXZ08
 Cai, T. T., Xu, G., & Zhang, J. (2008) On Recovery of Sparse Signals via $\ell_1$ Minimization. arXiv:0805.0149 [Cs].
 CaZh15
 Cai, T. T., & Zhang, A. (2015) ROP: Matrix recovery via rankone projections. The Annals of Statistics, 43(1), 102–138. DOI.
 Cand14
 Candès, E. J.(2014) Mathematics of sparsity (and a few other things). ICM 2014 Proceedings, to appear.
 CaDa11
 Candès, E. J., & Davenport, M. A.(2011) How well can we estimate a sparse vector?. arXiv:1104.5246 [Cs, Math, Stat].
 CaRe09
 Candès, E. J., & Recht, B. (2009) Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6), 717–772. DOI.
 CaRT06a
 Candès, E. J., Romberg, J. K., & Tao, T. (2006a) Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8), 1207–1223. DOI.
 CaRT06b
 Candès, E. J., Romberg, J., & Tao, T. (2006b) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2), 489–509. DOI.
 CaTa06
 Candès, E. J., & Tao, T. (2006) NearOptimal Signal Recovery From Random Projections: Universal Encoding Strategies?. IEEE Transactions on Information Theory, 52(12), 5406–5425. DOI.
 CaTa00
 Candès, E. J., & Tao, T. (n.d.) The uniform uncertainty principle and compressed sensing.
 CaWa08
 Candès, E. J., & Wakin, M. B.(2008) An Introduction To Compressive Sampling. IEEE Signal Processing Magazine, 25(2), 21–30. DOI.
 Carm13
 Carmi, A. Y.(2013) Compressive system identification: sequential methods and entropy bounds. Digital Signal Processing, 23(3), 751–770. DOI.
 Carm14
 Carmi, A. Y.(2014) Compressive System Identification. In A. Y. Carmi, L. Mihaylova, & S. J. Godsill (Eds.), Compressed Sensing & Sparse Filtering (pp. 281–324). Springer Berlin Heidelberg DOI.
 CDHB09
 Cevher, V., Duarte, M. F., Hegde, C., & Baraniuk, R. (2009) Sparse Signal Recovery Using Markov Random Fields. In Advances in Neural Information Processing Systems (pp. 257–264). Curran Associates, Inc.
 ChYi08
 Chartrand, R., & Yin, W. (2008) Iteratively reweighted algorithms for compressive sensing. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2008. ICASSP 2008 (pp. 3869–3872). DOI.
 Chen12
 Chen, X. (2012) Smoothing methods for nonsmooth, nonconvex minimization. Mathematical Programming, 134(1), 71–99. DOI.
 ChZh13
 Chen, X., & Zhou, W. (2013) Convergence of the reweighted ℓ1 minimization algorithm for ℓ2–ℓp minimization. Computational Optimization and Applications, 59(1–2), 47–61. DOI.
 Chre08
 Chretien, S. (2008) An Alternating l1 approach to the compressed sensing problem. arXiv:0809.0660 [Stat].
 Dasg00
 Dasgupta, S. (2000) Experiments with Random Projection. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (pp. 143–151). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
 DaGu03
 Dasgupta, S., & Gupta, A. (2003) An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms, 22(1), 60–65. DOI.
 DaHV12
 Dasgupta, S., Hsu, D., & Verma, N. (2012) A concentration theorem for projections. arXiv Preprint arXiv:1206.6813.
 DaDD04
 Daubechies, I., Defrise, M., & De Mol, C. (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11), 1413–1457. DOI.
 DDFG10
 Daubechies, I., DeVore, R., Fornasier, M., & Güntürk, C. S.(2010) Iteratively reweighted least squares minimization for sparse recovery. Communications on Pure and Applied Mathematics, 63(1), 1–38. DOI.
 Devo98
 DeVore, R. A.(1998) Nonlinear approximation. Acta Numerica, 7, 51–150. DOI.
 DiFr84
 Diaconis, P., & Freedman, D. (1984) Asymptotics of Graphical Projection Pursuit. The Annals of Statistics, 12(3), 793–815.
 Dono06
 Donoho, D. L.(2006) Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306. DOI.
 DoMM09a
 Donoho, D. L., Maleki, A., & Montanari, A. (2009a) Message passing algorithms for compressed sensing: II analysis and validation. In 2010 IEEE Information Theory Workshop (ITW) (pp. 1–5). DOI.
 DoMM09b
 Donoho, D. L., Maleki, A., & Montanari, A. (2009b) Message-passing algorithms for compressed sensing. Proceedings of the National Academy of Sciences, 106(45), 18914–18919. DOI.
 DoMM10
 Donoho, D. L., Maleki, A., & Montanari, A. (2010) Message passing algorithms for compressed sensing: I motivation and construction. In 2010 IEEE Information Theory Workshop (ITW) (pp. 1–5). DOI.
 DuBa13
 Duarte, M. F., & Baraniuk, R. G.(2013) Spectral compressive sensing. Applied and Computational Harmonic Analysis, 35(1), 111–129. DOI.
 FoSr11
 Foygel, R., & Srebro, N. (2011) Fast-rate and optimistic-rate error bounds for L1-regularized regression. arXiv:1108.0373 [Math, Stat].
 FDKV07
 Freund, Y., Dasgupta, S., Kabra, M., & Verma, N. (2007) Learning the structure of manifolds using random projections. In Advances in Neural Information Processing Systems (pp. 473–480).
 GiSB16
 Giryes, R., Sapiro, G., & Bronstein, A. M.(2016) Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?. IEEE Transactions on Signal Processing, 64(13), 3444–3457. DOI.
 GrSi15
 Graff, C. G., & Sidky, E. Y.(2015) Compressive sensing in medical imaging. Applied Optics, 54(8), C23–C44.
 HaLi93
 Hall, P., & Li, K.-C. (1993) On Almost Linearity of Low Dimensional Projections from High Dimensional Data. The Annals of Statistics, 21(2), 867–889.
 HaJN15
 Harchaoui, Z., Juditsky, A., & Nemirovski, A. (2015) Conditional gradient algorithms for normregularized smooth convex optimization. Mathematical Programming, 152(1–2), 75–112. DOI.
 HIKP12a
 Hassanieh, H., Indyk, P., Katabi, D., & Price, E. (2012a) Nearly Optimal Sparse Fourier Transform. In Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing (pp. 563–578). New York, NY, USA: ACM DOI.
 HIKP12b
 Hassanieh, H., Indyk, P., Katabi, D., & Price, E. (2012b) Simple and Practical Algorithm for Sparse Fourier Transform. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1183–1194). Kyoto, Japan: Society for Industrial and Applied Mathematics
 HRLV10
 Hormati, A., Roy, O., Lu, Y. M., & Vetterli, M. (2010) Distributed Sampling of Signals Linked by Sparse Filtering: Theory and Applications. IEEE Transactions on Signal Processing, 58(3), 1095–1109. DOI.
 Hoye04
 Hoyer, P. O.(2004) Nonnegative Matrix Factorization with Sparseness Constraints. Journal of Machine Learning Research, 5(9), 1457–1469.
 Jagg13
 Jaggi, M. (2013) Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization. In Journal of Machine Learning Research (pp. 427–435).
 Kabá14
 Kabán, A. (2014) New Bounds on Compressive Linear Least Squares Regression. In Journal of Machine Learning Research (pp. 448–456).
 KiHa16
 Kim, D., & Haldar, J. P.(2016) Greedy algorithms for nonnegativityconstrained simultaneous sparse recovery. Signal Processing, 125, 274–289. DOI.
 LaGG16
 Lahiri, S., Gao, P., & Ganguli, S. (2016) Random projections of random manifolds. arXiv:1607.04331 [Cs, QBio, Stat].
 LiHC06
 Li, P., Hastie, T. J., & Church, K. W.(2006) Very Sparse Random Projections. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 287–296). New York, NY, USA: ACM DOI.
 LiOs09
 Li, Y., & Osher, S. (2009) Coordinate descent optimization for ℓ1 minimization with application to compressed sensing; a greedy algorithm. Inverse Problems and Imaging, 3(3), 487–503.
 MaMe10
 Matei, B., & Meyer, Y. (2010) Simple quasicrystals are sets of stable sampling. Complex Variables and Elliptic Equations, 55(8–10), 947–964. DOI.
 MaMe00
 Matei, B., & Meyer, Y. (n.d.) A variant on the compressed sensing of Emmanuel Candes.
 MiEl10
 Mishali, M., & Eldar, Y. C.(2010) From Theory to Practice: Sub-Nyquist Sampling of Sparse Wideband Analog Signals. IEEE Journal of Selected Topics in Signal Processing, 4(2), 375–391. DOI.
 MEDS09
 Mishali, M., Eldar, Y. C., Dounaevsky, O., & Shoshan, E. (2009) Xampling: Analog to Digital at Sub-Nyquist Rates. arXiv:0912.2495 [Physics].
 MiEE09
 Mishali, M., Eldar, Y. C., & Elron, A. (2009) Xampling: Signal Acquisition and Processing in Union of Subspaces. arXiv:0911.0519 [Cs, Math].
 Mont12
 Montanari, A. (2012) Graphical models concepts in compressed sensing. Compressed Sensing: Theory and Applications, 394–438.
 MoBa17
 Mousavi, A., & Baraniuk, R. G.(2017) Learning to Invert: Signal Recovery via Deep Convolutional Networks. In ICASSP.
 NeTr08
 Needell, D., & Tropp, J. A.(2008) CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. arXiv:0803.2392 [Cs, Math].
 OkLa08
 Oka, A., & Lampe, L. (2008) Compressed sensing of Gauss-Markov random field with wireless sensor networks. In 5th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2008. SAM 2008 (pp. 257–260). DOI.
 OlFi96
 Olshausen, B. A., & Field, D. J.(1996) Natural image statistics and efficient coding. Network (Bristol, England), 7(2), 333–339. DOI.
 OlFi04
 Olshausen, B. A., & Field, D. J.(2004) Sparse coding of sensory inputs. Current Opinion in Neurobiology, 14(4), 481–487. DOI.
 PaRa15
 Pawar, S., & Ramchandran, K. (2015) A robust sub-linear time R-FFAST algorithm for computing a sparse DFT. arXiv:1501.00320 [Cs, Math].
 PeEE10
 Peleg, T., Eldar, Y. C., & Elad, M. (2010) Exploiting Statistical Dependencies in Sparse Representations for Signal Recovery. IEEE Transactions on Signal Processing, 60(5), 2286–2303. DOI.
 RaBr15a
 Ravishankar, S., & Bresler, Y. (2015a) Efficient Blind Compressed Sensing Using Sparsifying Transforms with Convergence Guarantees and Application to MRI. arXiv:1501.02923 [Cs, Stat].
 RaBr15b
 Ravishankar, S., & Bresler, Y. (2015b) Sparsifying Transform Learning With Efficient Optimal Updates and Convergence Guarantees. IEEE Transactions on Signal Processing, 63(9), 2389–2404. DOI.
 RiGr14
 Rish, I., & Grabarnik, G. (2014) Sparse Signal Recovery with Exponential-Family Noise. In A. Y. Carmi, L. Mihaylova, & S. J. Godsill (Eds.), Compressed Sensing & Sparse Filtering (pp. 77–93). Springer Berlin Heidelberg DOI.
 RiGr15
 Rish, I., & Grabarnik, G. Y.(2015) Sparse modeling: theory, algorithms, and applications. Boca Raton, FL: CRC Press, Taylor & Francis Group
 Romb08
 Romberg, J. (2008) Imaging via Compressive Sampling. IEEE Signal Processing Magazine, 25(2), 14–20. DOI.
 RoZh07
 Rosset, S., & Zhu, J. (2007) Piecewise linear regularized solution paths. The Annals of Statistics, 35(3), 1012–1030. DOI.
 RuPE13
 Rubinstein, R., Peleg, T., & Elad, M. (2013) Analysis K-SVD: A Dictionary-Learning Algorithm for the Analysis Sparse Model. IEEE Transactions on Signal Processing, 61(3), 661–677. DOI.
 SaBB06
 Sarvotham, S., Baron, D., & Baraniuk, R. G.(2006) Measurements vs Bits: Compressed Sensing meets Information Theory. In Proc. Allerton Conf. on Comm., Control, and Computing.
 ScRa12
 Schniter, P., & Rangan, S. (2012) Compressive phase retrieval via generalized approximate message passing. In 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton) (pp. 815–822). DOI.
 ShTe11
 Shalev-Shwartz, S., & Tewari, A. (2011) Stochastic Methods for L1-regularized Loss Minimization. J. Mach. Learn. Res., 12, 1865–1892.
 SFJJ15
 Smith, V., Forte, S., Jordan, M. I., & Jaggi, M. (2015) L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework. arXiv:1512.04011 [Cs].
 SoXP15
 Song, R., Xie, Y., & Pokutta, S. (2015) Sequential Information Guided Sensing. arXiv:1509.00130 [Cs, Math, Stat].
 Trop06
 Tropp, J. A.(2006) Just relax: convex programming methods for identifying sparse signals in noise. IEEE Transactions on Information Theory, 52(3), 1030–1051. DOI.
 TrWr10
 Tropp, J. A., & Wright, S. J.(2010) Computational Methods for Sparse Solution of Linear Inverse Problems. Proceedings of the IEEE, 98(6), 948–958. DOI.
 Vett99
 Vetterli, M. (1999) Wavelets: approximation and compression–a review. In AeroSense’99 (Vol. 3723, pp. 28–31). International Society for Optics and Photonics DOI.
 WeVe12
 Weidmann, C., & Vetterli, M. (2012) Rate Distortion Behavior of Sparse Sources. IEEE Transactions on Information Theory, 58(8), 4969–4992. DOI.
 WeMZ16
 Weng, H., Maleki, A., & Zheng, L. (2016) Overcoming The Limitations of Phase Transition by Higher Order Analysis of Regularization Techniques. arXiv:1603.07377 [Cs, Math, Stat].
 WiNa16
 Wipf, D., & Nagarajan, S. (2016) Iterative Reweighted ℓ1 and ℓ2 Methods for Finding Sparse Solutions. Microsoft Research.
 YNGD12
 Yaghoobi, M., Nam, S., Gribonval, R., & Davies, M. E.(2012) Noise aware analysis operator learning for approximately cosparse signals. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5409–5412). DOI.
 YaXu15
 Yang, W., & Xu, H. (2015) Streaming Sparse Principal Component Analysis. In Journal of Machine Learning Research (pp. 494–503).