Compressed sensing and sampling

A fancy ways of counting zero

August 18, 2014 — June 14, 2017

functional analysis

linear algebra

model selection

probabilistic algorithms

probability

signal processing

sparser than thou

Higgledy-piggledy notes on the theme of exploiting sparsity to recover signals from few non-local measurements, given that we know they are nearly sparse, in a sense that will be made clear soon.

1 Basic Compressed Sensing

I’ll follow the intro of (E. J. Candès et al. 2011), which tries to unify many variants.

We attempt to recover a signal \(x_k\in \mathbb{R}^d\) from \(m\ll n\) measurements \(y_k\) of the form

\[y_k =\langle a_k, x\rangle + z_k,\, 1\leq k \leq m,\]

or, as a matrix equation,

\[ y = Ax + z \]

where \(A\) is the \(m \times d\) stacked measurement matrices, and the \(z\) terms denote i.i.d. measurement noise.

Now, if \(x\) is a sparse vector, and \(A\) satisfies a restricted isometry property or something then we can construct an estimate \(\hat{x}\) with small error by minimising

\[ \hat{x}=\min \|\dot{x}\|_1 \text{ subject to } \|A\dot{x}-y\|_2 < \varepsilon, \]

where \(\varepsilon> \|z\|_2^2.\)

In the lecture notes on restricted isometry properties, Candès and Tao talk about not vectors \(x\in \mathbb{R}^d\) but functions \(f:G \mapsto \mathbb{C}\) on Abelian groups like \(G=\mathbb{Z}/d\mathbb{Z},\) which is convenient for some phrasing, since then when I say my signal is \(s\)-sparse, which means that its support \(\operatorname{supp} \tilde{f}=S\subset G\) where \(|S|=s\).

In the finite-dimensional vector framing, we can talk about best sparse approximations \(x_s\) to non-sparse vectors, \(x\).

\[ x_s = \argmin_{\|\dot{x}\|_0\leq s} \|x-\dot{x}\|_2 \]

where all the coefficients apart from the \(s\) largest are zeroed.

The basic results find attractive convex problems with high probability in a nest of nastier ones. There are also greedy optimisation versions, which are formulated as above, but no longer necessarily a convex optimisation; instead, we talk about Orthogonal Matching Pursuit, Iterative Thresholding and some other stuff the details of which I do not yet know, which I think pops up in wavelets and sparse coding.

For all of these the results tend to be something like

with data \(y,\) the difference between my estimate of \(\hat{x}\) and \(\hat{x}_\text{oracle}\) is bounded by something-or-other where the oracle estimate is the one where you know ahead of time the set \(S=\operatorname{supp}(x)\).

Candés gives an example result

\[ \|\hat{x}-x\|_2 \leq C_0\frac{\|x-x_s\|_1}{\sqrt{s}} + C_1\varepsilon \]

conditional upon

\[ \delta_2s(A) < \sqrt{2} -1 \]

where this \(\delta_s(\cdot)\) gives the restricted isometry constant of a matrix, defined as the smallest constant such that \((1-\delta_s(A))\|x\|_2^2\leq \|Ax\|_2^2\leq (1+\delta_s(A))\|x\|_2^2\) for all \(s\)-sparse \())x\). That is, the measurement matrix does not change the norm of sparse signals “much”, and in particular, does not null them when \(\delta_s < 1.\)

This is not the strongest bound out there apparently, but for any of that form, those constants look frustrating.

Measuring the restricted isometry constant of a given measurement matrix is presumably hard, although I haven’t tried yet. But generating random matrices that have a certain RIC with high probability is easy; that’s a neat trick in this area.

2 Redundant compressed sensing

🏗 For now see Frame theory.

3 Introductory texts

Aside: see the rather good practical python notebook in numerical tours.
Terry Tao’s exposition is great as usual. See, e.g.
- Single pixel cameras
- The uniform uncertainty principle and compressed sensing:
[…] we can at least give an informal geometric argument as to why \(\ell^1\) minimisation is more likely to recover a sparse solution than \(\ell^2\) minimisation. The set of all \(f\) whose Fourier coefficients match the observed data \(c_\xi\) forms an affine subspace of the space of all functions. The \(\ell^2\) minimiser can then be viewed geometrically by taking l^2 balls (i.e. Euclidean balls) centred at the origin, and gradually increasing the radius of the ball until the first point of contact with the affine subspace. In general, there is no reason to expect this point of contact to be sparse (i.e. to lie on a high-codimension coordinate subspace). If however we replace \(\ell^2\) with \(\ell^1\), then the Euclidean balls are replaced by octahedra, which are much “pointier” (especially in high dimensions) and whose corners lie on coordinate subspaces. So the point of first contact is now much more likely to be sparse. The idea of using \(\ell^1\) as a “convex relaxation” of \(\ell^0\) is a powerful one in applied mathematics; see for instance (J. A. Tropp 2006) on the topic.
Hegde, Baraniuk, Davenport and Duarte have an open source textbook
Wes McKinney’s intro
RIP vs JL
Gabriel Peyre’s Compressed Sensing of Images
Igor Carron’s Sunday Morning Insight: A Quick Panorama of Sensing from Direct Imaging to Machine Learning
- More Igor Carron Because This Guy Is Hyperactive But Cut Through the Verbiage Because He Notices Some Cool Stuff
- and his intro to compressive sensing page

4 …Using random projections

Classic. Notes under low dimensional projections

5 …Using deterministic projections

Surely this is close to quasi monte carlo?

Dustin G. Mixon Achieving the Welch bound with difference sets

I blogged about constructing harmonic frames using difference sets. We proved that such harmonic frames are equiangular tight frames, thereby having minimal coherence between columns. I concluded the entry by conjecturing that incoherent harmonic frames are as good for compressed sensing as harmonic frames whose rows were randomly drawn from the discrete Fourier transform (DFT) matrix
A variant on the compressed sensing of Yves Meyer
recent work of Yves Meyer might be relevant:
- A variant on the compressed sensing of Emmanuel Candes, Basarab Matei and Yves Meyer
- Simple quasicrystals are sets of stable sampling, Basarab Matei and Yves Meyer
These papers are interesting because their approach to compressed sensing is very different. Specifically, their sparse vectors are actually functions of compact support with sufficiently small Lebesgue measure. As such, concepts like conditioning are replaced with that of stable sampling, and the results must be interpreted in the context of functional analysis. The papers demonstrate that sampling frequencies according to a (deterministic) simple quasicrystal will uniquely determine sufficiently sparse functions, and furthermore, the sparsest function in the preimage can be recovered by L1-minimization provided it’s nonnegative.

6 Bayesian

Sparse Bayes can be tricky. See, perhaps, Bayesian Compressive Sensing.

7 Phase transitions

How well can you recover a matrix from a certain number of measurements? In obvious metrics there is a sudden jump in how well you do with increasing measurements for a given rank. This looks a lot like a physical phase transition, which is a known phenomenon in ML. Hmm.

8 Weird things to be classified

csgm, (Bora et al. 2017) compressed sensing using generative models, tries to find a model which is sparse with respect to… some manifold of the latent variables of… a generative model? or something?

Sparse FFT.

9 References

Achlioptas. 2003. “Database-Friendly Random Projections: Johnson-Lindenstrauss with Binary Coins.” Journal of Computer and System Sciences, Special Issue on PODS 2001,.

Azizyan, Krishnamurthy, and Singh. 2015. “Extreme Compressive Sampling for Covariance Estimation.” arXiv:1506.00898 [Cs, Math, Stat].

Bach, Jenatton, and Mairal. 2011. Optimization With Sparsity-Inducing Penalties. Foundations and Trends(r) in Machine Learning 1.0.

Baraniuk, Richard G. 2007. “Compressive Sensing.” IEEE Signal Processing Magazine.

———. 2008. “Single-Pixel Imaging via Compressive Sampling.” IEEE Signal Processing Magazine.

Baraniuk, Richard G., Cevher, Duarte, et al. 2010. “Model-Based Compressive Sensing.” IEEE Transactions on Information Theory.

Baraniuk, Richard, Davenport, DeVore, et al. 2008. “A Simple Proof of the Restricted Isometry Property for Random Matrices.” Constructive Approximation.

Baron, Sarvotham, and Baraniuk. 2010. “Bayesian Compressive Sensing via Belief Propagation.” IEEE Transactions on Signal Processing.

Bayati, and Montanari. 2011. “The Dynamics of Message Passing on Dense Graphs, with Applications to Compressed Sensing.” IEEE Transactions on Information Theory.

Berger, Daniels, and Yu. 2016. “Computational Biology in the 21st Century: Scaling with Compressive Algorithms.” Communications of the ACM.

Bian, and Chen. 2013. “Worst-Case Complexity of Smoothing Quadratic Regularization Methods for Non-Lipschitzian Optimization.” SIAM Journal on Optimization.

Bingham, and Mannila. 2001. “Random Projection in Dimensionality Reduction: Applications to Image and Text Data.” In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’01.

Blanchard. 2013. “Toward Deterministic Compressed Sensing.” Proceedings of the National Academy of Sciences.

Bora, Jalal, Price, et al. 2017. “Compressed Sensing Using Generative Models.” In International Conference on Machine Learning.

Borgerding, and Schniter. 2016. “Onsager-Corrected Deep Networks for Sparse Linear Inverse Problems.” arXiv:1612.01183 [Cs, Math].

Bruckstein, Elad, and Zibulevsky. 2008a. “Sparse Non-Negative Solution of a Linear System of Equations Is Unique.” In 3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008.

———. 2008b. “On the Uniqueness of Nonnegative Sparse Solutions to Underdetermined Systems of Equations.” IEEE Transactions on Information Theory.

Cai, Xu, and Zhang. 2008. “On Recovery of Sparse Signals via ℓ1 Minimization.” arXiv:0805.0149 [Cs].

Cai, and Zhang. 2015. “ROP: Matrix Recovery via Rank-One Projections.” The Annals of Statistics.

Candès, Emmanuel J. 2014. “Mathematics of Sparsity (and Few Other Things).” ICM 2014 Proceedings, to Appear.

Candès, Emmanuel J., and Davenport. 2011. “How Well Can We Estimate a Sparse Vector?” arXiv:1104.5246 [Cs, Math, Stat].

Candès, Emmanuel J., Eldar, Needell, et al. 2011. “Compressed Sensing with Coherent and Redundant Dictionaries.” Applied and Computational Harmonic Analysis.

Candès, Emmanuel J., and Recht. 2009. “Exact Matrix Completion via Convex Optimization.” Foundations of Computational Mathematics.

Candès, Emmanuel J., Romberg, and Tao. 2006a. “Robust Uncertainty Principles: Exact Signal Reconstruction from Highly Incomplete Frequency Information.” IEEE Transactions on Information Theory.

Candès, Emmanuel J., Romberg, and Tao. 2006b. “Stable Signal Recovery from Incomplete and Inaccurate Measurements.” Communications on Pure and Applied Mathematics.

Candès, Emmanuel, and Tao. 2005. “Decoding by Linear Programming.” IEEE Transactions on Information Theory.

Candès, Emmanuel J., and Tao. 2006. “Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?” IEEE Transactions on Information Theory.

———. 2008. “The Uniform Uncertainty Principle and Compressed Sensing.”

Candès, Emmanuel J., and Wakin. 2008. “An Introduction To Compressive Sampling.” IEEE Signal Processing Magazine.

Carmi. 2013. “Compressive System Identification: Sequential Methods and Entropy Bounds.” Digital Signal Processing.

———. 2014. “Compressive System Identification.” In Compressed Sensing & Sparse Filtering. Signals and Communication Technology.

Cevher, Duarte, Hegde, et al. 2009. “Sparse Signal Recovery Using Markov Random Fields.” In Advances in Neural Information Processing Systems.

Chartrand, and Yin. 2008. “Iteratively Reweighted Algorithms for Compressive Sensing.” In IEEE International Conference on Acoustics, Speech and Signal Processing, 2008. ICASSP 2008.

Chen. 2012. “Smoothing Methods for Nonsmooth, Nonconvex Minimization.” Mathematical Programming.

Chen, and Zhou. 2013. “Convergence of the Reweighted ℓ.” Computational Optimization and Applications.

Chretien. 2008. “An Alternating L1 Approach to the Compressed Sensing Problem.” arXiv:0809.0660 [Stat].

Dasgupta. 2000. “Experiments with Random Projection.” In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence. UAI’00.

Dasgupta, and Gupta. 2003. “An Elementary Proof of a Theorem of Johnson and Lindenstrauss.” Random Structures & Algorithms.

Dasgupta, Hsu, and Verma. 2006. “A Concentration Theorem for Projections.” In Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence. UAI’06.

Daubechies, I., Defrise, and De Mol. 2004. “An Iterative Thresholding Algorithm for Linear Inverse Problems with a Sparsity Constraint.” Communications on Pure and Applied Mathematics.

Daubechies, Ingrid, DeVore, Fornasier, et al. 2010. “Iteratively Reweighted Least Squares Minimization for Sparse Recovery.” Communications on Pure and Applied Mathematics.

DeVore. 1998. “Nonlinear Approximation.” Acta Numerica.

Diaconis, and Freedman. 1984. “Asymptotics of Graphical Projection Pursuit.” The Annals of Statistics.

Donoho, David L. 2006. “Compressed Sensing.” IEEE Transactions on Information Theory.

Donoho, David L., and Elad. 2003. “Optimally Sparse Representation in General (Nonorthogonal) Dictionaries via ℓ1 Minimization.” Proceedings of the National Academy of Sciences.

Donoho, D. L., Elad, and Temlyakov. 2006. “Stable Recovery of Sparse Overcomplete Representations in the Presence of Noise.” IEEE Transactions on Information Theory.

Donoho, David L., Maleki, and Montanari. 2009a. “Message-Passing Algorithms for Compressed Sensing.” Proceedings of the National Academy of Sciences.

———. 2009b. “Message Passing Algorithms for Compressed Sensing: II. Analysis and Validation.” In 2010 IEEE Information Theory Workshop (ITW).

Donoho, David L., Maleki, and Montanari. 2010. “Message Passing Algorithms for Compressed Sensing: I. Motivation and Construction.” In 2010 IEEE Information Theory Workshop (ITW).

Duarte, and Baraniuk. 2013. “Spectral Compressive Sensing.” Applied and Computational Harmonic Analysis.

Flammia, Gross, Liu, et al. 2012. “Quantum Tomography via Compressed Sensing: Error Bounds, Sample Complexity, and Efficient Estimators.” New Journal of Physics.

Foygel, and Srebro. 2011. “Fast-Rate and Optimistic-Rate Error Bounds for L1-Regularized Regression.” arXiv:1108.0373 [Math, Stat].

Freund, Dasgupta, Kabra, et al. 2007. “Learning the Structure of Manifolds Using Random Projections.” In Advances in Neural Information Processing Systems.

Giryes, Sapiro, and Bronstein. 2016. “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?” IEEE Transactions on Signal Processing.

Graff, and Sidky. 2015. “Compressive Sensing in Medical Imaging.” Applied Optics.

Hall, and Li. 1993. “On Almost Linearity of Low Dimensional Projections from High Dimensional Data.” The Annals of Statistics.

Harchaoui, Juditsky, and Nemirovski. 2015. “Conditional Gradient Algorithms for Norm-Regularized Smooth Convex Optimization.” Mathematical Programming.

Hassanieh, Haitham, Indyk, Katabi, et al. 2012. “Nearly Optimal Sparse Fourier Transform.” In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing. STOC ’12.

Hassanieh, H., Indyk, Katabi, et al. 2012. “Simple and Practical Algorithm for Sparse Fourier Transform.” In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms. Proceedings.

Hegde, and Baraniuk. 2012. “Signal Recovery on Incoherent Manifolds.” IEEE Transactions on Information Theory.

Hormati, Roy, Lu, et al. 2010. “Distributed Sampling of Signals Linked by Sparse Filtering: Theory and Applications.” IEEE Transactions on Signal Processing.

Hoyer. n.d. “Non-Negative Matrix Factorization with Sparseness Constraints.” Journal of Machine Learning Research.

Jaggi. 2013. “Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization.” In Journal of Machine Learning Research.

Jung, Heckel, Bölcskei, et al. 2013. “Compressive Nonparametric Graphical Model Selection For Time Series.” arXiv:1311.3257 [Stat].

Kabán. 2014. “New Bounds on Compressive Linear Least Squares Regression.” In Journal of Machine Learning Research.

Kim, and Haldar. 2016. “Greedy Algorithms for Nonnegativity-Constrained Simultaneous Sparse Recovery.” Signal Processing.

Lahiri, Gao, and Ganguli. 2016. “Random Projections of Random Manifolds.” arXiv:1607.04331 [Cs, q-Bio, Stat].

Launay, Poli, Boniface, et al. 2020. “Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures.” In Advances in Neural Information Processing Systems.

Li, Ping, Hastie, and Church. 2006. “Very Sparse Random Projections.” In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’06.

Li, Yingying, and Osher. 2009. “Coordinate Descent Optimization for ℓ 1 Minimization with Application to Compressed Sensing; a Greedy Algorithm.” Inverse Problems and Imaging.

Matei, and Meyer. 2010. “Simple Quasicrystals Are Sets of Stable Sampling.” Complex Variables and Elliptic Equations.

———. n.d. “A Variant on the Compressed Sensing of Emmanuel Candes.”

Mishali, and Eldar. 2010. “From Theory to Practice: Sub-Nyquist Sampling of Sparse Wideband Analog Signals.” IEEE Journal of Selected Topics in Signal Processing.

Montanari. 2012. “Graphical Models Concepts in Compressed Sensing.” Compressed Sensing: Theory and Applications.

Mousavi, and Baraniuk. 2017. “Learning to Invert: Signal Recovery via Deep Convolutional Networks.” In ICASSP.

Needell, and Tropp. 2008. “CoSaMP: Iterative Signal Recovery from Incomplete and Inaccurate Samples.” arXiv:0803.2392 [Cs, Math].

Oka, and Lampe. 2008. “Compressed Sensing of Gauss-Markov Random Field with Wireless Sensor Networks.” In 5th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2008. SAM 2008.

Olshausen, B. A., and Field. 1996. “Natural image statistics and efficient coding.” Network (Bristol, England).

Olshausen, Bruno A, and Field. 2004. “Sparse Coding of Sensory Inputs.” Current Opinion in Neurobiology.

Oxvig, Arildsen, and Larsen. 2017. “Generalized Approximate Message Passing: Relations and Derivations.”

Pawar, and Ramchandran. 2015. “A Robust Sub-Linear Time R-FFAST Algorithm for Computing a Sparse DFT.” arXiv:1501.00320 [Cs, Math].

Peleg, Eldar, and Elad. 2010. “Exploiting Statistical Dependencies in Sparse Representations for Signal Recovery.” IEEE Transactions on Signal Processing.

Qiuyun Zou, Haochuan Zhang, Chao-Kai Wen, et al. 2018. “Concise Derivation for Generalized Approximate Message Passing Using Expectation Propagation.” IEEE Signal Processing Letters.

Rangan. 2011. “Generalized Approximate Message Passing for Estimation with Random Linear Mixing.” In 2011 IEEE International Symposium on Information Theory Proceedings.

Ravishankar, Saiprasad, and Bresler. 2015. “Efficient Blind Compressed Sensing Using Sparsifying Transforms with Convergence Guarantees and Application to MRI.” arXiv:1501.02923 [Cs, Stat].

Ravishankar, S., and Bresler. 2015. “Sparsifying Transform Learning With Efficient Optimal Updates and Convergence Guarantees.” IEEE Transactions on Signal Processing.

Rish, and Grabarnik. 2014. “Sparse Signal Recovery with Exponential-Family Noise.” In Compressed Sensing & Sparse Filtering. Signals and Communication Technology.

Rish, and Grabarnik. 2015. Sparse Modeling: Theory, Algorithms, and Applications. Chapman & Hall/CRC Machine Learning & Pattern Recognition Series.

Romberg. 2008. “Imaging via Compressive Sampling.” IEEE Signal Processing Magazine.

Rosset, and Zhu. 2007. “Piecewise Linear Regularized Solution Paths.” The Annals of Statistics.

Rubinstein, Peleg, and Elad. 2013. “Analysis K-SVD: A Dictionary-Learning Algorithm for the Analysis Sparse Model.” IEEE Transactions on Signal Processing.

Sarvotham, Baron, and Baraniuk. 2006. “Measurements Vs. Bits: Compressed Sensing Meets Information Theory.” In In Proc. Allerton Conf. On Comm., Control, and Computing.

Schniter, and Rangan. 2012. “Compressive Phase Retrieval via Generalized Approximate Message Passing.” In 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

Shalev-Shwartz, and Tewari. 2011. “Stochastic Methods for L1-Regularized Loss Minimization.” Journal of Machine Learning Research.

Smith, Forte, Jordan, et al. 2015. “L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework.” arXiv:1512.04011 [Cs].

Song, Xie, and Pokutta. 2015. “Sequential Information Guided Sensing.” arXiv:1509.00130 [Cs, Math, Stat].

Tropp, J.A. 2006. “Just Relax: Convex Programming Methods for Identifying Sparse Signals in Noise.” IEEE Transactions on Information Theory.

Tropp, J. A., and Wright. 2010. “Computational Methods for Sparse Solution of Linear Inverse Problems.” Proceedings of the IEEE.

Vetterli. 1999. “Wavelets: Approximation and Compression–a Review.” In AeroSense’99.

Weidmann, and Vetterli. 2012. “Rate Distortion Behavior of Sparse Sources.” IEEE Transactions on Information Theory.

Wipf, and Nagarajan. 2016. “Iterative Reweighted L1 and L2 Methods for Finding Sparse Solution.” Microsoft Research.

Wu, R., Huang, and Chen. 2013. “The Exact Support Recovery of Sparse Signals With Noise via Orthogonal Matching Pursuit.” IEEE Signal Processing Letters.

Wu, Yan, Rosca, and Lillicrap. 2019. “Deep Compressed Sensing.” In International Conference on Machine Learning.

Yaghoobi, Nam, Gribonval, et al. 2012. “Noise Aware Analysis Operator Learning for Approximately Cosparse Signals.” In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Yang, and Xu. 2015. “Streaming Sparse Principal Component Analysis.” In Journal of Machine Learning Research.

Zhang, Liu, Zhang, et al. 2017. “Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling.” In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’17.