# The Living Thing / Notebooks : Sparse basis dictionary expansions

Regression with dictionaries of basis functions, with respect to which you expect your representation to be sparse; i.e. in the statistical case, basis-sparse regression. But even outside statistics, you wish simply to approximate some data compactly. My focus here is on the noisy-observation case, although the same results are recycled enough throughout the field.

Orthogonal matching pursuit, wavelet, curvelets, chirplets, framelets, shearlets, whateverlets, content-specific basis dictionaries, designed or learned. Mammals visual cortexes seem to do some thing like this.

If they are not intended to be sparsely coded, it’s probably a plain old linear basis expansion or a relative.

To discuss:

• connection to mixture models.
• Sampling complexity versus approximation complexity
• am especially interested in approaches where we learn the dictionary basis unsupervised (see MBPS09 for the recent innovations there)

## Shift-invariant bases

I would like to find a general way of doing this in a phase/shift-robust fashion, although doing this naively can be computationally expensive outside of certain convenient bases. It can be better if you know you have have constraints, say, a convex combination (e.g. mixtures) or positive definite bases. (e.g. in kernel methods)

One method is “Shift Invariant Sparse coding”, ( BlDa04) and there are various versions out there. (GRKN07 One way is to include multiple shifted copies of your atoms, another is to actually shift them in a separate optimisation stage. Both these get annoying in the time domain for various reasons.

Affine tight framelets (DHRS03) and their less computationally tractable, more flexible cousins, shearlets also sound interesting here. For reasons I do not yet understand I am told they can naturally be used on graphs and manifolds, not just lattices. I just saw Xiaosheng Zhuang present these (e.g. HaZZ16 and WaZh16, where WaZh16 demonstrates a “Fast Framelet Transform” which is computationally as attractive as the FFT, supposedly.)

I have some ideas I call learning gamelan.

## Implementations

This boils down to clever optimisation to make the calculations tractable.

• Theano example of dictionary learning by Daniel LaCombe

• all the wavelet toolkits do this.

• Sparse-filtering: Unsupervised feature learning based on sparse-filtering

This implements the method described Jiquan Ngiam, Pang Wei Koh, Zhenghao Chen, Sonia Bhaskar, Andrew Y. Ng: Sparse Filtering. NIPS 2011: 1125-1133 and is based on the Matlab code provided in the supplementary material

• spams does this (see optimisation)

AGMM15
Arora, S., Ge, R., Ma, T., & Moitra, A. (2015) Simple, Efficient, and Neural Algorithms for Sparse Coding. arXiv:1503.00778 [Cs, Stat].
BaJo06
Bach, F. R., & Jordan, M. I.(2006) Learning spectral clustering, with application to speech separation. Journal of Machine Learning Research, 7(Oct), 1963–2001.
BCDH10
Baraniuk, R. G., Cevher, V., Duarte, M. F., & Hegde, C. (2010) Model-based compressive sensing. IEEE Transactions on Information Theory, 56(4), 1982–2001. DOI.
BLMM12
Barthélemy, Q., Larue, A., Mayoue, A., Mercier, D., & Mars, J. I.(2012) Shift & 2D rotation invariant sparse coding for multivariate signals. IEEE Transactions on Signal Processing, 60(4), 1597–1611.
BePR11
Bertin, K., Pennec, E. L., & Rivoirard, V. (2011) Adaptive Dantzig density estimation. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 47(1), 43–74. DOI.
BlDa04
Blumensath, T., & Davies, M. (2004) On Shift-Invariant Sparse Coding. In C. G. Puntonet & A. Prieto (Eds.), Independent Component Analysis and Blind Signal Separation (Vol. 3195, pp. 1205–1212). Berlin, Heidelberg: Springer Berlin Heidelberg
BlDa06
Blumensath, T., & Davies, M. (2006) Sparse and shift-Invariant representations of music. IEEE Transactions on Audio, Speech and Language Processing, 14(1), 50–57. DOI.
Boye11
Boyes, G. (2011) Dictionary-Based Analysis/Synthesis and Structured Representations of Musical Audio. . McGill University
BrEZ08
Bruckstein, A. M., Elad, M., & Zibulevsky, M. (2008) On the Uniqueness of Nonnegative Sparse Solutions to Underdetermined Systems of Equations. IEEE Transactions on Information Theory, 54(11), 4813–4820. DOI.
CaCS08
Cai, J.-F., Chan, R. H., & Shen, Z. (2008) A framelet-based image inpainting algorithm. Applied and Computational Harmonic Analysis, 24(2), 131–149. DOI.
CVVR11
Carabias-Orti, J. J., Virtanen, T., Vera-Candeas, P., Ruiz-Reyes, N., & Canadas-Quesada, F. J.(2011) Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization. IEEE Journal of Selected Topics in Signal Processing, 5(6), 1144–1158. DOI.
ChDo94
Chen, S., & Donoho, D. L.(1994) Basis pursuit. In 1994 Conference Record of the Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, 1994 (Vol. 1, pp. 41–44 vol.1). DOI.
Daubechies, I., Defrise, M., & De Mol, C. (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11), 1413–1457. DOI.
DHRS03
Daubechies, Ingrid, Han, B., Ron, A., & Shen, Z. (2003) Framelets: MRA-based constructions of wavelet frames. Applied and Computational Harmonic Analysis, 14(1), 1–46. DOI.
Davi98
Davis, G. M.(1998) A wavelet-based analysis of fractal image compression. IEEE Transactions on Image Processing, 7(2), 141–154. DOI.
DaMZ94
Davis, G. M., Mallat, S. G., & Zhang, Z. (1994) Adaptive time-frequency decompositions. Optical Engineering, 33(7), 2183–2191. DOI.
Devo98
DeVore, R. A.(1998) Nonlinear approximation. Acta Numerica, 7, 51–150. DOI.
Dong00
Dong, B. (n.d.) Sparse representation on graphs by tight wavelet frames and applications. Applied and Computational Harmonic Analysis. DOI.
Dono06
Donoho, D. L.(2006) Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306. DOI.
DoJo95
Donoho, D. L., & Johnstone, I. M.(1995) Adapting to Unknown Smoothness via Wavelet Shrinkage. Journal of the American Statistical Association, 90(432), 1200–1224. DOI.
DJKP95
Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., & Picard, D. (1995) Wavelet Shrinkage: Asymptopia?. Journal of the Royal Statistical Society. Series B (Methodological), 57(2), 301–369.
DuKL06
Du, P., Kibbe, W. A., & Lin, S. M.(2006) Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching. Bioinformatics, 22(17), 2059–2065. DOI.
EkTS11
Ekanadham, C., Tranchina, D., & Simoncelli, E. P.(2011) Recovery of Sparse Translation-Invariant Signals With Continuous Basis Pursuit. IEEE Transactions on Signal Processing, 59(10), 4735–4744. DOI.
FaLi01
Fan, J., & Li, R. (2001) Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties. Journal of the American Statistical Association, 96(456), 1348–1360. DOI.
GRCL17
Garg, S., Rish, I., Cecchi, G., & Lozano, A. (2017) Neurogenesis-Inspired Dictionary Learning: Online Model Adaption in a Changing World. arXiv:1701.06106 [Cs, Stat].
GiNi09
Giné, E., & Nickl, R. (2009) Uniform limit theorems for wavelet density estimators. The Annals of Probability, 37(4), 1605–1646. DOI.
GiSB16
Giryes, R., Sapiro, G., & Bronstein, A. M.(2016) Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?. IEEE Transactions on Signal Processing, 64(13), 3444–3457. DOI.
Good97
Goodwin, M. M.(1997) Adaptive Signal Models: Theory, Algorithms, and Audio Applications (phdthesis). . University of California, Berkeley, Berkeley, CA
Good01
Goodwin, M. M.(2001) Multiscale overlap-add sinusoidal modeling using matching pursuit and refinements. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
GoVe99
Goodwin, M. M., & Vetterli, M. (1999) Matching pursuit and atomic signal models based on recursive filter banks. IEEE Trans. Signal Process., 47(7), 1890–1902. DOI.
GoVe97
Goodwin, M., & Vetterli, M. (1997) Atomic decompositions of audio signals. In 1997 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, 1997 (p. 4 pp.-). DOI.
GrLe10
Gregor, K., & LeCun, Y. (2010) Learning fast approximations of sparse coding. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 399–406).
GrLe11
Gregor, K., & LeCun, Y. (2011) Efficient Learning of Sparse Invariant Representations. arXiv:1105.5307 [Cs].
GRKN07
Grosse, R., Raina, R., Kwong, H., & Ng, A. Y.(2007) Shift-Invariant Sparse Coding for Audio Classification. In The Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007) (Vol. 9, p. 8).
HaZZ16
Han, B., Zhao, Z., & Zhuang, X. (2016) Directional tensor product complex tight framelets with low redundancy. Applied and Computational Harmonic Analysis, 41(2), 603–637. DOI.
HaZh15
Han, B., & Zhuang, X. (2015) Smooth affine shear tight frames with MRA structure. Applied and Computational Harmonic Analysis, 39(2), 300–338. DOI.
HaSh00
Han, K., & Shin, H. (n.d.) Functional Linear Regression for Functional Response via Sparse Basis Selection.
HaRR15
Hansen, N. R., Reynaud-Bouret, P., & Rivoirard, V. (2015) Lasso and probabilistic inequalities for multivariate point processes. Bernoulli, 21(1), 83–143. DOI.
HaSG06
Harte, C., Sandler, M., & Gasser, M. (2006) Detecting Harmonic Change in Musical Audio. In Proceedings of the 1st ACM Workshop on Audio and Music Computing Multimedia (pp. 21–26). New York, NY, USA: ACM DOI.
HJKL00
Henaff, M., Jarrett, K., Kavukcuoglu, K., & LeCun, Y. (n.d.) Unsupervised learning of sparse features for scalable audio classification.
HuCB08
Huang, C., Cheang, G. L. H., & Barron, A. R.(2008) Risk of penalized least squares, greedy selection and l1 penalization for flexible function libraries.
HyHo00
Hyvärinen, A., & Hoyer, P. (2000) Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces. Neural Computation, 12(7), 1705–1720. DOI.
JaPl11
Jafari, M. G., & Plumbley, M. D.(2011) Fast Dictionary Learning for Sparse Representations of Speech Signals. IEEE Journal of Selected Topics in Signal Processing, 5(5), 1025–1031. DOI.
JGPZ10
Jaillet, F., Gribonval, R., Plumbley, M. D., & Zayyani, H. (2010) An L1 criterion for dictionary learning by subspace identification. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 5482–5485). DOI.
Jung13
Jung, A. (2013) An RKHS Approach to Estimation with Sparsity Constraints. In Advances in Neural Information Processing Systems 29.
KoCo16
Koch, P., & Corso, J. J.(2016) Sparse Factorization Layers for Neural Networks with Limited Supervision. arXiv:1612.04468 [Cs, Stat].
KWSR16
Koppel, A., Warnell, G., Stump, E., & Ribeiro, A. (2016) Parsimonious Online Learning with Kernels via Sparse Projections in Function Space. arXiv:1612.04111 [Cs, Stat].
LBRN07
Lee, H., Battle, A., Raina, R., & Ng, A. Y.(2007) Efficient sparse coding algorithms. Advances in Neural Information Processing Systems, 19, 801.
LeSe99
Lewicki, M S, & Sejnowski, T. J.(1999) Coding time-varying signals using sparse, shift-invariant representations. In NIPS (Vol. 11, pp. 730–736). Denver, CO: MIT Press
LeSe00
Lewicki, Michael S., & Sejnowski, T. J.(2000) Learning Overcomplete Representations. Neural Computation, 12(2), 337–365. DOI.
LiTX16
Liu, T., Tao, D., & Xu, D. (2016) Dimensionality-Dependent Generalization Bounds for \$k\$-Dimensional Coding Schemes. arXiv:1601.00238 [Cs, Stat].
MGVB11
Mailhé, B., Gribonval, R., Vandergheynst, P., & Bimbot, F. (2011) Fast orthogonal sparse approximation algorithms over local dictionaries. Signal Processing, 91(12), 2822–2835. DOI.
MaBP14
Mairal, J., Bach, F., & Ponce, J. (2014) Sparse Modeling for Image and Vision Processing. Foundations and Trends® in Comput Graph. Vis., 8(2–3), 85–283. DOI.
MBPS09
Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2009) Online Dictionary Learning for Sparse Coding. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 689–696). New York, NY, USA: ACM DOI.
MBPS10
Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010) Online learning for matrix factorization and sparse coding. The Journal of Machine Learning Research, 11, 19–60.
MaZh93
Mallat, S. G., & Zhang, Z. (1993) Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12), 3397–3415.
MaZh92
Mallat, S., & Zhang, Z. (1992) Adaptive time-frequency decomposition with matching pursuits. In Time-Frequency and Time-Scale Analysis, 1992., Proceedings of the IEEE-SP International Symposium (pp. 7–10). DOI.
MaMD14
Marcus, G., Marblestone, A., & Dean, T. (2014) The atoms of neural computation. Science (New York, N.Y.), 346(6209), 551–552. DOI.
Mlyn13
Mlynarski, W. (2013) Sparse, complex-valued representations of natural sounds learned with phase and amplitude continuity priors. arXiv Preprint arXiv:1312.4695.
MoPe10
Mondal, D., & Percival, D. B.(2010) M-estimation of wavelet variance. Annals of the Institute of Statistical Mathematics, 64(1), 27–53. DOI.
MøSH07
Mørup, M., Schmidt, M. N., & Hansen, L. K.(2007) Shift invariant sparse coding of image and music data. Journal of Machine Learning Research.
OlFi96
Olshausen, B. A., & Field, D. J.(1996) Natural image statistics and efficient coding. Network (Bristol, England), 7(2), 333–339. DOI.
OlFi04
Olshausen, Bruno A, & Field, D. J.(2004) Sparse coding of sensory inputs. Current Opinion in Neurobiology, 14(4), 481–487. DOI.
OpWY01
Opsomer, J., Wang, Y., & Yang, Y. (2001) Nonparametric Regression with Correlated Errors. Statistical Science, 16(2), 134-153.
PABD06
Plumbley, M. D., Abdallah, S. A., Blumensath, T., & Davies, M. E.(2006) Sparse representations of polyphonic music. Signal Processing, 86(3), 417–431. DOI.
QiCh94
Qian, S., & Chen, D. (1994) Signal representation using adaptive normalized Gaussian functions. Signal Processing, 36(1), 1–11. DOI.
RaBr15
Ravishankar, S., & Bresler, Y. (2015) Efficient Blind Compressed Sensing Using Sparsifying Transforms with Convergence Guarantees and Application to MRI. arXiv:1501.02923 [Cs, Stat].
RuBE10
Rubinstein, R., Bruckstein, A. M., & Elad, M. (2010) Dictionaries for Sparse Representation Modeling. Proceedings of the IEEE, 98(6), 1045–1057. DOI.
RuZE08
Rubinstein, R., Zibulevsky, M., & Elad, M. (2008) Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit. CS Technion, 40.
SiOl01
Simoncelli, E. P., & Olshausen, B. A.(2001) Natural Image Statistics and Neural Representation. Annual Review of Neuroscience, 24(1), 1193–1216. DOI.
SmLe06
Smith, E. C., & Lewicki, M. S.(2006) Efficient auditory coding. Nature, 439(7079), 978–982. DOI.
Smoo00
Smooth affine shear tight frames: digitization and applications. (n.d.) http://personal.cityu.edu.hk/~xzhuang7/pubs/Conf-2015-Z.pdf.
SoCh17
Soh, Y. S., & Chandrasekaran, V. (2017) A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers. arXiv:1701.01207 [Cs, Math, Stat].
ToCo98
Torrence, C., & Compo, G. P.(1998) A Practical Guide to Wavelet Analysis. Bulletin of the American Meteorological Society, 79(1), 61–78.
ToFr11
Tošić, I., & Frossard, P. (2011) Dictionary learning: What is the right representation for my signal?. IEEE Signal Processing Magazine, 28(2), 27–38. DOI.
TsDo06a
Tsaig, Y., & Donoho, D. L.(2006a) Breakdown of equivalence between the minimal -norm solution and the sparsest solution. Signal Processing, 86(3), 533–548. DOI.
TsDo06b
Tsaig, Y., & Donoho, D. L.(2006b) Extensions of compressed sensing. Signal Processing, 86(3), 549–571. DOI.
VaMB11
Vainsencher, D., Mannor, S., & Bruckstein, A. M.(2011) The Sample Complexity of Dictionary Learning. Journal of Machine Learning Research, 12(Nov), 3259–3281.
Vett99
Vetterli, M. (1999) Wavelets: approximation and compression–a review. In AeroSense’99 (Vol. 3723, pp. 28–31). International Society for Optics and Photonics DOI.
WaZh17
Wang, Y. G., & Zhu, H. (2017) Localized Tight Frames and Fast Framelet Transforms on the Simplex. arXiv:1701.01595 [Cs, Math].
WaZh16
Wang, Y. G., & Zhuang, X. (2016) Tight framelets and fast framelet transforms on manifolds. arXiv:1608.04026 [Math].
WaST14
Wang, Y.-X., Smola, A., & Tibshirani, R. J.(2014) The Falling Factorial Basis and Its Statistical Applications. arXiv:1405.0558 [Stat].
WeVe12
Weidmann, C., & Vetterli, M. (2012) Rate Distortion Behavior of Sparse Sources. IEEE Transactions on Information Theory, 58(8), 4969–4992. DOI.
YNGD13
Yaghoobi, M., Nam, S., Gribonval, R., & Davies, M. E.(2013) Constrained Overcomplete Analysis Operator Learning for Cosparse Signal Modelling. IEEE Transactions on Signal Processing, 61(9), 2341–2355. DOI.
YaHY09
Yang, C., He, Z., & Yu, W. (2009) Comparison of public peak detection algorithms for MALDI mass spectrometry data analysis. BMC Bioinformatics, 10, 4. DOI.
Zhua15
Zhuang, X. (2015) Smooth affine shear tight frames: digitization and applications. In SPIE Optical Engineering+ Applications (Vol. 9597, pp. 959709–959709). International Society for Optics and Photonics
Zhua16
Zhuang, X. (2016) Digital Affine Shear Transforms: Fast Realization and Applications in Image/Video Processing. SIAM Journal on Imaging Sciences, 9(3), 1437–1466. DOI.