See also matrix factorisations, and discuss random projections and their role in motivating compressed sensing etc.

Cover's Theorem (Cove65)

It was shown that, for a random set of linear inequalities in unknowns, the expected number of extreme inequalities, which are necessary and sufficient to imply the entire set, tends to as the number of consistent inequalities tends to infinity, thus bounding the expected necessary storage capacity for linear decision algorithms in separable problems. The results, even those dealing with randomly positioned points, have been combinatorial in nature, and have been essentially independent of the configuration of the set of points in the space.

I am especially interested in random embeddings for kernel approximation.

Over at compressed sensing we mention some other random projection results, such as the Johnson-Lindenstrauss lemma, and these ideas are closely related, in the probabilistic setting, to concentration inequalities.

Landweber etc al (LaLP16) have an example of a converse result about your continuous random embeddings. (TBD)

## Refs

- TASL14: Minh Tang, Avanti Athreya, Daniel L. Sussman, Vince Lyzinski, Carey E. Priebe (2014) A nonparametric two-sample hypothesis testing problem for random dot product graphs.
*ArXiv:1409.2344 [Math, Stat]*. - BDDW08: Richard Baraniuk, Mark Davenport, Ronald DeVore, Michael Wakin (2008) A Simple Proof of the Restricted Isometry Property for Random Matrices.
*Constructive Approximation*, 28(3), 253–263. DOI - Fodo02: Imola Fodor (2002) A Survey of Dimension Reduction Techniques
- DaGu03: Sanjoy Dasgupta, Anupam Gupta (2003) An elementary proof of a theorem of Johnson and Lindenstrauss.
*Random Structures & Algorithms*, 22(1), 60–65. DOI - CaRS08: M. Casey, C. Rhodes, M. Slaney (2008) Analysis of Minimum Distances in High-Dimensional Musical Spaces.
*IEEE Transactions on Audio, Speech, and Language Processing*, 16(5), 1015–1028. DOI - AINR13: Alexandr Andoni, Piotr Indyk, Huy L. Nguyen, Ilya Razenshteyn (2013) Beyond Locality-Sensitive Hashing.
*ArXiv:1306.1547 [Cs]*. - AuVi15: Alex Auvolat, Pascal Vincent (2015) Clustering is Efficient for Approximate Maximum Inner Product Search.
*ArXiv:1507.05910 [Cs, Stat]*. - McBB13: Brian McWilliams, David Balduzzi, Joachim M Buhmann (2013) Correlated random features for fast semi-supervised learning. In Advances in Neural Information Processing Systems 26 (Vol. 1050, pp. 440–448). Curran Associates, Inc.
- Achl03: Dimitris Achlioptas (2003) Database-friendly random projections: Johnson-Lindenstrauss with binary coins.
*Journal of Computer and System Sciences*, 66(4), 671–687. DOI - GiSB16: R. Giryes, G. Sapiro, A. M. Bronstein (2016) Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?
*IEEE Transactions on Signal Processing*, 64(13), 3444–3457. DOI - OAZJ16: Meshia Cédric Oveneke, Mitchel Aliosha-Perez, Yong Zhao, Dongmei Jiang, Hichem Sahli (2016) Efficient Convolutional Auto-Encoding via Random Convexification and Frequency-Domain Minimization. In Advances in Neural Information Processing Systems 29.
- Dasg00: Sanjoy Dasgupta (2000) Experiments with Random Projection. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (pp. 143–151). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
- GeEW06: Pierre Geurts, Damien Ernst, Louis Wehenkel (2006) Extremely randomized trees.
*Machine Learning*, 63(1), 3–42. DOI - MoTJ06: Frank Moosmann, Bill Triggs, Frederic Jurie (2006) Fast Discriminative Visual Codebooks using Randomized Clustering Forests. In Advances in Neural Information Processing Systems (pp. 985–992).
- AlMa14: Ahmed El Alaoui, Michael W. Mahoney (2014) Fast Randomized Kernel Methods With Statistical Guarantees.
*ArXiv:1411.0306 [Cs, Stat]*. - WDLS09: Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, Josh Attenberg (2009) Feature Hashing for Large Scale Multitask Learning. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 1113–1120). New York, NY, USA: ACM DOI
- Cove65: T. M. Cover (1965) Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition.
*IEEE Transactions on Electronic Computers*, EC-14(3), 326–334. DOI - HZOM17: Andrew C. Heusser, Kirsten Ziman, Lucy L. W. Owen, Jeremy R. Manning (2017) HyperTools: A Python toolbox for visualizing and manipulating high-dimensional data.
*ArXiv:1701.08290 [Stat]*. - FDKV07: Yoav Freund, Sanjoy Dasgupta, Mayank Kabra, Nakul Verma (2007) Learning the structure of manifolds using random projections. In Advances in Neural Information Processing Systems (pp. 473–480).
- DIIM04: Mayur Datar, Nicole Immorlica, Piotr Indyk, Vahab S. Mirrokni (2004) Locality-sensitive Hashing Scheme Based on P-stable Distributions. In Proceedings of the Twentieth Annual Symposium on Computational Geometry (pp. 253–262). New York, NY, USA: ACM DOI
- AnIn06: A. Andoni, P. Indyk (2006) Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions. In 47th Annual IEEE Symposium on Foundations of Computer Science, 2006. FOCS ’06 (Vol. 51, pp. 459–468). DOI
- CaTa06: Emmanuel J. Candès, Terence Tao (2006) Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
*IEEE Transactions on Information Theory*, 52(12), 5406–5425. DOI - HaLi93: Peter Hall, Ker-Chau Li (1993) On almost Linearity of Low Dimensional Projections from High Dimensional Data.
*The Annals of Statistics*, 21(2), 867–889. - LaLP16: Peter S. Landweber, Emanuel A. Lazar, Neel Patel (2016) On Fiber Diameters of Continuous Maps.
*American Mathematical Monthly*, 123(4), 392–397. DOI - Bach15: Francis Bach (2015) On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions.
*ArXiv Preprint ArXiv:1502.06800*. - AnRa15: Alexandr Andoni, Ilya Razenshteyn (2015) Optimal Data-Dependent Hashing for Approximate Near Neighbors.
*ArXiv:1501.01062 [Cs]*. - KWSR16: Alec Koppel, Garrett Warnell, Ethan Stump, Alejandro Ribeiro (2016) Parsimonious Online Learning with Kernels via Sparse Projections in Function Space.
*ArXiv:1612.04111 [Cs, Stat]*. - BrBH16: Romain Brault, Florence d’Alché-Buc, Markus Heinonen (2016) Random Fourier features for operator-valued kernels. In Proceedings of The 8th Asian Conference on Machine Learning (pp. 110–125).
- BiMa01: Ella Bingham, Heikki Mannila (2001) Random Projection in Dimensionality Reduction: Applications to Image and Text Data. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 245–250). New York, NY, USA: ACM DOI
- ScWa17: Simone Scardapane, Dianhui Wang (2017) Randomness in neural networks: an overview.
*Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery*, 7(2). DOI - KMKB16: Gabriel Krummenacher, Brian McWilliams, Yannic Kilcher, Joachim M Buhmann, Nicolai Meinshausen (2016) Scalable Adaptive Stochastic Optimization Using Random Projections. In Advances in Neural Information Processing Systems 29 (pp. 1750–1758). Curran Associates, Inc.
- DeBo15: Amir Dezfouli, Edwin V. Bonilla (2015) Scalable Inference for Gaussian Process Models with Black-box Likelihoods. In Advances in Neural Information Processing Systems 28 (pp. 1414–1422). Cambridge, MA, USA: MIT Press
- ZWCL10: Dell Zhang, Jun Wang, Deng Cai, Jinsong Lu (2010) Self-taught Hashing for Fast Similarity Search. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 18–25). New York, NY, USA: ACM DOI
- GiIM99: Aristides Gionis, Piotr Indyky, Rajeev Motwaniz (1999) Similarity Search in High Dimensions via Hashing.
- KaNe14: Daniel M. Kane, Jelani Nelson (2014) Sparser Johnson-Lindenstrauss Transforms.
*Journal of the ACM*, 61(1), 1–23. DOI - DuBa13: Marco F. Duarte, Richard G. Baraniuk (2013) Spectral compressive sensing.
*Applied and Computational Harmonic Analysis*, 35(1), 111–129. DOI - EYWR16: Armin Eftekhari, Han Lun Yap, Michael B. Wakin, Christopher J. Rozell (2016) Stabilizing Embedology: Geometry-Preserving Delay-Coordinate Maps.
*ArXiv:1609.06347 [Nlin, Stat]*. - GoTR16: Alexander N. Gorban, Ivan Yu Tyukin, Ilya Romanenko (2016) The Blessing of Dimensionality: Separation Theorems in the Thermodynamic Limit.
*ArXiv:1610.00494 [Cs, Stat]*. - AiCh09: Nir Ailon, Bernard Chazelle (2009) The Fast Johnson–Lindenstrauss Transform and Approximate Nearest Neighbors.
*SIAM Journal on Computing*, 39(1), 302–322. DOI - ChRW17: Krzysztof Choromanski, Mark Rowland, Adrian Weller (2017) The Unreasonable Effectiveness of Random Orthogonal Embeddings.
*ArXiv:1703.00864 [Stat]*. - OyTr15: Samet Oymak, Joel A. Tropp (2015) Universality laws for randomized dimension reduction, with applications.
*ArXiv:1511.09433 [Cs, Math, Stat]*. - LiHC06: Ping Li, Trevor J. Hastie, Kenneth W. Church (2006) Very Sparse Random Projections. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 287–296). New York, NY, USA: ACM DOI