Practical tips, tricks and algorithms.
See also artificial neural network, Markov random fields, synestizer and random forests.
Christian Perone, Convolutional hypercolumns in Python:
Many algorithms using features from CNNs (Convolutional Neural Networks) use the last FC (fully-connected) layer features in order to extract information about a given input. However, the information in the last FC layer may be too coarse spatially to allow precise localization (due to sequences of maxpooling, etc.). On the other hand, the first layers may be spatially precise but will lack semantic information. To get the best of both worlds, the authors of the hypercolumn paper define the hypercolumn of a pixel as the vector of activations of all CNN units “above” that pixel.
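The definition above can be sketched in a few lines of NumPy. This is only an illustration, not the code from Perone's post or from the hypercolumn paper: the function names `upsample_nearest` and `hypercolumns` are hypothetical, the feature maps are random arrays standing in for real CNN activations, and nearest-neighbour upsampling is used for brevity where a smoother interpolation (bilinear, say) would be a common alternative.

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Resize a (C, h, w) feature map to (C, out_h, out_w) by nearest neighbour."""
    _, h, w = fmap.shape
    rows = (np.arange(out_h) * h) // out_h   # source row for each output row
    cols = (np.arange(out_w) * w) // out_w   # source column for each output column
    return fmap[:, rows[:, None], cols[None, :]]

def hypercolumns(feature_maps, out_h, out_w):
    """Bring every layer to the same spatial size, then stack along channels."""
    return np.concatenate(
        [upsample_nearest(f, out_h, out_w) for f in feature_maps], axis=0
    )

# Stand-in activations from three layers: finer maps early, coarser maps later.
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((16, 16, 16)),
    rng.standard_normal((32, 8, 8)),
]
hc = hypercolumns(layers, 32, 32)
pixel_vector = hc[:, 10, 20]   # the hypercolumn of the pixel at row 10, column 20
print(hc.shape)                # (56, 32, 32)
```

Each pixel ends up with one long vector (here 8 + 16 + 32 = 56 entries) that mixes spatially precise early features with coarser, more semantic later ones.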
Mehri et al. (2016), SampleRNN:
In this paper we propose a novel model for unconditional audio generation based on generating one audio sample at a time. We show that our model, which profits from combining memory-less modules, namely autoregressive multilayer perceptrons, and stateful recurrent neural networks in a hierarchical structure, is able to capture underlying sources of variations in the temporal sequences over very long time spans, on three datasets of different nature. Human evaluation on the generated samples indicates that our model is preferred over competing models. We also show how each component of the model contributes to the exhibited performance.
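To make "generating one audio sample at a time" concrete, here is a toy autoregressive sampling loop. It is not SampleRNN: there is no hierarchy, no recurrent state and no training. The function `next_sample_probs` is a hypothetical hand-written stand-in for a trained network, and the constants `Q` and `CONTEXT` are assumptions chosen for illustration.

```python
import numpy as np

Q = 16        # number of quantization levels (assumed for illustration)
CONTEXT = 4   # how many past samples the toy predictor sees (assumed)

def next_sample_probs(context):
    """Hypothetical stand-in for a trained model: a categorical distribution
    over the Q levels, peaked near the mean of the recent samples."""
    centre = np.mean(context)
    levels = np.arange(Q)
    logits = -0.5 * (levels - centre) ** 2
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def generate(n_samples, seed=0):
    """Draw samples one at a time, each conditioned on the previous ones."""
    rng = np.random.default_rng(seed)
    samples = [Q // 2] * CONTEXT   # start from a constant mid-level context
    for _ in range(n_samples):
        probs = next_sample_probs(samples[-CONTEXT:])
        samples.append(int(rng.choice(Q, p=probs)))
    return np.array(samples[CONTEXT:])

audio = generate(100)
```

The loop is the whole idea: every new sample is drawn from a distribution that depends on what has already been generated, so a real model only has to replace `next_sample_probs`.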
awesome computer vision is an online list of CV resources far more comprehensive than mine.
scikit-image is a collection of algorithms for image processing. It is available free of charge and free of restriction. We pride ourselves on high-quality, peer-reviewed code, written by an active community of volunteers.
opensift implements some SIFT variants.
Mahotas: Computer Vision in Python is a library of fast computer vision algorithms (all implemented in C++) that operates over numpy arrays for convenience.
ilastik (also Python), the interactive learning and segmentation toolkit:
ilastik is a simple, user-friendly tool for interactive image classification, segmentation and analysis. It is built as a modular software framework, which currently has workflows for automated (supervised) pixel- and object-level classification, automated and semi-automated object tracking, semi-automated segmentation and object counting without detection. Most analysis operations are performed lazily, which enables targeted interactive processing of data subvolumes, followed by complete volume analysis in offline batch mode. Using it requires no experience in image processing.
OpenCV is released under a BSD license and hence it’s free for both academic and commercial use. It has C++, C, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. OpenCV was designed for computational efficiency and with a strong focus on real-time applications. Written in optimized C/C++, the library can take advantage of multi-core processing. Enabled with OpenCL, it can take advantage of the hardware acceleration of the underlying heterogeneous compute platform. Adopted all around the world, OpenCV has a user community of more than 47 thousand people and an estimated number of downloads exceeding 9 million. Usage ranges from interactive art to mine inspection, stitching maps on the web, and advanced robotics.
SimpleCV is an open source framework for building computer vision applications. With it, you get access to several high-powered computer vision libraries such as OpenCV – without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrix versus bitmap storage. This is computer vision made easy.
- Barron, J. L., Fleet, D. J., & Beauchemin, S. S. (1994) Performance of optical flow techniques. International Journal of Computer Vision, 12(1), 43–77.
- Fleet, D. J., & Weiss, Y. (2006) Optical Flow Estimation. In N. Paragios, Y. Chen, & O. Faugeras (Eds.), Handbook of mathematical models in computer vision. New York: Springer.
- Glocker, B., Komodakis, N., Tziritas, G., Navab, N., & Paragios, N. (2008) Dense image registration through MRFs and efficient linear programming. Medical Image Analysis, 12(6), 731–741.
- Glocker, B., Sotiras, A., Komodakis, N., & Paragios, N. (2011) Deformable Medical Image Registration: Setting the State of the Art with Discrete Methods. Annual Review of Biomedical Engineering, 13(1), 219–244.
- Kawamoto, K. (2007) Optical Flow–Driven Motion Model with Automatic Variance Adjustment for Adaptive Tracking. In Y. Yagi, S. B. Kang, I. S. Kweon, & H. Zha (Eds.), Computer Vision – ACCV 2007 (pp. 555–564). Springer Berlin Heidelberg.
- Khan, Z., Balch, T., & Dellaert, F. (2004) An MCMC-Based Particle Filter for Tracking Multiple Interacting Targets. In T. Pajdla & J. Matas (Eds.), Computer Vision - ECCV 2004 (pp. 279–290). Springer Berlin Heidelberg.
- Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., & Bottou, L. (2016) Discovering Causal Signals in Images. arXiv:1605.08179 [Cs, Stat].
- Mehri, S., Kumar, K., Gulrajani, I., Kumar, R., Jain, S., Sotelo, J., … Bengio, Y. (2016) SampleRNN: An Unconditional End-to-End Neural Audio Generation Model. arXiv:1612.07837 [Cs].
- Meinhardt-Llopis, E., Sánchez Pérez, J., & Kondermann, D. (2013) Horn-Schunck Optical Flow with a Multi-Scale Strategy. Image Processing On Line, 3, 151–172.
- Ning, F. (2005) Toward automatic phenotyping of developing embryos from videos. IEEE Trans. Image Process., 14, 1360–1371.
- Noyer, J. C., Lanvin, P., & Benjelloun, M. (2004) Model-based tracking of 3D objects based on a sequential Monte-Carlo method. In Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004 (Vol. 2, pp. 1744–1748).
- Nummiaro, K., Koller-Meier, E., & Van Gool, L. (2003) An adaptive color-based particle filter. Image and Vision Computing, 21(1), 99–110.
- Sánchez Pérez, J., Monzón López, N., & Salgado de la Nuez, A. (2013) Robust Optical Flow Estimation. Image Processing On Line, 3, 252–270.
- Wiatowski, T., & Bölcskei, H. (2015) A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction. arXiv:1512.06293 [Cs, Math, Stat].