# Correlograms

### Also covariances

This material is revised and expanded from the appendix of draft versions of a recent conference submission, for my own reference. I used correlograms a lot in that work, and I was startled that, although they are simple and, to my mind, uncontroversial, it is hard to find a decent summary of their properties anywhere.

Credit to Ning Ma for the following image:


Consider an $$L_2$$ signal $$f: \bb{R}\to\bb{R}.$$ We will frequently overload notation and refer to a signal by its value at a free argument $$t$$, so that $$f(rt-\xi),$$ for example, refers to the signal $$t\mapsto f(rt-\xi).$$ We write the inner product between signals $$t\mapsto f(t)$$ and $$t\mapsto f'(t)$$ as $$\inner{f(t)}{f'(t)}$$; here $$f'$$ denotes a second signal, not a derivative. Where it is not clear which argument is free, we annotate it explicitly, e.g. $$\finner{f(t)}{f'(t)}{t}$$ for free argument $$t$$.

The autocorrelation transform or correlogram $$\mathcal{A}:L_2(\bb{R}) \to L_2(\bb{R})$$ maps signals to signals. Specifically, $$\mathcal{A}\{f\}$$ is a signal $$\bb{R}\to\bb{R}$$ such that $$\mathcal{A}\{f\}:=\xi \mapsto \finner{ f(t) }{ f(t-\xi) }{t}.$$ This is the covariance between $$f(t)$$ and $$f(t-\xi).$$ Note that we here discuss the covariance between given deterministic signals, not between two stochastic sources; inferring the covariance of stochastic processes is a different problem. Note also that this is what I would call an autocovariance rather than an autocorrelation, since it is not normalized. We seem to be stuck with this unfortunate terminology for historical reasons.
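For sampled signals, the definition can be approximated by a Riemann sum. The following is a minimal sketch, assuming a uniformly sampled signal that is negligible outside the sampling window; the helper name `correlogram`, the Gaussian test pulse and the grid spacing are illustrative choices, not part of the original derivation.

```python
import numpy as np

def correlogram(samples, dt):
    """Riemann-sum approximation of A{f}(xi) = <f(t), f(t - xi)>.

    Returns (lags, values), where lags = k * dt for k = -(n-1), ..., n-1.
    The signal is treated as zero outside the sampled window.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # np.correlate(a, v, "full")[k + n - 1] = sum_j a[j + k] * v[j]
    values = np.correlate(samples, samples, mode="full") * dt
    lags = np.arange(-(n - 1), n) * dt
    return lags, values

# Hypothetical test signal: a Gaussian pulse, effectively in L_2 on this window.
dt = 0.01
t = np.arange(-10.0, 10.0, dt)
f = np.exp(-t**2)

lags, acf = correlogram(f, dt)

# At zero lag the correlogram is the signal energy ||f||^2.
zero_lag = acf[lags.size // 2]
energy = np.sum(f**2) * dt

# For this pulse the integral has the closed form sqrt(pi/2) * exp(-xi^2 / 2).
closed_form = np.sqrt(np.pi / 2) * np.exp(-lags**2 / 2)
print("max abs deviation from closed form:", np.max(np.abs(acf - closed_form)))
```

The output is symmetric in the lag, as expected for the correlogram of a real signal, so the ordering convention of `np.correlate` does not matter here.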

We derive the properties of this transform.

Multiplication by a constant. Consider a constant $$c\in \bb{R}.$$ \begin{aligned}\mathcal{A}\{cf\}(\xi)&= \inner{ cf(t) }{ cf(t-\xi) }\\ &= c^2\finner{ f(t) }{ f(t-\xi) }{t}\\ &= c^2\mathcal{A}\{f\}(\xi).\\ \end{aligned}

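Time scaling: Consider a rate $$r>0.$$ Delaying the signal $$t\mapsto f(rt)$$ by $$\xi$$ gives $$t\mapsto f(r(t-\xi))=f(rt-r\xi),$$ so, substituting $$u=rt,$$ \begin{aligned}\mathcal{A}\{f(r t)\}(\xi) &=\finner{ f(r t) }{ f(r t-r\xi) }{t}\\ &= \int f(r t)f(r t-r\xi)\dd t\\ &= \frac{1}{r }\int f(u)f(u-r\xi)\dd u\\ &= \frac{1}{r} \mathcal{A}\{f\}\left(r\xi\right).\\ \end{aligned}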
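As a numerical sanity check of time scaling, the sketch below compares the correlogram of $$t\mapsto f(rt)$$ with $$\frac{1}{r}\mathcal{A}\{f\}(r\xi)$$ on a grid. The test signal, the rate `r = 2` and the helper `correlogram` are assumptions made for illustration; linear interpolation is used to evaluate the correlogram of $$f$$ at the rescaled lags.

```python
import numpy as np

def correlogram(samples, dt):
    """Riemann-sum approximation of xi -> <f(t), f(t - xi)> on a uniform grid."""
    samples = np.asarray(samples, dtype=float)
    values = np.correlate(samples, samples, mode="full") * dt
    lags = np.arange(-(samples.size - 1), samples.size) * dt
    return lags, values

def f(t):
    # Hypothetical test signal: a Gaussian-windowed cosine.
    return np.exp(-t**2) * np.cos(3.0 * t)

dt = 0.01
t = np.arange(-10.0, 10.0, dt)
r = 2.0  # time-scaling rate, assumed positive

lags, acf_f = correlogram(f(t), dt)
_, acf_scaled = correlogram(f(r * t), dt)

# Prediction: (1/r) * A{f}(r * xi), with A{f} taken as zero beyond the computed lags.
predicted = np.interp(r * lags, lags, acf_f, left=0.0, right=0.0) / r

max_error = np.max(np.abs(acf_scaled - predicted))
print("max abs difference:", max_error)
```

Compressing the signal in time (here `r > 1`) compresses its correlogram by the same factor and shrinks its amplitude by $$1/r$$.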

Addition: \begin{aligned}\mathcal{A}\{f+f'\}(\xi) &=\finner{ f(t)+f'(t) }{ f(t-\xi)+f'(t-\xi) }{t}\\ &=\finner{ f(t) }{ f(t-\xi) }{t} +\finner{ f(t) }{ f'(t-\xi) }{t} +\finner{ f'(t) }{ f(t-\xi) }{t} +\finner{ f'(t) }{ f'(t-\xi) }{t}\\ &= \mathcal{A}\{f\}(\xi)+ \finner{ f'(t) }{ f(t-\xi)}{t} +\finner{f(t)}{f'(t-\xi) }{t} +\mathcal{A}\{f'\}(\xi)\\ &= \mathcal{A}\{f\}(\xi)+ \finner{ f'(t) }{ f(t-\xi)}{t} +\finner{f(t+\xi)}{f'(t) }{t} +\mathcal{A}\{f'\}(\xi)\\ &= \mathcal{A}\{f\}(\xi)+ \finner{ f'(t) }{ f(t-\xi)}{t} +\finner{f'(t) }{f(t+\xi)}{t} +\mathcal{A}\{f'\}(\xi).\\ \end{aligned}
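The addition identity holds exactly for the discretized inner product too, because it only uses bilinearity and a shift of the free argument. A minimal sketch, assuming two arbitrary sampled signals (random here, with `g` standing in for $$f'$$) and a hypothetical helper `xcorr`:

```python
import numpy as np

def xcorr(a, b, dt):
    """Riemann-sum approximation of xi -> <a(t), b(t - xi)>.

    Lags run from -(n-1)*dt to (n-1)*dt, so reversing the output
    evaluates the same function at -xi.
    """
    return np.correlate(a, b, mode="full") * dt

rng = np.random.default_rng(0)
dt = 0.01
n = 512
f = rng.standard_normal(n)
g = rng.standard_normal(n)  # plays the role of f'

lhs = xcorr(f + g, f + g, dt)   # A{f + f'}

cross = xcorr(g, f, dt)         # xi -> <f'(t), f(t - xi)>
cross_flipped = cross[::-1]     # xi -> <f'(t), f(t + xi)>
rhs = xcorr(f, f, dt) + cross + cross_flipped + xcorr(g, g, dt)

print("identity holds:", np.allclose(lhs, rhs))
print("largest cross-term magnitude:", np.max(np.abs(cross + cross_flipped)))
```

For generic signals the cross terms are not small, which is why the correlogram is not additive without further assumptions.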

We can say little about the cross term $$\finner{ f'(t) }{ f(t-\xi)}{t}+\finner{f'(t) }{f(t+\xi)}{t}$$ without more information about the signals in question. However, we can say more about a randomized version. Suppose $$S_i, \, i\in\bb{N}$$ are i.i.d. Rademacher variables, i.e. each takes a value in $$\{+1,-1\}$$ with equal probability. Then, we can introduce the following property:

Randomized addition: Since $$S_i^2=1$$ and, by independence, $$\bb{E}[S_1S_2]=\bb{E}[S_1]\bb{E}[S_2]=0,$$ \begin{aligned} \bb{E}[ \mathcal{A}\{S_1f + S_2f'\}(\xi)] &=\bb{E}[ \mathcal{A}\{S_1f\}(\xi) + \finner{ S_2 f'(t) }{ S_1 f(t-\xi)}{t} +\finner{S_2f'(t) }{S_1 f(t+\xi)}{t} +\mathcal{A}\{S_2f'\}(\xi)]\\ &=\bb{E}[ \mathcal{A}\{S_1f\}(\xi)] + \bb{E}[\finner{ S_2 f'(t) }{ S_1 f(t-\xi)}{t}] + \bb{E}[\finner{S_2f'(t) }{S_1 f(t+\xi)}{t}] +\bb{E}[ \mathcal{A}\{S_2f'\}(\xi)]\\ &=\bb{E}[S_1^2]\mathcal{A}\{f\}(\xi)+ \bb{E}[ S_1S_2]\finner{ f'(t) }{ f(t-\xi) }{t} + \bb{E}[ S_1S_2]\finner{ f'(t) }{ f(t+\xi) }{t}+\bb{E}[S_2^2]\mathcal{A}\{f'\}(\xi)\\ &=\mathcal{A}\{f\}(\xi)+ \mathcal{A}\{f'\}(\xi).\\ \end{aligned}
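Because the pair $$(S_1, S_2)$$ takes only four equally likely values, the expectation can be computed exactly by enumeration rather than by sampling. A minimal sketch, again with arbitrary random test signals (`g` standing in for $$f'$$) and a hypothetical helper `correlogram`:

```python
import itertools
import numpy as np

def correlogram(samples, dt):
    """Riemann-sum approximation of xi -> <f(t), f(t - xi)> on a uniform grid."""
    samples = np.asarray(samples, dtype=float)
    return np.correlate(samples, samples, mode="full") * dt

rng = np.random.default_rng(1)
dt = 0.01
n = 512
f = rng.standard_normal(n)
g = rng.standard_normal(n)  # plays the role of f'

# Each of the four sign pairs (s1, s2) has probability 1/4,
# so the expectation is the plain average over them.
signs = (1.0, -1.0)
expected = np.mean(
    [correlogram(s1 * f + s2 * g, dt) for s1, s2 in itertools.product(signs, repeat=2)],
    axis=0,
)

target = correlogram(f, dt) + correlogram(g, dt)
print("expectation matches A{f} + A{f'}:", np.allclose(expected, target))
```

Any single draw of the signs still contains the cross terms; only the average over sign pairs removes them.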

## Refs

• Lick51: J. C. R. Licklider (1951) A duplex theory of pitch perception. Experientia, 7(4), 128–134.
• SlLy90: M. Slaney, R. F. Lyon (1990) A perceptual pitch detector. In Proceedings of ICASSP (pp. 357–360 vol.1).
• BrPu89: Judith C. Brown, Miller S. Puckette (1989) Calculation of a “narrowed” autocorrelation function. The Journal of the Acoustical Society of America, 85(4), 1595–1601.
• Kaso18: Artan Kaso (2018) Computation of the normalized cross-correlation by fast Fourier transform. PLOS ONE, 13(9), e0203434.
• MGBC07: Ning Ma, Phil Green, Jon Barker, André Coy (2007) Exploiting correlogram structure for robust speech recognition with multiple speech sources. Speech Communication, 49(12), 874–891.
• Lewi00: J. P. Lewis (n.d.) Fast Template Matching.
• MPSG11: J. A. Morales-Cordovilla, A. M. Peinado, V. Sanchez, J. A. Gonzalez (2011) Feature Extraction Based on Pitch-Synchronous Averaging for Robust Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing, 19(3), 640–651.
• Wien30: Norbert Wiener (1930) Generalized harmonic analysis. Acta Mathematica, 55, 117–258.
• Khin34: A. Khintchine (1934) Korrelationstheorie der stationären stochastischen Prozesse. Mathematische Annalen, 109(1), 604–615.
• CaDe96: P. A. Cariani, B. Delgutte (1996) Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. Journal of Neurophysiology, 76(3), 1698–1716.
• Sond68: M. Sondhi (1968) New methods of pitch extraction. IEEE Transactions on Audio and Electroacoustics, 16(2), 262–266.
• TaAl11: L. N. Tan, A. Alwan (2011) Noise-robust F0 estimation using SNR-weighted summary correlograms from multi-band comb filters. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4464–4467).
• Rabi77: L. Rabiner (1977) On the use of autocorrelation analysis for pitch detection. IEEE Transactions on Acoustics, Speech, and Signal Processing, 25(1), 24–33.
• Lang92: Gerald Langner (1992) Periodicity coding in the auditory system. Hearing Research, 60(2), 115–142.
• ChKa02: Alain de Cheveigné, Hideki Kawahara (2002) YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4), 1917–1930.