Multivariate normal distribution is normalized (of course): A proof



Let \(\mathbf{X} = (X_1, X_2, \cdots, X_n)^\top\) be a vector of random variables. We say it follows the multivariate normal (Gaussian) distribution if its density is given by

\[f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^\top\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})\right)\tag{Eq:density}\]

where \(\boldsymbol{\mu} = (\mu_1, \mu_2, \cdots, \mu_n)^\top \in \mathbb{R}^n\) is a constant vector, \(\Sigma\) is a symmetric positive definite \(n\times n\) matrix, and \(\Sigma^{-1}\) and \(|\Sigma|\) are the inverse and determinant of \(\Sigma\), respectively. It turns out that \(\boldsymbol{\mu}\) and \(\Sigma\) are the mean vector and covariance matrix of \(\mathbf{X}\), respectively, but we will not prove that here. In this post, we will show that the density (Eq:density) is normalized (of course). That is, we prove that

\[\int_{\mathbb{R}^n}f(\mathbf{x})d\mathbf{x} = 1.\]

We assume that you already know how to prove the univariate normal distribution is normalized:

\[\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi \sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)dx = 1.\]

Let's start!

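First, change the variables to \(\mathbf{y} = \mathbf{x} - \boldsymbol{\mu}\). This is just a translation, so \(d\mathbf{x} = d\mathbf{y}\). After moving the normalizing constant to the right-hand side, what we need to prove is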

\[\int_{\mathbb{R}^n}\exp\left(-\frac{1}{2}\mathbf{y}^\top\Sigma^{-1}\mathbf{y}\right)d\mathbf{y} = {\sqrt{(2\pi)^n|\Sigma|}}. \tag{Eq:Goal}\]

What's annoying about this integral is that the exponent contains cross terms between the components of \(\mathbf{y}\), such as \(a_{ij}y_iy_j\) with \(i \neq j\), where \(a_{ij} = (\Sigma^{-1})_{ij}\). If there were no cross terms, the only terms would be of the form \(a_{ii}y_i^2\), and we could split the multivariate integral into a product of univariate integrals. We can eliminate the cross terms by diagonalizing \(\Sigma^{-1}\).

Since \(\Sigma\) is real symmetric, we can diagonalize it by some orthogonal matrix \(U\):

\[\Sigma = USU^\top\]

where \(S = \mathrm{diag}(s_1, s_2, \cdots, s_n)\) is the diagonal matrix of the eigenvalues of \(\Sigma\). Since \(\Sigma\) is positive definite, all the eigenvalues are positive. Accordingly (exercise!), the inverse matrix of \(\Sigma\) is diagonalized as

\[\Sigma^{-1} = US^{-1}U^\top\]

where \(S^{-1} = \mathrm{diag}(1/s_1, 1/s_2, \cdots, 1/s_n)\) is the inverse matrix of \(S\). Let's substitute this into the exponent of the density. We have

\[\begin{eqnarray}\mathbf{y}^\top\Sigma^{-1}\mathbf{y} &=& \mathbf{y}^{\top}US^{-1}U^{\top}\mathbf{y}\\ &=& (U^{\top}\mathbf{y})^{\top}S^{-1}(U^{\top}\mathbf{y})\\ &=& \mathbf{z}^{\top}S^{-1}\mathbf{z} \end{eqnarray}\]

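where we set \(\mathbf{z} = U^{\top}\mathbf{y}\) (equivalently, \(\mathbf{y} = U\mathbf{z}\)). Since \(U\) is orthogonal, \(|U| = \det U = \pm 1\), and the map \(\mathbf{z} \mapsto U\mathbf{z}\) sends \(\mathbb{R}^n\) onto \(\mathbb{R}^n\). Therefore, \(d\mathbf{y} = |\det U|d\mathbf{z} = d\mathbf{z}\), so that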

\[\int_{\mathbb{R}^n}\exp\left(-\frac{1}{2}\mathbf{y}^\top\Sigma^{-1}\mathbf{y}\right)d\mathbf{y} = \int_{\mathbb{R}^n}\exp\left(-\frac{1}{2}\mathbf{z}^{\top}S^{-1}\mathbf{z}\right)d\mathbf{z}.\]
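If you'd like to see the diagonalization step with actual numbers, here is a minimal NumPy sketch. The matrix `sigma` and the test vector `y` are made up for illustration (they are not part of the proof), and `numpy.linalg.eigh` is just one convenient way to obtain an orthogonal \(U\) and the eigenvalues \(s_i\) of a real symmetric matrix.

```python
import numpy as np

# A made-up symmetric positive definite matrix (an assumption for this demo).
sigma = np.array([[2.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])

# eigh returns the eigenvalues s and an orthogonal matrix u whose columns
# are eigenvectors, so that sigma = u @ diag(s) @ u.T.
s, u = np.linalg.eigh(sigma)

# Rebuild Sigma and Sigma^{-1} from the decomposition.
sigma_rebuilt = u @ np.diag(s) @ u.T
precision_rebuilt = u @ np.diag(1.0 / s) @ u.T

# Compare the quadratic form in y with the diagonal form in z = U^T y.
rng = np.random.default_rng(0)
y = rng.normal(size=3)
z = u.T @ y
lhs = y @ np.linalg.inv(sigma) @ y   # y^T Sigma^{-1} y
rhs = np.sum(z**2 / s)               # z^T S^{-1} z

print("eigenvalues all positive:", bool(np.all(s > 0)))
print("Sigma rebuilt:", np.allclose(sigma_rebuilt, sigma))
print("Sigma^{-1} rebuilt:", np.allclose(precision_rebuilt, np.linalg.inv(sigma)))
print("quadratic forms agree:", np.isclose(lhs, rhs))
print("|det U| =", abs(np.linalg.det(u)))
```

All the checks should agree up to floating-point rounding, and \(|\det U|\) should come out as 1, which is why the change of variables does not rescale the volume element.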

Since \(S^{-1}\) is a diagonal matrix, the exponent becomes 

\[\mathbf{z}^{\top}S^{-1}\mathbf{z} = z_1^2/s_1 + z_2^2/s_2 + \cdots + z_n^2/s_n.\]

Thus,

\[\begin{eqnarray}\int_{\mathbb{R}^n}\exp\left(-\frac{1}{2}\mathbf{z}^{\top}S^{-1}\mathbf{z}\right)d\mathbf{z} &=& \int_{\mathbb{R}^n}\prod_{i=1}^{n}\exp\left(-\frac{z_i^2}{2s_i}\right)d\mathbf{z}\\ &=& \prod_{i=1}^{n}\int_{-\infty}^{\infty}\exp\left(-\frac{z_i^2}{2s_i}\right)dz_i\\ &=& \prod_{i=1}^{n}\sqrt{2\pi s_i}\\ &=& \sqrt{(2\pi)^{n}s_1s_2\cdots s_n}\\&=&\sqrt{(2\pi)^n|\Sigma|}\end{eqnarray}\]

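where the third equality uses the univariate result with \(\mu = 0\) and \(\sigma^2 = s_i\), and the last equality holds because \(|U||U^{\top}| = (\det U)^2 = 1\), so that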

\[|\Sigma| = |USU^{\top}| = |U||S||U^{\top}| = |S| = s_1s_2\cdots s_n.\]

Thus, (Eq:Goal) holds, and therefore the density (Eq:density) integrates to 1.
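As a sanity check (not a substitute for the proof), we can approximate the left-hand side of (Eq:Goal) numerically for \(n = 2\) and compare it with \(\sqrt{(2\pi)^n|\Sigma|}\). The sketch below uses a plain Riemann sum on a grid; the matrix `sigma`, the grid half-width, and the number of grid points are arbitrary choices made for this illustration.

```python
import numpy as np

def gaussian_integral_grid(sigma, half_width=12.0, num=801):
    """Riemann-sum approximation of the integral of exp(-y^T Sigma^{-1} y / 2) over R^2.

    The integrand is negligible outside [-half_width, half_width]^2 for the
    matrix used below, so truncating the domain is harmless here.
    """
    precision = np.linalg.inv(sigma)
    grid = np.linspace(-half_width, half_width, num)
    step = grid[1] - grid[0]
    y1, y2 = np.meshgrid(grid, grid, indexing="ij")
    points = np.stack([y1, y2], axis=-1)  # shape (num, num, 2)
    # Quadratic form y^T Sigma^{-1} y evaluated at every grid point.
    quad = np.einsum("...i,ij,...j->...", points, precision, points)
    return np.exp(-0.5 * quad).sum() * step**2

def closed_form(sigma):
    """Right-hand side of (Eq:Goal): sqrt((2 pi)^n |Sigma|)."""
    n = sigma.shape[0]
    return np.sqrt((2.0 * np.pi) ** n * np.linalg.det(sigma))

# A made-up 2x2 symmetric positive definite matrix (an assumption for this demo).
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

numeric = gaussian_integral_grid(sigma)
exact = closed_form(sigma)
# Product of the univariate integrals sqrt(2 pi s_i) over the eigenvalues s_i.
product_form = np.prod(np.sqrt(2.0 * np.pi * np.linalg.eigvalsh(sigma)))

print("grid approximation:     ", numeric)
print("sqrt((2 pi)^n |Sigma|): ", exact)
print("prod sqrt(2 pi s_i):    ", product_form)
```

The three printed numbers should agree closely. The grid approach is only practical in low dimensions, since the number of grid points grows exponentially with \(n\).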
