Properties of matrix determinants

We study the properties of matrix determinants and see how these properties can be exploited to (sometimes greatly) simplify the computation of determinants.



Before studying the properties of determinants, we introduce the important notion of linearity.

Definition (Linear function)

Let \(V\) and \(W\) be vector spaces, both over the field \(K\). The function \(f: V \to W\) is said to be linear if the following hold.
  1. For all \(\mathbf{u}, \mathbf{v} \in V\), \(f(\mathbf{u} + \mathbf{v}) = f(\mathbf{u}) + f(\mathbf{v})\), 
  2. For all \(\mathbf{u}\in V\) and \(\lambda \in K\), \(f(\lambda\mathbf{u}) = \lambda f(\mathbf{u})\).
Remark. These two conditions can be summarized as
\[f(\lambda\mathbf{u} + \mu\mathbf{v}) = \lambda f(\mathbf{u}) + \mu f(\mathbf{v})\] for all \(\mathbf{u}, \mathbf{v}\in V\) and \(\lambda, \mu\in K\). □
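
For example, multiplication by a fixed matrix is a linear function, whereas a translation \(\mathbf{u} \mapsto \mathbf{u} + \mathbf{b}\) with \(\mathbf{b} \neq \mathbf{0}\) is not. Below is a minimal numerical sketch of these two facts (the particular matrix, vectors, and scalars are arbitrary choices made for illustration):

```python
import numpy as np

# f(u) = M u, multiplication by a fixed matrix, is linear (a map R^2 -> R^3).
M = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
f = lambda u: M @ u

u, v = np.array([1.0, -2.0]), np.array([0.5, 3.0])
lam, mu = 2.0, -1.5
print(np.allclose(f(lam * u + mu * v), lam * f(u) + mu * f(v)))  # True

# g(u) = u + b, a translation by a nonzero b, is NOT linear (e.g., g(0) != 0).
b = np.array([1.0, 1.0])
g = lambda u: u + b
print(np.allclose(g(u + v), g(u) + g(v)))  # False
```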

Matrix determinants are linear functions of the row (or column) vectors; more precisely, they are multilinear: linear in each row separately, with the other rows held fixed.

Lemma (Determinants are linear)

Let \[A = \begin{pmatrix} \mathbf{a}_1\\ \mathbf{a}_2\\ \vdots\\ \mathbf{a}_n \end{pmatrix}\]
be an \(n\times n\) matrix, where each \(\mathbf{a}_i\), \(i = 1, \cdots, n\), is an \(n\)-dimensional row vector.
  1. If \(\mathbf{a}_i = \mathbf{b}_i + \mathbf{c}_i\) for a particular \(i\), we have \[\begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{b}_i + \mathbf{c}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix} = \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{b}_i \\ \vdots\\ \mathbf{a}_n \end{vmatrix} + \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{c}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix}. \]
  2. If \(\lambda \in \mathbb{R}\), then \[\begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \lambda\mathbf{a}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix} = \lambda \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{a}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix}.\]
Proof
  1. By the definition of the determinant (cofactor expansion along the \(i\)-th row), noting that \(\mathbf{b}_i + \mathbf{c}_i = (b_{i1} + c_{i1}, \cdots, b_{ij} + c_{ij}, \cdots, b_{in} + c_{in})\) and that the minors \(A_{ij}\) do not involve the \(i\)-th row (so they are common to all three matrices), \[\begin{eqnarray*} \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{b}_i + \mathbf{c}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix} &=& \sum_{j=1}^{n}(-1)^{i+j}(b_{ij} + c_{ij})|A_{ij}|\\ &=& \sum_{j=1}^{n}(-1)^{i+j}b_{ij}|A_{ij}| + \sum_{j=1}^{n}(-1)^{i+j}c_{ij}|A_{ij}|\\ &=& \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{b}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix} + \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{c}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix}. \end{eqnarray*} \]
  2. (exercise).
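
As a quick numerical sanity check of the lemma (not a proof; the matrix, the row index, and the scalar below are arbitrary), one can compare determinants computed with `numpy`:

```python
import numpy as np

rng = np.random.default_rng(0)
n, i = 4, 2                       # matrix size and the row to modify (0-based)
A = rng.standard_normal((n, n))
b, c = rng.standard_normal(n), rng.standard_normal(n)
lam = 3.7

# Part 1: the determinant is additive in the i-th row.
Ab, Ac, Abc = A.copy(), A.copy(), A.copy()
Ab[i], Ac[i], Abc[i] = b, c, b + c
print(np.isclose(np.linalg.det(Abc), np.linalg.det(Ab) + np.linalg.det(Ac)))  # True

# Part 2: scaling the i-th row by lam scales the determinant by lam.
As = A.copy()
As[i] = lam * A[i]
print(np.isclose(np.linalg.det(As), lam * np.linalg.det(A)))  # True
```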

Lemma (Anti-symmetry of determinants)

  1. If two rows are swapped, the sign of a determinant changes. \[ \begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{a}_i\\ \vdots\\ \mathbf{a}_{i'}\\ \vdots\\ \mathbf{a}_n \end{vmatrix} = -\begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{a}_{i'}\\ \vdots\\ \mathbf{a}_i\\ \vdots\\ \mathbf{a}_n \end{vmatrix}.\]
  2. If two rows are the same, the determinant is 0. If \(\mathbf{a}_i = \mathbf{a}_j = \mathbf{b}\), \[\begin{vmatrix} \mathbf{a}_1\\ \vdots\\ \mathbf{b}\\ \vdots\\ \mathbf{b}\\ \vdots\\ \mathbf{a}_n \end{vmatrix} = 0.\]
Proof
1. We prove this by mathematical induction. If \(n = 1\), there is nothing to prove (no rows to be swapped). If \(n = 2\), \[\begin{vmatrix} a & b\\ c & d \end{vmatrix} = ad - bc\] and \[\begin{vmatrix} c & d\\ a & b \end{vmatrix} = bc - ad,\] so \[\begin{vmatrix} a & b\\ c & d \end{vmatrix} = -\begin{vmatrix} c & d\\ a & b \end{vmatrix}.\]
Suppose the claim holds for \(n = m\) (where \(m \geq 2\)). Now consider an \((m+1)\times(m+1)\) matrix \(A\), and let \(B\) be the matrix that is the same as \(A\) except that its \(k\)-th and \(l\)-th rows are swapped. Expanding along the \(i\)-th row, \[|A| = \sum_{j=1}^{m+1}(-1)^{i+j}a_{ij}|A_{ij}|\]
where each \(A_{ij}\) is the \(m\times m\) matrix obtained by removing the \(i\)-th row and \(j\)-th column of \(A\). Since \(m + 1 \geq 3\), we may choose \(i \neq k, l\). If we swap the \(k\)-th and \(l\)-th rows of \(A\), then in each minor \(A_{ij}\) the corresponding two rows are swapped; denote the resulting matrix by \(B_{ij}\). By the inductive hypothesis, we have
\[|A_{ij}| = -|B_{ij}|\]
so that, expanding \(|B|\) along its \(i\)-th row (which coincides with the \(i\)-th row of \(A\) because \(i \neq k, l\)),
\[\begin{eqnarray} |B| &=& \sum_{j=1}^{m+1}(-1)^{i+j}a_{ij}|B_{ij}|\\ &=& -\sum_{j=1}^{m+1}(-1)^{i+j}a_{ij}|A_{ij}|\\ &=& -|A| \end{eqnarray}\]
which completes the proof.
2. Suppose that two rows are identical, say \(\mathbf{a}_k = \mathbf{a}_l\) with \(k \neq l\). Swapping these two rows leaves the matrix, and hence the determinant, unchanged. On the other hand, by Part 1, swapping two rows changes the sign of the determinant. The determinant is therefore equal to its own negative, so it must be 0. ■
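
A similar numerical illustration of both parts (random matrix, arbitrary row indices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# Part 1: swapping two rows flips the sign of the determinant.
B = A.copy()
B[[1, 3]] = B[[3, 1]]                                      # swap rows 1 and 3
print(np.isclose(np.linalg.det(B), -np.linalg.det(A)))     # True

# Part 2: a repeated row forces the determinant to be 0.
C = A.copy()
C[3] = C[1]                                                # make two rows identical
print(np.isclose(np.linalg.det(C), 0.0))                   # True
```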

Let us summarize the properties of determinants.
  1. \(\det\) is a function \(\mathbb{R}^{n^2} \to \mathbb{R}\).
  2. \(\det\) can be also considered as a function \[\mathbb{R}^n\times\mathbb{R}^n\times\cdots \times \mathbb{R}^n \to \mathbb{R}\] where each \(\mathbb{R}^n\) corresponds to a row (or column) vector of the matrix.
  3. \(\det\) is a multilinear function. This means that it is linear in each row (or column) as we saw in the above lemma. 
  4. Swapping the values of two arguments (i.e., rows) changes the sign of the value of the function.
  5. If two arguments (i.e., rows) take the same value, then the value of the function is 0.
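These properties are what make determinants cheap to compute in practice. In particular, adding a multiple of one row to another row leaves the determinant unchanged (by multilinearity together with the repeated-row property), and the determinant of a triangular matrix is the product of its diagonal entries (by repeated cofactor expansion). The sketch below uses only these facts to compute a determinant by row reduction; it is a rough illustration (the function name is ours, and no attention is paid to numerical robustness beyond simple pivoting):

```python
import numpy as np

def det_by_row_reduction(A):
    """Determinant via row reduction, using only these properties:
    - adding a multiple of one row to another does not change the determinant,
    - swapping two rows flips the sign,
    - the determinant of a triangular matrix is the product of its diagonal."""
    A = A.astype(float)            # work on a copy
    n = len(A)
    sign = 1.0
    for k in range(n):
        # Choose the row with the largest pivot in column k and swap it up.
        p = k + int(np.argmax(np.abs(A[k:, k])))
        if np.isclose(A[p, k], 0.0):
            return 0.0             # no nonzero pivot: determinant is 0
        if p != k:
            A[[k, p]] = A[[p, k]]
            sign = -sign
        # Eliminate the entries below the pivot (determinant unchanged).
        for r in range(k + 1, n):
            A[r] -= (A[r, k] / A[k, k]) * A[k]
    return sign * float(np.prod(np.diag(A)))

X = np.random.default_rng(2).standard_normal((5, 5))
print(np.isclose(det_by_row_reduction(X), np.linalg.det(X)))  # True
```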
Now let us have a look at the determinant of the \(3\times 3\) matrix \(X\) again, \[X = \begin{pmatrix} x_{11} & x_{12} & x_{13}\\ x_{21} & x_{22} & x_{23}\\ x_{31} & x_{32} & x_{33} \end{pmatrix}.\]
If we define the following unit row vectors \[\begin{eqnarray} \mathbf{e}_1 &=& (1, 0, 0),\\ \mathbf{e}_2 &=& (0, 1, 0),\\ \mathbf{e}_3 &=& (0, 0, 1), \end{eqnarray}\]
then we can write, for example, the first row as
\[\mathbf{x}_1 = (x_{11}, x_{12}, x_{13}) = x_{11}\mathbf{e}_1 + x_{12}\mathbf{e}_2 + x_{13}\mathbf{e}_3,\]
and so on. Therefore, by applying the above lemmas repeatedly, we have
\[\begin{eqnarray*} |X| &=& \begin{vmatrix} x_{11}\mathbf{e}_1 + x_{12}\mathbf{e}_2 + x_{13}\mathbf{e}_3\\ x_{21}\mathbf{e}_1 + x_{22}\mathbf{e}_2 + x_{23}\mathbf{e}_3\\ x_{31}\mathbf{e}_1 + x_{32}\mathbf{e}_2 + x_{33}\mathbf{e}_3 \end{vmatrix}\\ &=& x_{11}x_{22}x_{33} \begin{vmatrix} \mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{e}_3 \end{vmatrix} + x_{11}x_{23}x_{32} \begin{vmatrix} \mathbf{e}_1\\ \mathbf{e}_3\\ \mathbf{e}_2 \end{vmatrix} + x_{12}x_{23}x_{31} \begin{vmatrix} \mathbf{e}_2\\ \mathbf{e}_3\\ \mathbf{e}_1 \end{vmatrix} + x_{12}x_{21}x_{33} \begin{vmatrix} \mathbf{e}_2\\ \mathbf{e}_1\\ \mathbf{e}_3 \end{vmatrix}\\ && + x_{13}x_{21}x_{32} \begin{vmatrix} \mathbf{e}_3\\ \mathbf{e}_1\\ \mathbf{e}_2 \end{vmatrix} + x_{13}x_{22}x_{31} \begin{vmatrix} \mathbf{e}_3\\ \mathbf{e}_2\\ \mathbf{e}_1 \end{vmatrix}. \end{eqnarray*}\]
Note the order of the unit vectors in each term: they cover all the \(3!=6\) permutations of \((\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)\). (The full multilinear expansion has \(3^3 = 27\) terms, but every term in which the same unit vector appears in two rows vanishes by the previous lemma, which leaves the 6 terms above.) We can swap the rows in each term so that the matrix becomes the identity (with an appropriate change of sign), and note that
\[\begin{vmatrix} \mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{e}_3 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{vmatrix} = 1,\]
and we obtain the familiar result:
\[|X| = x_{11}x_{22}x_{33} - x_{11}x_{23}x_{32} + x_{12}x_{23}x_{31} - x_{12}x_{21}x_{33} +x_{13}x_{21}x_{32} - x_{13}x_{22}x_{31}.\]
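
This explicit formula can be checked numerically against a library determinant (random entries; note that indices are 0-based in the code, so `x[0, 1]` corresponds to \(x_{12}\)):

```python
import numpy as np

x = np.random.default_rng(3).standard_normal((3, 3))

formula = (  x[0,0]*x[1,1]*x[2,2] - x[0,0]*x[1,2]*x[2,1]
           + x[0,1]*x[1,2]*x[2,0] - x[0,1]*x[1,0]*x[2,2]
           + x[0,2]*x[1,0]*x[2,1] - x[0,2]*x[1,1]*x[2,0] )
print(np.isclose(formula, np.linalg.det(x)))  # True
```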

Theorem (The determinant of a product is the product of determinants)

Let \(X, Y \in M_n\). Then
\[\det(XY) = \det(X)\det(Y).\]
Proof. Let \(X = (x_{ij})\) and let \(\mathbf{y}_i = (y_{i1}, \cdots, y_{in})\) denote the \(i\)-th row of \(Y\). Also, let \(\mathbf{e}_i\) denote the \(i\)-th row of the identity matrix \(I_n\). Then
\[(XY)_{ij} = \sum_{k=1}^{n}x_{ik}y_{kj}= \sum_{k=1}^{n}x_{ik}(\mathbf{y}_k)_j\]
so we can write
\[ |XY| = \begin{vmatrix} x_{11}\mathbf{y}_1 + x_{12}\mathbf{y}_2 + \cdots + x_{1n}\mathbf{y}_n\\ x_{21}\mathbf{y}_1 + x_{22}\mathbf{y}_2 + \cdots + x_{2n}\mathbf{y}_n\\ \vdots\\ x_{n1}\mathbf{y}_1 + x_{n2}\mathbf{y}_2 + \cdots + x_{nn}\mathbf{y}_n \end{vmatrix}.\tag{eq:xy1} \] Expanding by multilinearity gives a sum of \(n^n\) determinants, one for each way of picking a single \(\mathbf{y}_k\) from every row; any term in which some \(\mathbf{y}_k\) appears in two rows vanishes, so only the terms indexed by permutations survive:
 \[ = \sum_{\sigma} x_{1\sigma(1)}x_{2\sigma(2)}\cdots x_{n\sigma(n)} \begin{vmatrix} \mathbf{y}_{\sigma(1)}\\ \mathbf{y}_{\sigma(2)}\\ \vdots\\ \mathbf{y}_{\sigma(n)} \end{vmatrix}\tag{eq:xy2} \]
where the summation \(\sum_{\sigma}\) is over all permutations \(\sigma\) of \((1,2,\cdots,n)\), i.e., \((1,2,\cdots,n) \mapsto (\sigma(1),\sigma(2),\cdots, \sigma(n))\). Rearranging the rows \(\mathbf{y}_{\sigma(1)}, \mathbf{y}_{\sigma(2)},\cdots, \mathbf{y}_{\sigma(n)}\) into the order \(\mathbf{y}_{1}, \mathbf{y}_{2},\cdots, \mathbf{y}_{n}\) in each term changes the sign of the determinant by a factor \(\text{sign}(\sigma) = \pm 1\) (one factor of \(-1\) per swap; see the remark below). So we can write
\[|XY| = \sum_{\sigma}\text{sign}(\sigma)x_{1\sigma(1)}x_{2\sigma(2)}\cdots x_{n\sigma(n)} \begin{vmatrix} \mathbf{y}_{1}\\ \mathbf{y}_{2}\\ \vdots\\ \mathbf{y}_{n} \end{vmatrix}\tag{eq:xy3}\]
Now, noting that
\[|I_n| = \begin{vmatrix} \mathbf{e}_1\\ \mathbf{e}_2\\ \vdots\\ \mathbf{e}_n \end{vmatrix} = 1,\]
we have
\[\begin{eqnarray*} \sum_{\sigma}\text{sign}(\sigma)x_{1\sigma(1)}x_{2\sigma(2)}\cdots x_{n\sigma(n)} &=& \sum_{\sigma}\text{sign}(\sigma)x_{1\sigma(1)}x_{2\sigma(2)}\cdots x_{n\sigma(n)} \begin{vmatrix} \mathbf{e}_1\\ \mathbf{e}_2\\ \vdots\\ \mathbf{e}_n \end{vmatrix}\\ &=& \sum_{\sigma}x_{1\sigma(1)}x_{2\sigma(2)}\cdots x_{n\sigma(n)} \begin{vmatrix} \mathbf{e}_{\sigma(1)}\\ \mathbf{e}_{\sigma(2)}\\ \vdots\\ \mathbf{e}_{\sigma(n)} \end{vmatrix} \end{eqnarray*}\]
where we reordered the rows of the last determinant to absorb \(\text{sign}(\sigma)\). Following the argument from (eq:xy1) to (eq:xy2) backwards, with the \(\mathbf{y}_i\)'s replaced by the \(\mathbf{e}_j\)'s, we have
\[\sum_{\sigma}x_{1\sigma(1)}x_{2\sigma(2)}\cdots x_{n\sigma(n)} \begin{vmatrix} \mathbf{e}_{\sigma(1)}\\ \mathbf{e}_{\sigma(2)}\\ \vdots\\ \mathbf{e}_{\sigma(n)} \end{vmatrix} = \begin{vmatrix} x_{11}\mathbf{e}_1 + x_{12}\mathbf{e}_2 + \cdots + x_{1n}\mathbf{e}_n\\ x_{21}\mathbf{e}_1 + x_{22}\mathbf{e}_2 + \cdots + x_{2n}\mathbf{e}_n\\ \vdots\\ x_{n1}\mathbf{e}_1 + x_{n2}\mathbf{e}_2 + \cdots + x_{nn}\mathbf{e}_n \end{vmatrix} = |X|.\]
Combining with (eq:xy3), we have
\[|XY| = |X|\cdot|Y|.\]
Remark. In the above proof, the equalities such as
\[\begin{eqnarray} \begin{vmatrix} \mathbf{y}_{\sigma(1)}\\ \mathbf{y}_{\sigma(2)}\\ \vdots\\ \mathbf{y}_{\sigma(n)} \end{vmatrix} &=& \text{sign}(\sigma)\begin{vmatrix} \mathbf{y}_1\\ \mathbf{y}_2\\ \vdots\\ \mathbf{y}_n \end{vmatrix}\\ \text{sign}(\sigma)\begin{vmatrix} \mathbf{e}_1\\ \mathbf{e}_2\\ \vdots\\ \mathbf{e}_n \end{vmatrix} &=& \begin{vmatrix} \mathbf{e}_{\sigma(1)}\\ \mathbf{e}_{\sigma(2)}\\ \vdots\\ \mathbf{e}_{\sigma(n)} \end{vmatrix} \end{eqnarray}\]
assume the following facts:
  1. The permutation from \((1, 2, \cdots, n)\) to \((\sigma(1), \sigma(2), \cdots, \sigma(n))\) can be achieved by swapping two entries finitely many times.
  2. The number of such swap operations is not unique (e.g., you can additionally swap the same two elements 0, 2, 4, or any even number of times and obtain the same permutation), but its parity (whether the number of swaps is odd or even) is unique. If the parity is odd, \(\text{sign}(\sigma) = -1\); if the parity is even, \(\text{sign}(\sigma) = 1\).
See also: Permutation group (Wikipedia)
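
Both the theorem and the permutation sum appearing in its proof are easy to check numerically. In the sketch below, `perm_sign` is a helper defined here (it computes the parity by counting inversions, which agrees with the parity by transpositions), and the matrices are random:

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(4)
n = 4
X, Y = rng.standard_normal((n, n)), rng.standard_normal((n, n))

# The theorem: det(XY) = det(X) det(Y).
print(np.isclose(np.linalg.det(X @ Y), np.linalg.det(X) * np.linalg.det(Y)))  # True

def perm_sign(sigma):
    """sign(sigma) = (-1)^(number of inversions)."""
    inversions = sum(1 for a in range(len(sigma)) for b in range(a + 1, len(sigma))
                     if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

# The sum over permutations in the proof equals det(X).
leibniz = sum(perm_sign(s) * np.prod([X[i, s[i]] for i in range(n)])
              for s in permutations(range(n)))
print(np.isclose(leibniz, np.linalg.det(X)))  # True
```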

Theorem (Determinant of transposed matrix)

For any square matrix \(A\), we have
\[|A| = |A^T|.\]
Proof. Exercise. ■
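
A quick numerical check (no substitute for the exercise; the matrix is an arbitrary random choice):

```python
import numpy as np

A = np.random.default_rng(5).standard_normal((4, 4))
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))  # True
```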

Definition (Adjugate matrix)

Let \(A = (a_{ij}) \in M_n\). We define the adjugate matrix \(A^* = (a_{ij}^*)\) by
\[a_{ij}^* = (-1)^{i+j}|A_{ji}|\]
where \(A_{ji}\) is the matrix obtained by removing the \(j\)-th row and the \(i\)-th column from \(A\). 

Example. For \(A = \begin{pmatrix}a & b\\c & d\end{pmatrix}\), the adjugate matrix is \(A^{*} = \begin{pmatrix}d & -b\\-c & a\end{pmatrix}.\) □

Theorem

Let \(A = (a_{ij}) \in M_n\) and \(A^*\) its adjugate.
  1. \(AA^* = A^*A = \det(A)\cdot I_n\).
  2. \(A\) has a right inverse if and only if \(\det(A) \neq 0\).
  3. If \(A\) has a right inverse \(X\), then \(X\) is also a left inverse of \(A\), and is also the unique inverse on either side.
Proof
  1. Let \(Z = (z_{ij}) = AA^*\). Then \[z_{ij} = \sum_{k=1}^{n}a_{ik}a_{kj}^* = \sum_{k=1}^{n}a_{ik}(-1)^{j+k}|A_{jk}|.\] If \(i = j\), this is exactly the cofactor expansion of \(|A|\) along the \(i\)-th row, i.e., the definition of \(|A|\). If \(i\neq j\), it is the cofactor expansion (along the \(j\)-th row) of the matrix obtained from \(A\) by replacing its \(j\)-th row with its \(i\)-th row; that matrix has two identical rows, so its determinant is 0. Thus, \[z_{ij} = |A|\cdot \delta_{ij},\] that is, \(Z  = |A|\cdot I_n.\) The case for \(A^*A\) is similar.
  2. Suppose \(A\) has a right inverse \(X\). Then \(AX = I_n\), so \(|A|\cdot|X| = |AX| = |I_n| = 1\), and therefore \(|A|\neq 0\). Conversely, if \(|A| \neq 0\), then \((1/|A|)A^{*}\) is a right inverse by Part 1.
  3. Suppose \(AY = I_n\). Multiplying both sides by \((1/|A|)A^*\) from the left yields \(Y = (1/|A|)A^*\), which is also a left inverse of \(A\) by Part 1. Similarly, any \(Y\) with \(YA = I_n\) equals \((1/|A|)A^*\), so the inverse is unique on either side. ■
The unique inverse of \(A\) is denoted \(A^{-1}\).
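
A rough implementation of the adjugate following the definition, together with numerical checks of Parts 1–3 (`adjugate` is a helper name chosen here, not a library routine; the test matrix is random):

```python
import numpy as np

def adjugate(A):
    """a*_{ij} = (-1)^(i+j) * det(A with its j-th row and i-th column removed)."""
    n = A.shape[0]
    adj = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)  # drop row j, column i
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.random.default_rng(6).standard_normal((4, 4))
adjA, detA = adjugate(A), np.linalg.det(A)

print(np.allclose(A @ adjA, detA * np.eye(4)))       # Part 1: A A* = det(A) I
print(np.allclose(adjA @ A, detA * np.eye(4)))       #         A* A = det(A) I
print(np.allclose(adjA / detA, np.linalg.inv(A)))    # Parts 2-3: A^{-1} = A*/det(A)
```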

Theorem

Let \(A, B \in M_n\). If their inverses exist, then
\[(AB)^{-1} = B^{-1}A^{-1}.\]
Proof
\[(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1}= AI_nA^{-1} = AA^{-1} = I_n.\]
Similarly, \((B^{-1}A^{-1})(AB) = I_n\). ■
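
And a one-line numerical check of this identity (random matrices, which are invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(7)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))  # True
```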

