Matrix operations
We introduce various operations on matrices. A matrix can be multiplied by a scalar. Two matrices can be added or multiplied. A matrix can be transposed. Writing down these operations explicitly can be tedious, so we introduce a systematic notation for representing matrices and their elements.
We want to denote each entry of a matrix systematically.
Suppose we have a \(2\times 3\) matrix
\[M = \begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix}.\]
This may be written as
\[M = \begin{pmatrix} m_{11} & m_{12} & m_{13}\\ m_{21} & m_{22} & m_{23}\end{pmatrix}\]
where \(m_{11} = 1,\) \(m_{21} = -1\), etc. But if we have a \(100\times 100\) matrix, writing out every entry this way is tedious, to say the least. In mathematics, it is crucial to be ``appropriately lazy''. So here's what we usually do:
\[M = (m_{ij})\]
where it is understood that the indices \(i\) and \(j\) run through the appropriate ranges (in this particular example, \(i = 1, 2\) and \(j = 1, 2, 3\)). This is a much tidier notation. Furthermore, when \(M\) is a matrix, we also write \((M)_{ij}\) for its \((i,j)\)-element (the entry in the \(i\)-th row and \(j\)-th column). Thus, if \(M = (m_{ij})\), then \((M)_{ij} = m_{ij}\).
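For concreteness, here is how this notation maps onto code: a minimal sketch in Python with NumPy (note that NumPy indexes from 0, so the \((i,j)\)-element \(m_{ij}\) sits at M[i-1, j-1]).

import numpy as np

M = np.array([[ 1, 1, 2],
              [-1, 7, 2]])

# The (1,1)- and (2,1)-elements of M; NumPy is 0-indexed,
# so we subtract 1 from each mathematical index.
print(M[0, 0])  # 1
print(M[1, 0])  # -1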
Definition (Scalar-matrix multiplication)
Let \(A = (a_{ij})\) be an \(n\times m\) matrix, and let \(\lambda \in \mathbb{R}\) be a scalar. Then the scalar multiple of \(A\) by \(\lambda\) is defined by
\[\lambda A = (\lambda a_{ij}).\]
That is,
\[\lambda A = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1m}\\ \lambda a_{21} & \lambda a_{22} & \cdots & \lambda a_{2m}\\ \vdots & & \ddots & \vdots\\ \lambda a_{n1} & \lambda a_{n2} & \cdots & \lambda a_{nm} \end{pmatrix}. \]
Another way to write this is
\[(\lambda A)_{ij} = \lambda a_{ij}.\]
Example.
\[3\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 3 & 6\\ -3 & 21 & 6 \end{pmatrix}.\] □
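In NumPy, scalar-matrix multiplication is the ordinary * operator applied to a scalar and an array; a quick check of the example above:

import numpy as np

M = np.array([[ 1, 1, 2],
              [-1, 7, 2]])

print(3 * M)
# [[ 3  3  6]
#  [-3 21  6]]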
Definition (Matrix addition)
Let \(A = (a_{ij})\) and \(B = (b_{ij})\) be both \(n\times m\) matrices. We define a new \(n\times m\) matrix \(A+B\) by
\[A+B = (a_{ij} + b_{ij}).\]
Another way to write this is
\[(A + B)_{ij} = (A)_{ij} + (B)_{ij}.\]
Remark. To add two matrices, their sizes must be exactly equal. □
Example. \[\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix} + \begin{pmatrix} 3 & 2 & 1\\ 4 & 5 & 6 \end{pmatrix} = \begin{pmatrix} 4 & 3 & 3\\ 3 & 12 & 8 \end{pmatrix}.\] □
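Since matrix addition is entrywise, NumPy's + operator on two arrays of the same shape reproduces this example:

import numpy as np

A = np.array([[ 1, 1, 2],
              [-1, 7, 2]])
B = np.array([[ 3, 2, 1],
              [ 4, 5, 6]])

print(A + B)
# [[ 4  3  3]
#  [ 3 12  8]]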
Definition (Matrix multiplication)
Let \(A = (a_{ij})\) be an \(n\times m\) matrix and \(B = (b_{ij})\) be an \(m\times p\) matrix. We define a new \(n\times p\) matrix \(AB\) by
\[AB = \left(\sum_{k=1}^{m}a_{ik}b_{kj}\right).\]
Another way to write this is
\[(AB)_{ij} = \sum_{k=1}^{m}(A)_{ik}(B)_{kj}.\]
Remark.
- To multiply two matrices, the number of columns of the first matrix and the number of rows of the second matrix must be equal.
- Even if \(AB\) is defined, \(BA\) may not be defined because the sizes may not match.
- In general, matrix multiplication is not commutative. That is, \(AB = BA\) does not hold in general.
Example. \[\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0\\ 0 & 1\\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 5 & 7\\ 3 & 13 \end{pmatrix}.\]
□
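In NumPy, matrix multiplication is the @ operator (the * operator multiplies entrywise, which is a different operation); checking the example:

import numpy as np

A = np.array([[ 1, 1, 2],
              [-1, 7, 2]])
B = np.array([[ 1, 0],
              [ 0, 1],
              [ 2, 3]])

print(A @ B)   # A is 2x3, B is 3x2, so A @ B is 2x2
# [[ 5  7]
#  [ 3 13]]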
Example.
\[ \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix} \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix} = \begin{pmatrix} -4 & 11 \\ -10 & 25 \end{pmatrix}. \]
\[ \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 7 & 8 \\ 11 & 14 \end{pmatrix}. \]
Thus,
\[\begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix} \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix}\neq \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}. \]
□
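Non-commutativity is equally easy to observe numerically; with the same two matrices:

import numpy as np

A = np.array([[ 1, 2],
              [ 3, 4]])
B = np.array([[-2, 3],
              [-1, 4]])

print(A @ B)
# [[ -4  11]
#  [-10  25]]
print(B @ A)
# [[ 7  8]
#  [11 14]]
print(np.array_equal(A @ B, B @ A))  # False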
Definition (Transpose)
Let \(A = (a_{ij})\) be an \(n\times m\) matrix. The transpose of \(A\), denoted \(A^T\), is the \(m\times n\) matrix whose entry in the \(i\)-th row and \(j\)-th column is \(a_{ji}\). In other words,
\[(A^T)_{ij} = (A)_{ji}.\]
``The \((i,j)\)-element of \(A^T\) is the \((j,i)\)-element of \(A\).''
Example. \[\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix}^{T} = \begin{pmatrix} 1 & -1\\ 1 & 7\\ 2 & 2 \end{pmatrix}.\] □
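In NumPy, the transpose of an array A is the attribute A.T; checking the example:

import numpy as np

M = np.array([[ 1, 1, 2],
              [-1, 7, 2]])

print(M.T)
# [[ 1 -1]
#  [ 1  7]
#  [ 2  2]]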
Lemma
Let \(A\) be a matrix. Then,
\[(A^T)^T = A.\]
That is, the transpose of the transpose of \(A\) is \(A\) itself.
Proof.
\[((A^T)^T)_{ij} = (A^T)_{ji} = (A)_{ij}.\] ■
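The lemma, too, can be confirmed numerically: transposing twice returns the original array.

import numpy as np

A = np.array([[ 1, 1, 2],
              [-1, 7, 2]])

print(np.array_equal(A.T.T, A))  # True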
The \(n\)-dimensional row vectors \(\mathbf{a} = (a_1, \cdots, a_n)\) and \(\mathbf{b} = (b_1, \cdots, b_n)\) may be regarded as \(1\times n\) matrices. Consequently, the transpose \(\mathbf{b}^{T}\), a column vector, may be regarded as an \(n\times 1\) matrix. We can therefore matrix-multiply \(\mathbf{a}\) and \(\mathbf{b}^T\) to obtain a \(1\times 1\) ``matrix'':
\[\mathbf{a}\mathbf{b}^T = (a_1b_1 + a_2b_2 + \cdots + a_nb_n).\]
If we identify \(1\times 1\) matrices with scalars, this result is nothing but the scalar product. Thus we have
\[\mathbf{a}\mathbf{b}^T = \braket{\mathbf{a},\mathbf{b}}.\]
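Here is the same computation in NumPy, with two hypothetical vectors chosen just for illustration; representing \(\mathbf{a}\) and \(\mathbf{b}\) as \(1\times n\) arrays keeps the row/column distinction explicit.

import numpy as np

a = np.array([[1, 2, 3]])   # a 1x3 row vector
b = np.array([[4, 5, 6]])   # another 1x3 row vector

print(a @ b.T)               # [[32]] -- a 1x1 matrix
print(np.inner(a[0], b[0]))  # 32    -- the scalar product itself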
This observation can be generalized as follows. Suppose the rows of the \(n\times m\) matrix \(A\) are the \(m\)-dimensional row vectors \(\mathbf{a}_1, \mathbf{a}_2,\cdots, \mathbf{a}_n\), that is,
\[A = (a_{ij}) = \begin{pmatrix} \mathbf{a}_1\\ \mathbf{a}_2\\ \vdots\\ \mathbf{a}_n \end{pmatrix}.\]
Similarly, suppose the columns of the \(m\times p\) matrix \(B\) are the \(m\)-dimensional column vectors \(\mathbf{b}_1^T, \mathbf{b}_2^T, \cdots, \mathbf{b}_p^T\), that is,
\[B = (b_{ij}) = \begin{pmatrix} \mathbf{b}_1^T & \mathbf{b}_2^T & \cdots & \mathbf{b}_p^T \end{pmatrix}. \]
Since \(\mathbf{b}_i^T\) is a column vector, its transpose \(\mathbf{b}_i\) is a row vector. We have
\[\begin{eqnarray} AB &=& \begin{pmatrix} \mathbf{a}_1\\ \mathbf{a}_2\\ \vdots\\ \mathbf{a}_n \end{pmatrix} \begin{pmatrix} \mathbf{b}_1^T & \mathbf{b}_2^T & \cdots & \mathbf{b}_p^T \end{pmatrix}\\ &=& \begin{pmatrix} \braket{\mathbf{a}_1, \mathbf{b}_1} & \braket{\mathbf{a}_1, \mathbf{b}_2} & \cdots & \braket{\mathbf{a}_1, \mathbf{b}_p}\\ \braket{\mathbf{a}_2, \mathbf{b}_1} & \braket{\mathbf{a}_2, \mathbf{b}_2} & \cdots & \braket{\mathbf{a}_2, \mathbf{b}_p}\\ \vdots & & \ddots & \vdots \\ \braket{\mathbf{a}_n, \mathbf{b}_1} & \braket{\mathbf{a}_n, \mathbf{b}_2} & \cdots & \braket{\mathbf{a}_n, \mathbf{b}_p} \end{pmatrix}\\ &=& (\braket{\mathbf{a}_i, \mathbf{b}_j}). \end{eqnarray} \]
Thus, every element of a matrix product \(AB\) is the scalar product of a row of \(A\) with a column of \(B\).
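This row-times-column picture can be verified entry by entry; a sketch reusing the matrices from the multiplication example:

import numpy as np

A = np.array([[ 1, 1, 2],
              [-1, 7, 2]])
B = np.array([[ 1, 0],
              [ 0, 1],
              [ 2, 3]])

AB = A @ B
n, p = AB.shape
# Each (i,j)-entry of AB is the scalar product of
# row i of A with column j of B.
print(all(AB[i, j] == np.inner(A[i, :], B[:, j])
          for i in range(n) for j in range(p)))  # True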