Matrix operations

We introduce various operations on matrices. A matrix can be multiplied by a scalar. Two matrices can be added or multiplied. A matrix can be transposed. Writing down these operations explicitly can be tedious, so we introduce a systematic notation for representing matrices and their elements.



We want to denote each entry of a matrix systematically.

Suppose we have a \(2\times 3\) matrix

\[M = \begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix}.\]

This may be written as

\[M = \begin{pmatrix} m_{11} & m_{12} & m_{13}\\ m_{21} & m_{22} & m_{23}\end{pmatrix}\]

where \(m_{11} = 1,\) \(m_{21} = -1\), etc. But if we have a \(100\times 100\) matrix, this way of writing is very tedious, to say the least. In mathematics, it is crucial to be ``appropriately lazy''. So here's what we usually do:

\[M = (m_{ij})\]

where it is understood that the indices \(i\) and \(j\) run through appropriate ranges (in this particular example, \(i = 1, 2\) and \(j = 1, 2, 3\)). This is a very tidy notation. Furthermore, when \(M\) is a matrix, we also write \((M)_{ij}\) to denote its \((i,j)\)-element (the entry in the \(i\)-th row and \(j\)-th column). Thus, if \(M = (m_{ij})\), then \((M)_{ij} = m_{ij}\).
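As an aside, this indexing convention maps directly onto how matrices are stored in software. Here is a minimal sketch in Python with NumPy (the choice of library is mine, not part of the text); note that NumPy indices start at 0, so the \((i,j)\)-element \(m_{ij}\) corresponds to `M[i-1, j-1]`.

```python
import numpy as np

# The 2x3 matrix M from the text.
M = np.array([[1, 1, 2],
              [-1, 7, 2]])

# NumPy uses 0-based indices: the (i, j)-element m_ij is M[i-1, j-1].
print(M[0, 0])  # m_11 = 1
print(M[1, 0])  # m_21 = -1
print(M.shape)  # (2, 3): 2 rows, 3 columns
```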

Definition (Scalar-matrix multiplication)

Let \(A = (a_{ij})\) be an \(n\times m\) matrix, and \(\lambda \in \mathbb{R}\) be a constant. Then the scalar multiplication of \(A\) by \(\lambda\) is defined by

\[\lambda A = (\lambda a_{ij}).\]

That is,

\[\lambda A = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1m}\\ \lambda a_{21} & \lambda a_{22} & \cdots & \lambda a_{2m}\\ \vdots & & \ddots & \vdots\\ \lambda a_{n1} & \lambda a_{n2} & \cdots & \lambda a_{nm} \end{pmatrix}. \]

Another way to write this is

\[(\lambda A)_{ij} = \lambda a_{ij}.\]

Example.

\[3\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 3 & 6\\ -3 & 21 & 6 \end{pmatrix}.\] □
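In NumPy (again, an illustrative sketch rather than anything from the text), scalar multiplication is the ordinary `*` operator, applied elementwise exactly as in the definition:

```python
import numpy as np

A = np.array([[1, 1, 2],
              [-1, 7, 2]])

# Scalar multiplication: every entry is multiplied by 3,
# i.e. (3A)_ij = 3 * a_ij.
print(3 * A)
# [[ 3  3  6]
#  [-3 21  6]]
```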

Definition (Matrix addition)

Let \(A = (a_{ij})\) and \(B = (b_{ij})\) be both \(n\times m\) matrices. We define a new \(n\times m\) matrix \(A+B\) by

\[A+B = (a_{ij} + b_{ij}).\]

Another way to write this is

\[(A + B)_{ij} = (A)_{ij} + (B)_{ij}.\]

Remark. To add two matrices, their sizes must be exactly equal. □

Example. \[\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix} + \begin{pmatrix} 3 & 2 & 1\\ 4 & 5 & 6 \end{pmatrix} = \begin{pmatrix} 4 & 3 & 3\\ 3 & 12 & 8 \end{pmatrix}.\] □
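The same example in NumPy (a sketch under the same assumptions as above); `+` is elementwise, and NumPy refuses shapes that do not match, in line with the remark:

```python
import numpy as np

A = np.array([[1, 1, 2],
              [-1, 7, 2]])
B = np.array([[3, 2, 1],
              [4, 5, 6]])

# Matrix addition is elementwise: (A + B)_ij = a_ij + b_ij.
# The two shapes must be exactly equal.
print(A + B)
# [[ 4  3  3]
#  [ 3 12  8]]
```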

Definition (Matrix multiplication)

Let \(A = (a_{ij})\) be an \(n\times m\) matrix and \(B = (b_{ij})\) be an \(m\times p\) matrix. We define a new \(n\times p\) matrix \(AB\) by

\[AB = \left(\sum_{k=1}^{m}a_{ik}b_{kj}\right).\]

Another way to write this is

\[(AB)_{ij} = \sum_{k=1}^{m}(A)_{ik}(B)_{kj}.\]

Remark

  1. To multiply two matrices, the number of columns of the first matrix and the number of rows of the second matrix must be equal.
  2. Even if \(AB\) is defined, \(BA\) may not be defined because the sizes may not match.
  3. In general, matrix multiplication is not commutative. That is, \(AB = BA\) does not hold in general.

Example. \[\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0\\ 0 & 1\\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 5 & 7\\ 3 & 13 \end{pmatrix}.\] □
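To make the summation in the definition concrete, here is a sketch that implements \((AB)_{ij} = \sum_k a_{ik}b_{kj}\) with explicit loops and checks it against NumPy's built-in `@` operator (the function name `matmul_naive` is my own):

```python
import numpy as np

def matmul_naive(A, B):
    """Matrix product by the definition: (AB)_ij = sum_k A_ik * B_kj."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "columns of A must equal rows of B"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 1, 2],
              [-1, 7, 2]])   # 2x3
B = np.array([[1, 0],
              [0, 1],
              [2, 3]])       # 3x2

print(matmul_naive(A, B))    # [[ 5.  7.]  [ 3. 13.]]
print(A @ B)                 # the same result via NumPy
```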

Example.

\[ \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix} \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix} = \begin{pmatrix} -4 & 11 \\ -10 & 25 \end{pmatrix}. \]

\[ \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 7 & 8 \\ 11 & 14 \end{pmatrix}. \]

Thus,

\[\begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix} \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix}\neq \begin{pmatrix} -2 & 3\\ -1 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}. \] □
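The non-commutativity can also be confirmed numerically; a quick check with NumPy:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[-2, 3],
              [-1, 4]])

print(A @ B)                         # [[ -4  11]  [-10  25]]
print(B @ A)                         # [[  7   8]  [ 11  14]]
print(np.array_equal(A @ B, B @ A))  # False: AB != BA
```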

Definition (Transpose)

Let \(A = (a_{ij})\) be an \(n\times m\) matrix. The transpose of \(A\), denoted \(A^T\), is the \(m\times n\) matrix whose entry in the \(i\)-th row and \(j\)-th column is \(a_{ji}\). In other words,

\[(A^T)_{ij} = (A)_{ji}.\]

``The \((i,j)\)-element of \(A^T\) is the \((j,i)\)-element of \(A\).''

Example. \[\begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix}^{T} = \begin{pmatrix} 1 & -1\\ 1 & 7\\ 2 & 2 \end{pmatrix}.\] □
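NumPy exposes the transpose as the `.T` attribute, which matches the example above (still an illustrative sketch):

```python
import numpy as np

A = np.array([[1, 1, 2],
              [-1, 7, 2]])

# Transpose: (A^T)_ij = A_ji, so a 2x3 matrix becomes 3x2.
print(A.T)
# [[ 1 -1]
#  [ 1  7]
#  [ 2  2]]
```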

Lemma

Let \(A\) be a matrix. Then,

\[(A^T)^T = A.\]

That is, the transpose of the transpose of \(A\) is \(A\) itself. 

Proof

\[((A^T)^T)_{ij} = (A^T)_{ji} = (A)_{ij}.\] ■
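The lemma is equally easy to confirm numerically:

```python
import numpy as np

A = np.array([[1, 1, 2],
              [-1, 7, 2]])

# Transposing twice returns the original matrix: (A^T)^T = A.
print(np.array_equal(A.T.T, A))  # True
```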


\(n\)-dimensional row vectors \(\mathbf{a} = (a_1, \cdots, a_n)\) and \(\mathbf{b} = (b_1, \cdots, b_n)\) may be regarded as \(1\times n\) matrices. Consequently, the transpose \(\mathbf{b}^{T}\), a column vector, may be regarded as an \(n\times 1\) matrix. We can matrix-multiply \(\mathbf{a}\) and \(\mathbf{b}^T\) to obtain a \(1\times 1\) ``matrix'', that is,

\[\mathbf{a}\mathbf{b}^T = (a_1b_1 + a_2b_2 + \cdots + a_nb_n).\]

If we identify \(1\times 1\) matrices with scalars, this result is nothing but the scalar product. Thus we have

\[\mathbf{a}\mathbf{b}^T = \braket{\mathbf{a},\mathbf{b}}.\]
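This identification is visible in NumPy if we represent the vectors explicitly as \(1\times n\) arrays (a sketch; NumPy's 1-D arrays blur the row/column distinction, so the explicit shapes here are my choice):

```python
import numpy as np

# Represent a and b as 1xn matrices (row vectors).
a = np.array([[1, 2, 3]])
b = np.array([[4, 5, 6]])

print(a @ b.T)                       # [[32]]: a 1x1 matrix
print(np.dot(a.ravel(), b.ravel()))  # 32: the scalar product <a, b>
```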

This observation can be generalized as follows. Suppose the rows of the \(n\times m\) matrix \(A\) are the \(m\)-dimensional row vectors \(\mathbf{a}_1, \mathbf{a}_2,\cdots, \mathbf{a}_n\), that is,

\[A = (a_{ij}) = \begin{pmatrix} \mathbf{a}_1\\ \mathbf{a}_2\\ \vdots\\ \mathbf{a}_n \end{pmatrix}.\]

Similarly, suppose the columns of the \(m\times p\) matrix \(B\) are the \(m\)-dimensional column vectors \(\mathbf{b}_1^T, \mathbf{b}_2^T, \cdots, \mathbf{b}_p^T\), that is,

\[B = (b_{ij}) = \begin{pmatrix} \mathbf{b}_1^T & \mathbf{b}_2^T & \cdots & \mathbf{b}_p^T \end{pmatrix}. \]

Since \(\mathbf{b}_i^T\) is a column vector, its transpose \(\mathbf{b}_i\) is a row vector. We have

\[\begin{eqnarray} AB &=& \begin{pmatrix} \mathbf{a}_1\\ \mathbf{a}_2\\ \vdots\\ \mathbf{a}_n \end{pmatrix} \begin{pmatrix} \mathbf{b}_1^T & \mathbf{b}_2^T & \cdots & \mathbf{b}_p^T \end{pmatrix}\\ &=& \begin{pmatrix} \braket{\mathbf{a}_1, \mathbf{b}_1} & \braket{\mathbf{a}_1, \mathbf{b}_2} & \cdots & \braket{\mathbf{a}_1, \mathbf{b}_p}\\ \braket{\mathbf{a}_2, \mathbf{b}_1} & \braket{\mathbf{a}_2, \mathbf{b}_2} & \cdots & \braket{\mathbf{a}_2, \mathbf{b}_p}\\ \vdots & & \ddots & \vdots \\ \braket{\mathbf{a}_n, \mathbf{b}_1} & \braket{\mathbf{a}_n, \mathbf{b}_2} & \cdots & \braket{\mathbf{a}_n, \mathbf{b}_p} \end{pmatrix}\\ &=& (\braket{\mathbf{a}_i, \mathbf{b}_j}). \end{eqnarray} \]

Thus, every element of a matrix product \(AB\) is a scalar product between a row vector and a column vector.
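This row-times-column view can be checked entry by entry: each element of `A @ B` equals the scalar product of a row of \(A\) with a column of \(B\). A small verification sketch:

```python
import numpy as np

A = np.array([[1, 1, 2],
              [-1, 7, 2]])   # rows a_1, a_2
B = np.array([[1, 0],
              [0, 1],
              [2, 3]])       # columns b_1, b_2

AB = A @ B
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        # <a_i, b_j>: scalar product of the i-th row of A
        # with the j-th column of B.
        assert AB[i, j] == np.dot(A[i, :], B[:, j])
print(AB)  # [[ 5  7]  [ 3 13]]
```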


