Posts

Showing posts from August, 2023

\(e = 2.718\cdots\): Napier's constant

We defined Napier's constant \(e\) in a previous post as \(e = \exp(1)\), where \(\exp(x)\) is the inverse of \(\log(x)\) (see also: \(\log\) and \(e\)). Here we provide an alternative definition of this constant: \[e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n.\] For this definition to be valid, we need to show that the sequence \(\{a_n\}\) defined by \[a_n = \left(1 + \frac{1}{n}\right)^n\] converges.

Example. The first several terms of the above sequence are: \[\begin{align*}a_1 &= 2,\\ a_2 &= 2.25,\\ a_3 &= 2.370\cdots,\\ a_4 &= 2.441\cdots,\\ a_5 &= 2.488\cdots,\\ &\vdots\\ a_{10} &= 2.593\cdots,\\ a_{100} &= 2.704\cdots,\\ a_{1000} &= 2.716\cdots,\\ &\vdots\end{align*}\] □

Lemma. The sequence \(\{a_n\}\) defined above is monotone increasing.

Proof. We show that \(a_n < a_{n+1}\) for all \(n\in\mathbb{N}\). By the Binomial Theorem, we can write \[a_n = \sum_{k=0}^{n}\frac{s_k}{k!}\] where \[s_k = \prod_{j=0}^{k-1}\left(1 - \frac{j}{n}\right) \cdots\]
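The numerical example above is easy to reproduce. Here is a minimal Python sketch (my illustration, not code from the post) that evaluates \(a_n = (1 + 1/n)^n\) for increasing \(n\) and compares it with the built-in value of \(e\):

```python
import math

# Evaluate a_n = (1 + 1/n)^n for increasing n and compare with e.
for n in [1, 2, 3, 4, 5, 10, 100, 1000, 10**6]:
    a_n = (1 + 1 / n) ** n
    print(f"a_{n} = {a_n:.10f}   (e - a_n = {math.e - a_n:.2e})")
```

The error shrinks roughly like \(e/(2n)\), so the convergence, while monotone, is quite slow.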

Monotone sequences and Cauchy sequences

Deciding whether a given sequence \(\{a_n\}\) converges or diverges is usually very difficult. Sometimes, however, it is possible to decide that a sequence converges without knowing its limit explicitly, provided certain conditions are met.

Definition (Monotone increasing/decreasing sequences). Let \(\{a_n\}\) be a sequence. If \[a_1 \leq a_2 \leq a_3 \leq \cdots \leq a_n \leq a_{n+1} \leq \cdots\] holds, that is, \[\forall n\in\mathbb{N} ~ (a_n \leq a_{n+1}),\] then \(\{a_n\}\) is said to be monotone increasing. If \[a_1 \geq a_2 \geq a_3 \geq \cdots \geq a_n \geq a_{n+1} \geq \cdots\] holds, that is, \[\forall n\in\mathbb{N} ~ (a_n \geq a_{n+1}),\] then \(\{a_n\}\) is said to be monotone decreasing. If a sequence is either monotone increasing or monotone decreasing, we may simply say it is monotone.

Example. Let us define the sequence \(\{a_n\}\) by \[\begin{eqnarray} a_1 &=& 2,\\ a_{n+1} &=& \frac{1}{2}\left(a_n + \frac{2}{a_n}\right), \quad n = 1, 2, \cdots. \end{eqnarray}\] …
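The recursion in the example is the classical Babylonian iteration for \(\sqrt{2}\). A quick sketch (mine, not the post's) shows the terms decreasing monotonically toward \(\sqrt{2}\) while staying above it:

```python
import math

# a_1 = 2, a_{n+1} = (a_n + 2/a_n)/2: monotone decreasing, bounded below by sqrt(2).
a = 2.0
for n in range(1, 7):
    print(f"a_{n} = {a:.12f}   (a_n - sqrt(2) = {a - math.sqrt(2):.2e})")
    a = (a + 2.0 / a) / 2.0
```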

Properties of limits

We examine some basic properties of the limit of a sequence: the limit of a sequence is unique; a convergent sequence is bounded; the squeeze theorem; the limiting operation is linear; the limit of a product of sequences is the product of their limits; etc.

Theorem (Uniqueness of limit). If the sequence \(\{a_n\}\) converges, its limit is unique. In other words, if \(\lim_{n\to\infty}a_n = \alpha\) and \(\lim_{n\to\infty}a_n = \beta\), then \(\alpha = \beta\).

Proof. Suppose that \(\{a_n\}\) converges to \(\alpha\) as well as to \(\beta\). Take any positive real number \(\varepsilon\). Since \(\lim_{n\to\infty}a_n = \alpha\), there exists \(N'\in\mathbb{N}\) such that for any \(n\geq N'\), \(|a_n - \alpha| < \varepsilon\). Similarly, there exists \(N''\in\mathbb{N}\) such that for any \(n\geq N''\), \(|a_n - \beta| < \varepsilon\). Let \(N = \max\{N', N''\}\). Then, if \(n \geq N\), \[|\alpha - \beta| = |\alpha - a_n + a_n - \beta| \leq |a_n - \alpha| + |a_n - \beta| < 2\varepsilon.\] Since \(\varepsilon > 0\) is arbitrary, this forces \(|\alpha - \beta| = 0\), that is, \(\alpha = \beta\). …
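The algebraic limit laws listed above can be illustrated numerically. A small sketch (my own, with hypothetical example sequences \(a_n = 2 + 1/n\) and \(b_n = 1/n^2\)):

```python
# Illustrate lim(a_n + b_n) = lim a_n + lim b_n = 2 and
# lim(a_n * b_n) = (lim a_n)(lim b_n) = 0 numerically.
for n in [10, 100, 1000, 10**6]:
    a_n = 2 + 1 / n    # converges to 2
    b_n = 1 / n**2     # converges to 0
    print(f"n = {n}: a_n + b_n = {a_n + b_n:.8f}, a_n * b_n = {a_n * b_n:.2e}")
```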

Convergence and divergence of sequences

Consider a sequence \(a_1, a_2, a_3, \cdots\) of real numbers; that is, to each natural number \(n\) there corresponds a real number \(a_n\). We usually denote such a sequence by \(\{a_n\}\) or \(\{a_n\}_{n=1}^{\infty}\). Note that a sequence is not just a set: its order matters. \(a_n\) may "converge" to some real number as \(n\) becomes larger and larger. But what does that mean?

Definition (Limits). Let \(\{a_n\}\) be a sequence. If \(a_n\) approaches arbitrarily close to a constant value \(\alpha\) as \(n\) becomes arbitrarily large, we call \(\alpha\) the limit of the sequence \(\{a_n\}\) and write \[\lim_{n\to\infty}a_n = \alpha\] or \[a_n \to \alpha \text{ as } n \to \infty,\] and we say the sequence \(\{a_n\}\) converges to \(\alpha\). In this case, we also say "the limit of \(\{a_n\}\) is \(\alpha\)." If the sequence \(\{a_n\}\) does not converge, we say that it diverges.

Remark. When we say "\(a_n\) approaches \(\alpha\) as \(n\) becomes arbitrarily large" …
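The definition becomes tangible once you fix a concrete sequence: given any \(\varepsilon > 0\), exhibit an index \(N\) beyond which every term stays within \(\varepsilon\) of the limit. A sketch (mine, for the hypothetical example \(a_n = 1/n\) with limit \(\alpha = 0\)):

```python
import math

# For a_n = 1/n and alpha = 0: |1/n| < eps whenever n > 1/eps, so we can
# take N = ceil(1/eps) + 1 (the +1 guards against the boundary case).
def find_N(eps: float) -> int:
    return math.ceil(1 / eps) + 1

for eps in [0.1, 0.01, 0.001]:
    N = find_N(eps)
    assert all(abs(1 / n) < eps for n in range(N, N + 1000))
    print(f"eps = {eps}: N = {N}")
```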

Continuity of real numbers

We know that \(\mathbb{R}\) (the set of real numbers) and \(\mathbb{Q}\) (the set of rational numbers) share similar properties: both are fields, and both are dense. What distinguishes the real numbers from the rational numbers is continuity. We prove Archimedes' principle as a consequence of the continuity of real numbers.

Definition (Upper bound, lower bound). Let \(S\subset \mathbb{R}\) and \(\alpha \in \mathbb{R}\). If, for all \(x\in S\), \(x \leq \alpha\), then \(\alpha\) is said to be an upper bound of \(S\). If, for all \(x\in S\), \(x \geq \alpha\), then \(\alpha\) is said to be a lower bound of \(S\).

Example. Let \(A = \{x \mid x \leq 1, x \in \mathbb{R}\}\). Then 2 is an upper bound of \(A\); 1.5 is another upper bound of \(A\). In fact, any number greater than or equal to 1 is an upper bound of \(A\). There are no lower bounds of \(A\). □

Example. Let \(B = \{x \mid x > \sqrt{2}, x\in\mathbb{Q}\}\). Any number less than or equal to \(\sqrt{2}\) is a lower bound of \(B\). There are no upper bounds of \(B\). …
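As a computational aside on Archimedes' principle (a sketch of mine, not the post's proof): for any real \(x\) a natural number \(n > x\) can be produced constructively.

```python
import math

# Archimedes' principle: for any real x there is a natural number n with n > x.
# A constructive witness: n = floor(x) + 1.
def witness(x: float) -> int:
    return math.floor(x) + 1

for x in [0.5, 1.0, 3.14, 1e6]:
    n = witness(x)
    assert n > x
    print(f"x = {x}: n = {n} > x")
```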

Linear independence

We will work with row vectors in \(\mathbb{R}^n\). Consider a set of \(m\) vectors. If one of these vectors can be expressed as a linear combination of the others, the vectors are said to be linearly dependent. If none of them can be expressed as a linear combination of the others, they are said to be linearly independent. We show that the determinant of a matrix is zero if its row vectors are linearly dependent, and vice versa.

Definition (Linear dependence). We say that a finite sequence \(\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_m\) of vectors is linearly dependent if there are real numbers \(\lambda_1, \lambda_2, \cdots, \lambda_m\), not all 0 (i.e., at least one is non-zero), such that \[\sum_{i=1}^{m}\lambda_i\mathbf{v}_i = \mathbf{0}.\] Why this terminology? Suppose \(\lambda_1 \neq 0\). Then we can rearrange the above equation into \[\mathbf{v}_1 = -\frac{1}{\lambda_1}(\lambda_2\mathbf{v}_2 + \cdots + \lambda_m\mathbf{v}_m).\] That is, \(\mathbf{v}_1\) can be expressed as a linear combination of the other vectors: it depends on them. …
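The connection between linear dependence and a vanishing determinant is easy to see numerically. A NumPy sketch (an illustration I added, with made-up example vectors):

```python
import numpy as np

# The third row is a linear combination of the first two (v3 = 2*v1 - v2),
# so the rows are linearly dependent and the determinant should vanish.
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])
A_dep = np.vstack([v1, v2, 2 * v1 - v2])
A_indep = np.vstack([v1, v2, [0.0, 0.0, 1.0]])

print(np.linalg.det(A_dep))    # ~0 (up to floating-point error)
print(np.linalg.det(A_indep))  # nonzero: these rows are linearly independent
```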

Vector product

The scalar product maps a pair of vectors to a scalar, \(\mathbb{R}^n\times \mathbb{R}^n \to \mathbb{R}\), and it can be defined on the vector space \(\mathbb{R}^n\) for any \(n\). In contrast, the vector product (also known as the cross product), which maps a pair of vectors to another vector in the same vector space (\(\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3\)), can be defined only in the 3-dimensional space, \(\mathbb{R}^3\).

Definition (Vector product). Let \(\mathbf{a}, \mathbf{b}\in\mathbb{R}^3\) be row vectors \(\mathbf{a} = (a_1, a_2, a_3)\) and \(\mathbf{b} = (b_1, b_2, b_3)\). Their vector product (also called the cross product or outer product), denoted \(\mathbf{a}\times\mathbf{b}\), is the vector defined by \[\mathbf{a}\times\mathbf{b} = (a_2b_3 - a_3b_2, a_3b_1 - a_1b_3, a_1b_2 - a_2b_1).\]

Remark. Some authors prefer to denote the vector product by \([\mathbf{a},\mathbf{b}]\) rather than \(\mathbf{a}\times \mathbf{b}\). □

Let \(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\) be the standard basis vectors of \(\mathbb{R}^3\) …
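The component formula translates directly into code. A minimal sketch (mine, not the post's) implementing it and checking against numpy.cross:

```python
import numpy as np

def cross(a, b):
    """Vector (cross) product of two 3-dimensional vectors."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return np.array([a2 * b3 - a3 * b2,
                     a3 * b1 - a1 * b3,
                     a1 * b2 - a2 * b1])

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(cross(a, b))                                # [-3.  6. -3.]
assert np.allclose(cross(a, b), np.cross(a, b))
# The result is orthogonal to both factors:
assert np.isclose(cross(a, b) @ a, 0) and np.isclose(cross(a, b) @ b, 0)
```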

Properties of matrix determinants

We study the properties of matrix determinants. We can exploit these properties to (sometimes greatly) simplify the computation of determinants. Before studying these properties, we introduce the important notion of linearity.

Definition (Linear function). Let \(V\) and \(W\) be vector spaces, both over the field \(K\). A function \(f: V \to W\) is said to be linear if the following hold: for all \(\mathbf{u}, \mathbf{v} \in V\), \(f(\mathbf{u} + \mathbf{v}) = f(\mathbf{u}) + f(\mathbf{v})\); for all \(\mathbf{u}\in V\) and \(\lambda \in K\), \(f(\lambda\mathbf{u}) = \lambda f(\mathbf{u})\).

Remark. These two conditions can be summarized as \[f(\lambda\mathbf{u} + \mu\mathbf{v}) = \lambda f(\mathbf{u}) + \mu f(\mathbf{v})\] for all \(\mathbf{u}, \mathbf{v}\in V\) and \(\lambda, \mu\in K\). □

Matrix determinants are linear functions of row (or column) vectors (in fact, they are called multilinear).

Lemma (Determinants are linear). …
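Multilinearity can be verified numerically: fix all rows but one and check that the determinant is linear in the remaining row. A sketch I added (random test matrices, not from the post):

```python
import numpy as np

# det is linear in each row separately: with rows r2, r3 fixed,
# det([l*u + m*v; r2; r3]) = l*det([u; r2; r3]) + m*det([v; r2; r3]).
rng = np.random.default_rng(0)
u, v, r2, r3 = rng.random((4, 3))
lam, mu = 2.0, -3.0

lhs = np.linalg.det(np.vstack([lam * u + mu * v, r2, r3]))
rhs = (lam * np.linalg.det(np.vstack([u, r2, r3]))
       + mu * np.linalg.det(np.vstack([v, r2, r3])))
assert np.isclose(lhs, rhs)
print(lhs, rhs)
```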

Matrix determinants in general

The determinant of a general \(n\times n\) matrix is quite complicated. We give a mechanical, recursive definition first and think about its meaning later.

Definition (Determinant). Let \(A = (a_{ij}) \in M_n\). The determinant \(|A|\) of \(A\) is defined recursively in the following manner. If \(n = 1\), then \(|A| = a_{11}\). If \(n > 1\), let \(A_{ij}\) denote the \((n-1)\times(n-1)\) matrix obtained by removing the \(i\)-th row and the \(j\)-th column from \(A\), and define \[|A| = \sum_{j=1}^{n}(-1)^{i+j}a_{ij}|A_{ij}|\] where \(i\) is an arbitrary row index from 1 to \(n\). (Instead of a row, you may expand along an arbitrary column to obtain the same result.)

According to this definition, if we want to compute \(|A|\) for an \(n\times n\) matrix, we need to compute the determinants \(|A_{ij}|\) of (many) \((n-1)\times(n-1)\) matrices, which requires computing determinants of \((n-2)\times(n-2)\) matrices, and so on, until we reach the determinants of \(1\times 1\) matrices, which are trivial. …
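The recursive definition can be implemented directly, expanding along the first row (\(i = 1\)). A sketch of mine; note that this naive recursion performs on the order of \(n!\) operations, which is exactly the blow-up the definition describes:

```python
def det(A):
    """Determinant by cofactor expansion along the first row (i = 1)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # A_{1j}: remove row 1 and column j+1 (0-indexed: row 0, column j).
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                    # -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # 24
```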

Linear equations and matrices

We see that simultaneous linear equations can be expressed concisely using a matrix. Furthermore, we can find a general way to solve such equations based on matrix algebra. Here we mostly deal with \(2\times 2\) matrices, but the same principle applies to matrices of any size.

Definition (Kronecker's delta). Kronecker's delta is the symbol \(\delta_{ij}\) such that \[\delta_{ij} = \left\{ \begin{array}{cl} 1 & \text{if $i = j$},\\ 0 & \text{otherwise}. \end{array}\right.\]

Definition (Identity matrix). The \(n\times n\) identity matrix is \(I_n = (\delta_{ij})\). An identity matrix has diagonal elements that are all 1 and off-diagonal elements that are all 0.

Example. \[\begin{eqnarray} I_1 &=& (1),\\ I_2 &=& \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix},\\ I_3 &=& \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}, \end{eqnarray}\] and so on. □

Theorem. Let \(X = (x_{ij})\) be an \(n\times n\) matrix. Then \(I_nX = XI_n = X\). …
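The definition of \(I_n\) via Kronecker's delta translates directly into code, and the identity property can be spot-checked. A sketch I added (the helper names are mine):

```python
import numpy as np

def delta(i, j):
    """Kronecker's delta."""
    return 1 if i == j else 0

def identity(n):
    """I_n = (delta_ij), built directly from the definition."""
    return np.array([[delta(i, j) for j in range(n)] for i in range(n)])

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
I3 = identity(3)
assert (I3 @ X == X).all() and (X @ I3 == X).all()   # I_n X = X I_n = X
print(I3)
```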

Laws of matrix algebra

We can add and multiply matrices. These operations satisfy some, but not all, of the laws of ordinary algebra, and the laws they do satisfy characterize matrix algebra. We also introduce an algebraic structure called a ring and show that the set of square matrices is a ring. In what follows, we assume that the matrices \(A, B\), and \(C\) have shapes compatible with matrix addition and multiplication. You should be able to show at least the first two (the distributive laws) of the following: \[\begin{eqnarray} A(B + C) &=& (AB) + (AC),\\ (A + B)C &=& (AC) + (BC),\\ (AB)C &=& A(BC). \end{eqnarray}\] Let us show the last one.

Theorem (Associativity of matrix multiplication). Let \(A = (a_{ij})\), \(B = (b_{ij})\), and \(C = (c_{ij})\) be matrices. Suppose that \(A\) has \(r\) columns and \(B\) has \(r\) rows, and that \(B\) has \(s\) columns and \(C\) has \(s\) rows. Then \(A(BC) = (AB)C\). In other words, matrix multiplication is associative.

Proof. Let \(X = (x_{ij}) = A(BC)\) and \(Y = (y_{ij}) = (AB)C\). …
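Associativity can be checked numerically on matrices with compatible shapes; here I use 2×3, 3×4, and 4×2, matching the \(r\) and \(s\) of the theorem. A sketch of my own, not the post's proof:

```python
import numpy as np

# A has r = 3 columns, B has r = 3 rows and s = 4 columns, C has s = 4 rows.
rng = np.random.default_rng(1)
A = rng.random((2, 3))
B = rng.random((3, 4))
C = rng.random((4, 2))

assert np.allclose(A @ (B @ C), (A @ B) @ C)
print("A(BC) == (AB)C verified for these shapes.")
```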

Matrix operations

We introduce various operations on matrices: a matrix can be multiplied by a scalar, two matrices can be added or multiplied, and a matrix can be transposed. Writing these operations out explicitly can be tedious, so we introduce a systematic notation for representing matrices and their elements. We want to denote each entry of a matrix systematically. Suppose we have a \(2\times 3\) matrix \[M = \begin{pmatrix} 1 & 1 & 2\\ -1 & 7 & 2 \end{pmatrix}.\] This may be written as \[M = \begin{pmatrix} m_{11} & m_{12} & m_{13}\\ m_{21} & m_{22} & m_{23}\end{pmatrix}\] where \(m_{11} = 1\), \(m_{21} = -1\), etc. But if we have a \(100\times 100\) matrix, this way of writing is very tedious, to say the least. In mathematics, it is crucial to be "appropriately lazy." So here's what we usually do: write \[M = (m_{ij})\] where it is understood that the indices \(i\) and \(j\) run through appropriate ranges (in this particular case, \(i\) from 1 to 2 and \(j\) from 1 to 3). …
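The notation \(M = (m_{ij})\) has a direct computational analogue: specify the entry as a rule in the indices and let the machine enumerate them. A sketch (my illustration, with a made-up rule \(m_{ij} = i + j\)):

```python
import numpy as np

# Build M = (m_ij) from a rule for the (i, j) entry, using 1-based indices
# as in the mathematical notation; here a 3x4 matrix with m_ij = i + j.
def m(i, j):
    return i + j

M = np.array([[m(i, j) for j in range(1, 5)] for i in range(1, 4)])
print(M)
```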