Posts

Showing posts from September, 2022

Method of Lagrange multipliers

We have studied how to identify extreme values of two-variable functions \(f(x,y)\). In practice, we may have additional constraints. For example: find the extreme values of \(f(x,y)\) subject to the constraint \(g(x,y) = 0\). We can use the method of Lagrange multipliers to solve this type of problem. Let's consider the following example. Problem. If \((x,y)\) moves on the unit circle \(x^2 + y^2 = 1\), find the extreme values of \(f(x,y) = x^2 + xy + y^2\). □ This problem can be restated as follows: Problem (restated). Let \(g(x,y) = x^2 + y^2 - 1\). Find the extreme values of \(f(x,y) = x^2 + xy + y^2\) subject to the constraint \(g(x,y) = 0\). □ Let's solve this problem using an "explicit" method. Solution 1 (Explicit method). Consider the implicit function \(y = \sqrt{1 - x^2}\) of \(g(x,y) = 0\) on the open interval \((-1, 1)\) and substitute it into \(f(x, y)\) to obtain \[h(x) = f(x,y(x)) = x^2 + x\sqrt{1-x^2} + (\sqrt{1-x^2})^2 = x\sqrt{1-x^2}+1.\] Since \[h'
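As a quick check of this example, here is a small sympy sketch (my addition, not part of the post) that solves the Lagrange system \(\nabla f = \lambda \nabla g\) together with \(g = 0\) for this particular \(f\) and \(g\); the variable names are mine.

```python
# Sketch (my addition): solve the Lagrange system for
# f(x, y) = x^2 + x*y + y^2 subject to g(x, y) = x^2 + y^2 - 1 = 0.
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = x**2 + x*y + y**2
g = x**2 + y**2 - 1

# Stationarity: grad f = lambda * grad g, plus the constraint g = 0.
eqs = [sp.diff(f, x) - lam*sp.diff(g, x),
       sp.diff(f, y) - lam*sp.diff(g, y),
       g]
for sol in sp.solve(eqs, [x, y, lam], dict=True):
    print(sol, '-> f =', sp.simplify(f.subs(sol)))
# The candidates give the extreme values f = 3/2 and f = 1/2.
```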

Implicit functions

Some functions are not explicitly defined in the form \(y = f(x)\), but are determined by some relation between the variables \(x\) and \(y\). Such functions are called implicit functions. What are implicit functions? Consider the graph of a univariate function \(y = f(x)\). It is a set of points defined as \[\{(x,y) \in \mathbb{R}^2 \mid y = f(x)\}.\] Next, consider the equation \(F(x,y) = 0\) where \(F\) is some bivariate function of \(x\) and \(y\). Can the set \[\{(x,y) \in \mathbb{R}^2 \mid F(x,y) = 0\}\] represent the graph of some function? The short answer is "No." For example, consider the unit circle defined by \[F(x,y) = x^2 + y^2 - 1 = 0.\] For each \(x = a\) with \(-1 < a < 1\), we have the two values \[y = \sqrt{1 - a^2}\] or \[y = -\sqrt{1 - a^2}.\] Therefore, \(F(x,y) = 0\), or the set \[\{(x,y) \in\mathbb{R}^2 \mid F(x,y) = 0\},\] does not define a function of \(x\). However, we can define a function based on a subset of the above set. For instance, the sub
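The two branches can be made concrete with a short sympy sketch (my addition): solving \(F(x,y) = 0\) for \(y\) returns both \(\pm\sqrt{1-x^2}\), confirming that the full solution set is not the graph of a single function.

```python
# Sketch (my addition): on the unit circle F(x, y) = x^2 + y^2 - 1 = 0,
# each x in (-1, 1) corresponds to two y-values.
import sympy as sp

x, y = sp.symbols('x y', real=True)
F = x**2 + y**2 - 1

branches = sp.solve(F, y)   # [-sqrt(1 - x**2), sqrt(1 - x**2)]
print(branches)
print([b.subs(x, sp.Rational(1, 2)) for b in branches])  # two values at x = 1/2
```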

Extreme values of multivariate functions

A local maximum (or minimum) value of a function is the maximum (or minimum) value of the function in a neighborhood of a point. More formally, Definition (Local maximum, local minimum) Let \(f(x,y)\) be a function on an open region \(U\subset \mathbb{R}^2\) and \(P=(a,b) \in U\). \(f(a,b)\) is said to be a local maximum value of the function \(f(x,y)\) if there exists \(\delta > 0\) such that, for all \((x,y)\in N_{\delta}(P)\cap U\) with \((x,y) \neq (a,b)\), we have \(f(x,y) < f(a,b)\). \(f(a,b)\) is said to be a local minimum value of the function \(f(x,y)\) if there exists \(\delta > 0\) such that, for all \((x,y)\in N_{\delta}(P)\cap U\) with \((x,y) \neq (a,b)\), we have \(f(x,y) > f(a,b)\). Local maximum and minimum values are collectively called extreme values. Theorem (Necessary condition for extreme values) Let \(f(x,y)\) be a function on an open region \(U\), and \((a,b)\in U\). Suppose that \(f_x(a,b)\) and \(f_y(a,b)\) exist. If \
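To illustrate the necessary condition, here is a brief sympy sketch with a hypothetical example function \(f(x,y) = x^2 - 2x + y^2\) (my choice, not from the post): setting \(f_x = f_y = 0\) locates the candidate points for extreme values.

```python
# Sketch (my addition, hypothetical example): candidates for extreme values
# of f(x, y) = x^2 - 2x + y^2 via the necessary condition f_x = f_y = 0.
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 - 2*x + y**2

critical = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(critical)  # [{x: 1, y: 0}] -- the only candidate; here it is a local minimum
```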

Taylor's theorem for multivariate functions

Taylor's theorem As we have seen in a previous post, the essence of differentiation is a linear approximation. See also: Partial and total differentiation of multivariate functions Given a totally differentiable function \(z = f(x,y)\), we have, in a neighborhood of \(P = (a,b)\), \[f(x,y) = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b) + o(\|X - P\|)\] where \(X = (x,y)\). By discarding the higher-order term \(o(\|X - P\|)\), we obtain the equation of the tangent plane at \(P\): \[z = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b)\] which gives the linear approximation of \(z = f(x,y)\) around the point \(P\). But we may obtain better approximations by looking into the details of \(o(\|X - P\|)\), that is, by incorporating higher-order derivatives. Just as in the univariate case, we have a multivariate version of Taylor's theorem, which can be stated concisely and proved easily by using differential operators. See also: Differential operators Theorem (Taylor's theorem) Let \(f(x,y)
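As a rough illustration (my addition, with a hypothetical \(f(x,y) = e^x \sin y\)), the following sympy sketch assembles the second-order Taylor polynomial at the origin directly from the partial derivatives, mirroring the operator form \(\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^n\).

```python
# Sketch (my addition): second-order Taylor polynomial of a hypothetical
# f(x, y) = exp(x)*sin(y) around P = (0, 0), built term by term.
import sympy as sp

x, y, h, k = sp.symbols('x y h k', real=True)
f = sp.exp(x) * sp.sin(y)
P = {x: 0, y: 0}

# Sum of (h*d/dx + k*d/dy)^n f / n! evaluated at P, for n = 0, 1, 2.
taylor2 = f.subs(P)
for n in (1, 2):
    for i in range(n + 1):
        deriv = f.diff(*([x] * i + [y] * (n - i)))  # d^n f / dx^i dy^(n-i)
        taylor2 += sp.binomial(n, i) * deriv.subs(P) * h**i * k**(n - i) / sp.factorial(n)

print(sp.expand(taylor2))  # k + h*k for this f; the remainder is o(h^2 + k^2)
```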

Differential operators

Given the function \(f(x,y)\), we have considered its partial derivatives, such as \[\begin{eqnarray*} f_x(x,y) &=&\frac{\partial}{\partial x}f(x,y),\\ f_y(x,y) &=&\frac{\partial}{\partial y}f(x,y). \end{eqnarray*}\] We may interpret partial differentiation in the following manner: The derivative \(f_x(x,y)\) is obtained by applying \(\frac{\partial}{\partial x}\) to the function \(f(x,y)\) from the left. \(\frac{\partial}{\partial x}\) is neither a number nor a function. It's something different that we call a (partial) differential operator. The same argument applies to \(\frac{\partial}{\partial y}\). This interpretation of differential operators turns out to be useful in many situations. For example, for any constants \(a, b\in \mathbb{R}\), we may consider the following operator \(D\): \[D = a\frac{\partial}{\partial x} + b\frac{\partial}{\partial y}.\] If we apply this operator to the function \(f(x,y)\) from the left, we obtain \(af_x(x,y) + bf_y(x,
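A small sympy sketch (my addition) of this operator view: applying \(D\) once and twice to a generic \(f(x,y)\) reproduces \(af_x + bf_y\) and the expected second-order expansion.

```python
# Sketch (my addition): treating D = a*d/dx + b*d/dy as an operator in sympy.
import sympy as sp

x, y, a, b = sp.symbols('x y a b', real=True)
f = sp.Function('f')(x, y)

D = lambda expr: a * sp.diff(expr, x) + b * sp.diff(expr, y)

print(D(f))                # a*f_x + b*f_y, as in the text
print(sp.expand(D(D(f))))  # a^2*f_xx + 2*a*b*f_xy + b^2*f_yy
```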

Higher-order partial differentiation

Given the function \(z = f(x,y)\), suppose that the partial derivatives \(f_x(x,y)\) and \(f_y(x,y)\) exist, and they also have partial derivatives. For example, the partial derivative of \(f_x(x,y)\) with respect to \(y\), \[(f_x)_y(x,y) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)(x,y) = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)(x,y),\] is denoted as \(\frac{\partial^2f}{\partial y\partial x}(x,y)\) or \(f_{xy}(x,y)\). Similarly, we may define \(f_{xx}(x,y)\), \(f_{yx}(x,y)\), \(f_{yy}(x,y)\). These are called second partial derivatives. Example. Let \(f(x,y) = \log(x^2 + xy + 2y^2)\). Then, \[\begin{eqnarray*} f_x(x,y) &=& \frac{2x + y}{x^2 + xy + 2y^2},\\ f_y(x,y) &=& \frac{x + 4y}{x^2 + xy + 2y^2},\\ f_{xx}(x,y) &=& -\frac{2x^2 + 2xy - 3y^2}{(x^2 + xy + 2y^2)^2},\\ f_{xy}(x,y) &=& -\frac{x^2 + 8xy + 2y^2}{(x^2 + xy + 2y^2)^2},\\ f_{yx}(x,y) &=& -\frac{x^2 + 8xy +
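The derivatives in this example can be verified mechanically; the following sympy sketch (my addition) recomputes all six and, in particular, confirms that \(f_{xy} = f_{yx}\) here.

```python
# Check (my addition): recompute the example's partial derivatives.
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = sp.log(x**2 + x*y + 2*y**2)

derivs = {
    'f_x':  sp.diff(f, x),    'f_y':  sp.diff(f, y),
    'f_xx': sp.diff(f, x, x), 'f_xy': sp.diff(f, x, y),
    'f_yx': sp.diff(f, y, x), 'f_yy': sp.diff(f, y, y),
}
for name, d in derivs.items():
    print(name, '=', sp.simplify(d))

# The mixed partials coincide, as the post's formulas indicate:
print(sp.simplify(derivs['f_xy'] - derivs['f_yx']))  # 0
```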

The Implicit Function Theorem: A Proof

We prove the Implicit Function Theorem for the bivariate case. Theorem (Implicit function theorem) Let \(F(x,y)\) be a function of class \(C^1\) on an open region \(U \subset \mathbb{R}^2\). Suppose the point \(P= (a,b)\) satisfies the following conditions: \(F(a,b) = 0\) (i.e., \(P\) is a point on the curve \(F(x,y) = 0\)), and \(F_y(a,b) \neq 0\). Then, there exist an open interval \(I\) on the \(x\)-axis with \(a \in I\) and a univariate function \(y = \varphi(x)\) on \(I\) such that \(F(x,\varphi(x)) = 0\) for all \(x \in I\) (that is, \(\varphi(x)\) is an implicit function of \(F(x,y) = 0\)) and \(b = \varphi(a)\). Furthermore, the function \(\varphi(x)\) is differentiable on \(I\) and \[\varphi'(x) = -\frac{F_x(x,\varphi(x))}{F_y(x,\varphi(x))}.\tag{Eq:IF}\] Proof. Since \(F_y(a,b) \neq 0\), either \(F_y(a,b) > 0\) or \(F_y(a,b) < 0\). In the following, we assume \(F_y(a,b) > 0\) (the other case is similar). We prove the theorem in three steps. Step 1. C
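Before the proof, Eq. (Eq:IF) can be sanity-checked on the unit-circle example from the earlier posts; the sketch below (my addition) uses the branch \(\varphi(x) = \sqrt{1-x^2}\) and verifies \(\varphi'(x) = -F_x/F_y\) symbolically.

```python
# Sketch (my addition): checking Eq:IF on F(x, y) = x^2 + y^2 - 1 with the
# upper branch phi(x) = sqrt(1 - x^2) on (-1, 1).
import sympy as sp

x, y = sp.symbols('x y', real=True)
F = x**2 + y**2 - 1
phi = sp.sqrt(1 - x**2)

lhs = sp.diff(phi, x)
rhs = -(sp.diff(F, x) / sp.diff(F, y)).subs(y, phi)
print(sp.simplify(lhs - rhs))  # 0, so phi'(x) = -F_x/F_y on (-1, 1)
```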

Multivariate chain rules

Next, we consider differentiating composite functions such as \(f(\varphi(t), \psi(t))\) or \(f(\varphi(u, v), \psi(u, v))\). In the case of univariate functions, we have the chain rule. That is, \[\frac{dg(f(x))}{dx} = g'(f(x))\cdot f'(x).\] We have corresponding multivariate versions. Theorem [Chain rule (1)] Let \(f(x,y)\) be a totally differentiable function on an open region \(U (\subset \mathbb{R}^2)\). Let \(x = \varphi(t), y = \psi(t)\) be differentiable functions on an open interval \(I\). Suppose that \((\varphi(t), \psi(t))\in U\) for all \(t \in I\). Then the function \(z = f(\varphi(t), \psi(t))\) of \(t\) on \(I\) is differentiable on \(I\), and its derivative is given by \[\frac{d}{dt}f(\varphi(t),\psi(t)) = \frac{\partial f}{\partial x}(\varphi(t),\psi(t))\frac{d\varphi}{dt}(t) + \frac{\partial f}{\partial y}(\varphi(t),\psi(t))\frac{d\psi}{dt}(t).\tag{Eq:ChainRule1}\] Remark. Eq. (Eq:ChainRule1) may be a little too cluttered and difficult to read. We could
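Eq. (Eq:ChainRule1) can be checked symbolically; the sketch below (my addition, with hypothetical choices \(f(x,y) = xy^2\), \(\varphi(t) = \cos t\), \(\psi(t) = \sin t\)) compares direct differentiation with the chain-rule formula.

```python
# Sketch (my addition, hypothetical example): verify Eq:ChainRule1 for
# f(x, y) = x*y^2 with x = cos(t), y = sin(t).
import sympy as sp

t, x, y = sp.symbols('t x y', real=True)
f = x*y**2
phi, psi = sp.cos(t), sp.sin(t)

direct = sp.diff(f.subs({x: phi, y: psi}), t)
chain = (sp.diff(f, x).subs({x: phi, y: psi}) * sp.diff(phi, t)
         + sp.diff(f, y).subs({x: phi, y: psi}) * sp.diff(psi, t))
print(sp.simplify(direct - chain))  # 0, as the theorem asserts
```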

Contraction mapping principle

The contraction mapping principle, also known as Banach's fixed-point theorem, is a very interesting and useful theorem. It says: Given a "contraction" map \(F: S \to S\) where \(S\) is a closed set, we have a unique solution \(x \in S\) to the equation \[x = F(x).\] Let's prove it. But before that, we need some preparation. Definition (Norm) Let \(X\) be a vector space over the field \(K\). The function \(\| \cdot \|: X \to \mathbb{R}\) is said to be a norm if it satisfies the following axioms: For all \(u \in X\), \(\|u \| \geq 0\); in particular, \[\|u \| = 0 \iff u = 0.\] For all \(\alpha \in K\) and \(u \in X\), \[\|\alpha u \| = |\alpha|\|u \|.\] (Triangle inequality) For all \(u, v\in X\), \[\|u + v\| \leq \|u\| + \|v\|.\] Example. For \(x = (x_1, x_2, \cdots, x_n) \in \mathbb{R}^n\), \[\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}\] is a norm. This norm is called the Euclidean norm or \(L^2\) norm. □ Example. For \(x = (x_1, x_2, \cdo
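As a computational aside (my addition), the fixed point promised by the theorem can be found by simply iterating the map; the sketch below uses the classic contraction \(F(x) = \cos x\) on the closed set \([0, 1]\).

```python
# Sketch (my addition): fixed-point iteration for the contraction
# F(x) = cos(x) on [0, 1]; the iterates converge to the unique fixed point.
import math

def fixed_point(F, x0, tol=1e-12, max_iter=1000):
    x = x0
    for _ in range(max_iter):
        x_next = F(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(fixed_point(math.cos, 0.5))  # ~0.7390851332, the solution of x = cos(x)
```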

Total differentiability implies continuity

Recall that the function \(f(x,y)\) is totally differentiable at \((a,b)\) if \[f(x,y) = f(a,b) + m(x-a) + n(y-b) + o(\|X-P\|) \text{ as $X = (x,y)\to P = (a,b)$}\] for some constants \(m\) and \(n\). See also: Partial and total differentiation of multivariate functions. Just like univariate functions, we have the following result: Theorem (Total differentiability implies continuity) If the function \(f(x,y)\) is totally differentiable at \((a,b)\), then it is continuous at \((a,b)\). Proof. Let us use the notation \(X=(x,y)\) and \(P= (a,b)\). By the definition of total differentiability, we have \[f(x,y) = f(a,b) + m(x - a) + n(y - b) + o(\|X - P\|)\] for some constants \(m, n\). Since \(m(x-a)\), \(n(y-b)\), and \(o(\|X-P\|)\) all tend to \(0\) as \(X \to P\), \[\lim_{(x,y) \to (a,b)}f(x,y) = f(a,b).\] Hence, \(f(x,y)\) is continuous at \((a,b)\). ■ Remark. Just as for univariate functions, the converse of this theorem does not hold. For example, you should verify that \(f(x,y) = |x-y|\) is continuous everywhere in \(\mathbb{R}^2\), but n
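The remark's counterexample can be probed numerically; in the sketch below (my addition), the one-sided difference quotients of \(f(x,y) = |x-y|\) along the \(x\)-axis at the origin disagree, so \(f_x(0,0)\) does not exist and \(f\) cannot be totally differentiable there, despite being continuous.

```python
# Sketch (my addition): f(x, y) = |x - y| is continuous, but the one-sided
# difference quotients along the x-axis at (0, 0) disagree.
f = lambda x, y: abs(x - y)

h = 1e-6
right = (f(h, 0) - f(0, 0)) / h      # -> 1.0
left = (f(-h, 0) - f(0, 0)) / (-h)   # -> -1.0
print(right, left)  # f_x(0, 0) does not exist, so f is not totally differentiable
```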

Partial and total differentiation of multivariate functions

A multivariate function may be differentiated with respect to each variable, which is called partial differentiation. By combining all the partial derivatives, we define total differentiation. The essence of (total) differentiation is a linear approximation. In the case of a univariate function, we approximate the function \(y = f(x)\) in a neighborhood of a point, say \(x = a\), by the tangent line \(y = f'(a)(x - a) + f(a)\). In the case of a multivariate function, we approximate the function \(y = f(x_1, x_2, \cdots, x_n)\) in a neighborhood of a point, say \(a = (a_1, a_2, \cdots, a_n)\), by the tangent hyperplane at the point \(a\). Partial differentiation Let \(f(x,y)\) be a function on an open region \(U\subset \mathbb{R}^2\) and \((a,b) \in U\). If we fix \(y = b\) in \(f(x,y)\), we have a univariate function \(g(x) = f(x,b)\). Since \(U\) is open, there exists \(\delta > 0\) such that \(N_{\delta}(a,b) \subset U\). Therefore \(g(x)\) is defined on the open interval
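This slicing view is easy to make concrete; the sympy sketch below (my addition, with a hypothetical \(f(x,y) = x^2 y + y^3\)) shows that differentiating the slice \(g(x) = f(x,b)\) agrees with substituting \(y = b\) into \(f_x\).

```python
# Sketch (my addition): a partial derivative is the ordinary derivative of a
# one-variable slice. Hypothetical example f(x, y) = x^2*y + y^3, sliced at y = b.
import sympy as sp

x, y, b = sp.symbols('x y b', real=True)
f = x**2*y + y**3

g = f.subs(y, b)                 # g(x) = f(x, b)
print(sp.diff(g, x))             # 2*b*x
print(sp.diff(f, x).subs(y, b))  # same: f_x(x, b) = 2*b*x
```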

Simple epidemic models

Epidemiology is the study of the spread of diseases in a population. Many students seem interested in this kind of problem due to the recent Covid-19 pandemic. Here, we develop a few simple models of an epidemic. First, some terminology. Susceptibles: those who might succumb to the disease. Infectives: those who have the disease and can spread it among the susceptibles. Immunes: those who are immune to the disease. This category may include the dead and the isolated. Latent period: the period between infection and the onset of the disease. Infectious period: the period during which an individual remains infective. In this post, we only deal with simple epidemic models in which there are only susceptibles and infectives, and individuals do not die. Continuous-time epidemic model (with no recovery) Let us derive a simple continuous-time epidemic model based on the following assumptions. The population size is fixed at \(n_0 + 1\). No birth, no death. The population contains only susceptibles and
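The excerpt breaks off before the equations, so the following is a loudly labeled assumption rather than the post's derivation: if infections occur at rate \(\beta\) per susceptible-infective contact, nobody recovers, and \(S = n_0 + 1 - I\), the model reduces to the logistic equation \(\frac{dI}{dt} = \beta I (n_0 + 1 - I)\). The sketch below integrates that assumed equation with forward Euler; all parameter values are illustrative.

```python
# Sketch (my assumption, since the excerpt is truncated): no-recovery epidemic
# with S = n0 + 1 - I and dI/dt = beta * I * (n0 + 1 - I), integrated by Euler.
def simple_epidemic(n0=100, beta=0.01, I0=1.0, dt=0.01, t_max=20.0):
    I, t, out = I0, 0.0, []
    while t <= t_max:
        out.append((t, I))
        I += dt * beta * I * (n0 + 1 - I)  # new infections in one time step
        t += dt
    return out

trajectory = simple_epidemic()
print(trajectory[-1])  # I(t) approaches the whole population n0 + 1 = 101
```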