Differentiation of more general maps $\mathbb{R}^n \to \mathbb{R}^m$

We now consider the differentiation of more general maps $\mathbb{R}^n \to \mathbb{R}^m$. Many results carry over from the case of bivariate functions, $\mathbb{R}^2 \to \mathbb{R}$.



Multivariate functions $\mathbb{R}^n \to \mathbb{R}$

First, we consider the general multivariate function $y = f(x): \mathbb{R}^n \to \mathbb{R}$ where $x = (x_1, x_2, \cdots, x_n) \in \mathbb{R}^n$.

Definition (Total differentiability)

Let $U$ be an open region in $\mathbb{R}^n$. A function $f(x)$ on $U$ is said to be (totally) differentiable at $a = (a_1, a_2, \cdots, a_n) \in U$ if there exist constants $m_1, m_2, \cdots, m_n$ such that

$$f(x) = f(a) + m_1(x_1 - a_1) + m_2(x_2 - a_2) + \cdots + m_n(x_n - a_n) + o(\|x - a\|)$$

as $x \to a$. $f(x)$ is said to be totally differentiable on $U$ if $f(x)$ is totally differentiable at all points in $U$.

When $n = 2$, this definition matches the definition of total differentiability of two-variable functions.

See also: Partial and total differentiation of multivariate functions

When $n = 1$, this definition matches the definition of differentiability of univariate (one-variable) functions. As in the cases $n = 1$ and $n = 2$, if an $n$-variable function $f(x) = f(x_1, x_2, \cdots, x_n)$ is totally differentiable at $a = (a_1, a_2, \cdots, a_n)$, then

$$m_i = f_{x_i}(a), \quad i = 1, 2, \cdots, n.$$
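The definition can be probed numerically. The following is a minimal sketch, assuming a hypothetical example function $f(x_1, x_2, x_3) = x_1 x_2 + \sin x_3$ (not from the text) whose partial derivatives are computed by hand. It evaluates the remainder divided by $\|x - a\|$ along one direction; for a totally differentiable function this ratio should shrink as $x \to a$.

```python
import math

def f(x):
    # Hypothetical example function: f(x1, x2, x3) = x1*x2 + sin(x3)
    return x[0] * x[1] + math.sin(x[2])

def grad_f(x):
    # Partial derivatives of f, computed by hand; these play the role of m_1, ..., m_n
    return [x[1], x[0], math.cos(x[2])]

def remainder_ratio(a, direction, t):
    # |f(x) - f(a) - sum_i m_i (x_i - a_i)| / ||x - a||  for x = a + t * direction
    x = [ai + t * di for ai, di in zip(a, direction)]
    m = grad_f(a)
    linear = sum(mi * (xi - ai) for mi, xi, ai in zip(m, x, a))
    norm = math.sqrt(sum((xi - ai) ** 2 for xi, ai in zip(x, a)))
    return abs(f(x) - f(a) - linear) / norm

a = [1.0, 2.0, 0.5]
direction = [1.0, -1.0, 2.0]
# The ratio should decrease as t decreases, consistent with an o(||x - a||) remainder
ratios = [remainder_ratio(a, direction, t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

A single direction is only a sanity check: the definition requires the ratio to vanish however $x$ approaches $a$.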

Theorem 1 (Total differentiability implies continuity)

If the function $f(x) = f(x_1, x_2, \cdots, x_n)$ is totally differentiable at $a = (a_1, a_2, \cdots, a_n)$, then it is continuous at $x = a$.

Proof. "Trivial." (Similar to the case of $n = 2$.) ■

See also: Total differentiability implies continuity.

Theorem 2 (Criterion of total differentiability)

Let $U$ be an open region in $\mathbb{R}^n$, let $f(x) = f(x_1, x_2, \cdots, x_n)$ be a function on $U$, and let $a = (a_1, a_2, \cdots, a_n) \in U$. If all the partial derivatives $f_{x_i}(x)$ ($i = 1, 2, \cdots, n$) exist and are continuous at $x = a$, then $f(x)$ is totally differentiable at $x = a$.

Proof. "Trivial." (Similar to the case of $n = 2$.) ■

See also: Total differentiability implies continuity

Definition (Functions of class $C^1$)

The function $f(x) = f(x_1, x_2, \cdots, x_n)$ on $U$ is said to be once continuously differentiable or of class $C^1$ if all the partial derivatives $f_{x_i}(x)$ ($i = 1, 2, \cdots, n$) exist and are continuous on $U$.

Remark. By Theorem 2 above, functions of class $C^1$ are totally differentiable, and hence, by Theorem 1, they are continuous. □

As in the case $n = 2$, we may define higher-order partial derivatives such as

$$f_{x_i x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x)$$

where $i, j = 1, 2, \cdots, n$.

See also: Higher-order partial differentiation.

Theorem 3 (Changing the order of partial differentiation)

Suppose that the function $f(x) = f(x_1, x_2, \cdots, x_n)$ on an open region $U$ has second partial derivatives $f_{x_i x_j}(x)$ and $f_{x_j x_i}(x)$ ($i, j = 1, 2, \cdots, n$) which are continuous. Then $f_{x_i x_j}(x) = f_{x_j x_i}(x)$.

Proof. All variables other than $x_i$ and $x_j$ may be regarded as constants in the derivatives $f_{x_i x_j}(x)$ and $f_{x_j x_i}(x)$. The proof is thus reduced to the case $n = 2$. ■

See also: Higher-order partial differentiation
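As a numerical illustration of Theorem 3, here is a minimal sketch assuming a hypothetical smooth function $f(x_1, x_2, x_3) = x_1^2 x_2 + e^{x_2 x_3}$ (not from the text). The helper `partial` is an illustrative central-difference approximation, and the two differentiation orders use different step sizes so that the agreement is not built into the scheme.

```python
import math

def f(x):
    # Hypothetical smooth example: f(x1, x2, x3) = x1**2 * x2 + exp(x2 * x3)
    return x[0] ** 2 * x[1] + math.exp(x[1] * x[2])

def partial(g, i, h):
    # Returns a function approximating the partial derivative of g
    # with respect to the i-th variable (0-based) by a central difference
    def dg(x):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        return (g(xp) - g(xm)) / (2 * h)
    return dg

a = [1.0, 0.5, -0.3]
# Differentiate with respect to x2 first, then x3
f_23 = partial(partial(f, 1, 1e-4), 2, 1e-3)(a)
# Differentiate with respect to x3 first, then x2
f_32 = partial(partial(f, 2, 1e-4), 1, 1e-3)(a)
# Hand-computed value: (1 + x2*x3) * exp(x2*x3)
exact = (1 + a[1] * a[2]) * math.exp(a[1] * a[2])
print(f_23, f_32, exact)
```

The two estimates should agree with each other and with the hand-computed value up to discretization error.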

Definition (Functions of class $C^r$)

Let $f(x) = f(x_1, x_2, \cdots, x_n)$ be a function on an open region $U$ and $r$ be a non-negative integer.

  1. $f(x)$ is said to be $r$-times continuously differentiable or of class $C^r$ if all its partial derivatives up to order $r$ exist and are continuous on $U$.
  2. $f(x)$ is said to be infinitely differentiable or smooth or of class $C^\infty$ if $f(x)$ has partial derivatives of all orders and they are continuous on $U$.

For example, if $f(x)$ is of class $C^0$ on $U$, then $f(x)$ is continuous on $U$. As in the case $n = 2$, the partial derivatives of order up to $r$ of a function of class $C^r$ are determined by the number of differentiations with respect to each variable $x_i$ ($i = 1, 2, \cdots, n$) and are independent of the order of differentiation.

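$C^r$ maps $\mathbb{R}^n \to \mathbb{R}^m$

Now, we consider general maps $\mathbb{R}^n \to \mathbb{R}^m$.

Definition ($C^r$ maps)

Let $U$ be an open region in $\mathbb{R}^n$. Let
$$F(x) = (f_1(x), f_2(x), \cdots, f_m(x)), \quad x = (x_1, x_2, \cdots, x_n),$$
be a map from $U$ to $\mathbb{R}^m$ (i.e., $F: U \to \mathbb{R}^m$). Then, for each $k = 1, 2, \cdots, m$, $f_k(x) = f_k(x_1, x_2, \cdots, x_n)$ is a function on $U$ (i.e., $f_k: U \to \mathbb{R}$). If all $f_k$ are functions of class $C^r$, then the map $F(x)$ is said to be of class $C^r$.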

Composite maps

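Let $U$ and $V$ be open regions in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Consider the maps $F: U \to \mathbb{R}^m$ and $G: V \to \mathbb{R}^l$,
$$\begin{aligned}
F(x) &= (f_1(x), f_2(x), \cdots, f_m(x)), \quad x = (x_1, x_2, \cdots, x_n),\\
G(y) &= (g_1(y), g_2(y), \cdots, g_l(y)), \quad y = (y_1, y_2, \cdots, y_m).
\end{aligned}$$
Suppose that $F(U) \subset V$. Recall that $F(U)$ is the image of $U$ under $F$:
$$F(U) = \{F(x) \mid x \in U\}.$$
Then we can define the composite map $G \circ F: U \to \mathbb{R}^l$ by
$$(G \circ F)(x) = (h_1(x), h_2(x), \cdots, h_l(x))$$
where
$$h_k(x) = g_k(f_1(x), f_2(x), \cdots, f_m(x)), \quad k = 1, 2, \cdots, l.$$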
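The construction of the composite map can be sketched in code. The maps `F` and `G` below are hypothetical examples with $n = 2$, $m = 3$, $l = 2$ (they are not from the text), and `compose` is an illustrative helper name.

```python
import math

def F(x):
    # Hypothetical map F: R^2 -> R^3 with components f_1, f_2, f_3
    x1, x2 = x
    return (x1 + x2, x1 * x2, math.sin(x1))

def G(y):
    # Hypothetical map G: R^3 -> R^2 with components g_1, g_2
    y1, y2, y3 = y
    return (y1 * y2, y2 + y3 ** 2)

def compose(G, F):
    # (G o F)(x) = G(F(x)); this requires F(x) to lie in the domain of G
    return lambda x: G(F(x))

H = compose(G, F)
x = (1.0, 2.0)
# Each component is h_k(x) = g_k(f_1(x), f_2(x), f_3(x))
print(H(x))
```

Here both maps are defined on the whole space, so the condition $F(U) \subset V$ holds automatically.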

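Each of the $m$ components of the map $F(x)$, $f_1(x), f_2(x), \cdots, f_m(x)$, is a function of $n$ independent variables $x_1, x_2, \cdots, x_n$. Thus, $F(x)$ has $mn$ partial derivatives

$$\frac{\partial f_j}{\partial x_i}(x), \quad x = (x_1, x_2, \cdots, x_n); \quad i = 1, \cdots, n; \quad j = 1, \cdots, m.$$

Each of the $l$ components of the map $G(y)$, $g_1(y), g_2(y), \cdots, g_l(y)$, is a function of $m$ independent variables $y_1, y_2, \cdots, y_m$. Thus, $G(y)$ has $lm$ partial derivatives

$$\frac{\partial g_k}{\partial y_j}(y), \quad y = (y_1, y_2, \cdots, y_m); \quad j = 1, \cdots, m; \quad k = 1, \cdots, l.$$

Accordingly, each component $h_k(x)$ of the composite map $(G \circ F)(x)$ is a function of $n$ independent variables $x_1, x_2, \cdots, x_n$. Thus, the composite map has $ln$ partial derivatives

$$\frac{\partial h_k}{\partial x_i}(x), \quad x = (x_1, x_2, \cdots, x_n); \quad i = 1, \cdots, n; \quad k = 1, \cdots, l.$$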

Combining these results, we have the chain rule for general maps:

Theorem (Chain rule)

Let $F$ and $G$ be maps of class $C^1$. Then, their composite $G \circ F$ is also of class $C^1$, and for all $k = 1, 2, \cdots, l$ and $i = 1, 2, \cdots, n$, the following equation holds:

$$\frac{\partial h_k}{\partial x_i}(x) = \sum_{j=1}^{m} \frac{\partial g_k}{\partial y_j}(F(x)) \frac{\partial f_j}{\partial x_i}(x). \tag{Eq:Chain}$$

Proof. For each $k = 1, 2, \cdots, l$, we may consider $h_k(x)$ one at a time, so it suffices to consider the case $l = 1$. When differentiating with respect to $x_i$, the other independent variables may be regarded as constants, so it suffices to consider the case $n = 1$. Thus, the problem is reduced to the case where $z = g(y_1, y_2, \cdots, y_m)$ and $y_j = f_j(x)$ ($x \in \mathbb{R}$) are composed. The case $m = 2$ is the bivariate case, and the case of general $m$ can be proved similarly. (See also: Multivariate chain rules.)

Lastly, by (Eq:Chain), each partial derivative $\frac{\partial h_k}{\partial x_i}(x)$ is continuous, because $F$ is continuous and sums, products, and compositions of continuous functions are continuous. Therefore, $G \circ F$ is of class $C^1$. ■

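Remark. If we write

$$y_j = f_j(x_1, x_2, \cdots, x_n), \quad (j = 1, 2, \cdots, m),$$

and

$$z_k = g_k(y_1, y_2, \cdots, y_m), \quad (k = 1, 2, \cdots, l),$$

then (Eq:Chain) can be written as

$$\begin{aligned}
\frac{\partial z_k}{\partial x_i} &= \sum_{j=1}^{m} \frac{\partial z_k}{\partial y_j} \frac{\partial y_j}{\partial x_i}\\
&= \frac{\partial z_k}{\partial y_1} \frac{\partial y_1}{\partial x_i} + \frac{\partial z_k}{\partial y_2} \frac{\partial y_2}{\partial x_i} + \cdots + \frac{\partial z_k}{\partial y_m} \frac{\partial y_m}{\partial x_i}.
\end{aligned} \tag{Eq:Chain2}$$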
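The chain rule can be checked entry by entry. The following is a minimal sketch, assuming hypothetical maps $F: \mathbb{R}^2 \to \mathbb{R}^3$ and $G: \mathbb{R}^3 \to \mathbb{R}^2$ (not from the text) whose partial derivatives are computed by hand. It compares the sum over $j$ in (Eq:Chain) with a central-difference estimate of $\partial h_k / \partial x_i$.

```python
import math

def F(x):
    # Hypothetical map F: R^2 -> R^3
    x1, x2 = x
    return [x1 + x2, x1 * x2, math.sin(x1)]

def G(y):
    # Hypothetical map G: R^3 -> R^2
    y1, y2, y3 = y
    return [y1 * y2, y2 + y3 ** 2]

def dF(x):
    # Hand-computed partials: dF(x)[j][i] = d f_j / d x_i
    x1, x2 = x
    return [[1.0, 1.0], [x2, x1], [math.cos(x1), 0.0]]

def dG(y):
    # Hand-computed partials: dG(y)[k][j] = d g_k / d y_j
    y1, y2, y3 = y
    return [[y2, y1, 0.0], [0.0, 1.0, 2.0 * y3]]

def chain_rule(x, k, i):
    # Right-hand side of (Eq:Chain): sum over j of (d g_k / d y_j)(F(x)) * (d f_j / d x_i)(x)
    y = F(x)
    return sum(dG(y)[k][j] * dF(x)[j][i] for j in range(3))

def numeric(x, k, i, h=1e-6):
    # Central-difference estimate of d h_k / d x_i for h = G o F
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (G(F(xp))[k] - G(F(xm))[k]) / (2 * h)

x = [1.0, 2.0]
max_error = max(abs(chain_rule(x, k, i) - numeric(x, k, i))
                for k in range(2) for i in range(2))
print(max_error)
```

The printed discrepancy should be at the level of finite-difference error, not exactly zero.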

Definition (Jacobian)

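Let $U \subset \mathbb{R}^n$ be an open region. For the map $F(x) = (f_1(x), f_2(x), \cdots, f_m(x)): U \to \mathbb{R}^m$, we can define a matrix whose $(i, j)$-element is $\frac{\partial f_i}{\partial x_j}(a)$ where $a = (a_1, a_2, \cdots, a_n) \in U$:

$$J_F(a) = \left(\frac{\partial f_i}{\partial x_j}(a)\right) = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(a) & \frac{\partial f_1}{\partial x_2}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a)\\
\frac{\partial f_2}{\partial x_1}(a) & \frac{\partial f_2}{\partial x_2}(a) & \cdots & \frac{\partial f_2}{\partial x_n}(a)\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial f_m}{\partial x_1}(a) & \frac{\partial f_m}{\partial x_2}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a)
\end{pmatrix}.$$

This matrix is called the Jacobian matrix, or simply the Jacobian, of the map $F(x)$ at $x = a$.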

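Consider the case $m = 1$. For the function $f(x) = f(x_1, x_2, \cdots, x_n)$, the Jacobian is a row vector

$$J_f(a) = (f_{x_1}(a), f_{x_2}(a), \cdots, f_{x_n}(a)).$$

This vector defines a linear function on the $n$-dimensional vector space, $\mathbb{R}^n \to \mathbb{R}$,

$$v = \begin{pmatrix} v_1\\ v_2\\ \vdots\\ v_n \end{pmatrix} \mapsto J_f(a) v = f_{x_1}(a) v_1 + f_{x_2}(a) v_2 + \cdots + f_{x_n}(a) v_n.$$

This function gives the linear (first-order) term in the asymptotic expansion:

$$f(x) = f(a) + \{f_{x_1}(a) v_1 + f_{x_2}(a) v_2 + \cdots + f_{x_n}(a) v_n\} + o(\|x - a\|)$$

where $v_1 = x_1 - a_1, v_2 = x_2 - a_2, \cdots, v_n = x_n - a_n$.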

This idea can be extended to the case of general $m$. For the map $F: U \to \mathbb{R}^m$, its Jacobian $J_F(a)$ induces the linear approximation of $F(x)$ at $x = a$.
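To make the linear approximation concrete, here is a minimal sketch that estimates the Jacobian by central differences and compares $F(a) + J_F(a)(x - a)$ with $F(x)$. The map (polar to Cartesian coordinates) and the helper names `jacobian` and `linear_approx` are assumptions chosen for illustration, not taken from the text.

```python
import math

def F(x):
    # Hypothetical map F: R^2 -> R^2 (polar to Cartesian coordinates)
    r, t = x
    return [r * math.cos(t), r * math.sin(t)]

def jacobian(F, a, h=1e-6):
    # Central-difference estimate: J[i][j] approximates d f_i / d x_j at a
    m, n = len(F(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        ap, am = list(a), list(a)
        ap[j] += h
        am[j] -= h
        Fp, Fm = F(ap), F(am)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)
    return J

def linear_approx(F, a, x):
    # F(a) + J_F(a) (x - a)
    J = jacobian(F, a)
    v = [xj - aj for xj, aj in zip(x, a)]
    return [Fa_i + sum(Jij * vj for Jij, vj in zip(row, v))
            for Fa_i, row in zip(F(a), J)]

a = [2.0, 0.5]
x = [2.01, 0.51]
approx = linear_approx(F, a, x)
# The error is of second order in ||x - a||, so it is small but nonzero
error = max(abs(p - q) for p, q in zip(approx, F(x)))
print(error)
```

Halving the displacement $x - a$ should reduce the error by roughly a factor of four, which is what a first-order approximation with a second-order remainder predicts for a smooth map.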

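Example. Let $x = 2u - v$, $y = 4u + 3v$. Then

$$\begin{pmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v}\\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}
\end{pmatrix} = \begin{pmatrix}
2 & -1\\
4 & 3
\end{pmatrix}.$$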

Let us restate the chain rule in terms of Jacobians.

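The derivative of the composite $G \circ F$ is given by (Eq:Chain). In terms of Jacobians, we have

$$\begin{pmatrix}
\frac{\partial h_1}{\partial x_1}(a) & \frac{\partial h_1}{\partial x_2}(a) & \cdots & \frac{\partial h_1}{\partial x_n}(a)\\
\frac{\partial h_2}{\partial x_1}(a) & \frac{\partial h_2}{\partial x_2}(a) & \cdots & \frac{\partial h_2}{\partial x_n}(a)\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial h_l}{\partial x_1}(a) & \frac{\partial h_l}{\partial x_2}(a) & \cdots & \frac{\partial h_l}{\partial x_n}(a)
\end{pmatrix} = \begin{pmatrix}
\frac{\partial g_1}{\partial y_1}(F(a)) & \frac{\partial g_1}{\partial y_2}(F(a)) & \cdots & \frac{\partial g_1}{\partial y_m}(F(a))\\
\frac{\partial g_2}{\partial y_1}(F(a)) & \frac{\partial g_2}{\partial y_2}(F(a)) & \cdots & \frac{\partial g_2}{\partial y_m}(F(a))\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial g_l}{\partial y_1}(F(a)) & \frac{\partial g_l}{\partial y_2}(F(a)) & \cdots & \frac{\partial g_l}{\partial y_m}(F(a))
\end{pmatrix} \begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(a) & \frac{\partial f_1}{\partial x_2}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a)\\
\frac{\partial f_2}{\partial x_1}(a) & \frac{\partial f_2}{\partial x_2}(a) & \cdots & \frac{\partial f_2}{\partial x_n}(a)\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial f_m}{\partial x_1}(a) & \frac{\partial f_m}{\partial x_2}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a)
\end{pmatrix},$$

or

$$J_{G \circ F}(a) = J_G(F(a)) J_F(a). \tag{Eq:MatChain}$$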
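The matrix form of the chain rule can be verified numerically. The sketch below takes the linear map of the earlier example, $x = 2u - v$, $y = 4u + 3v$, as $F$ and pairs it with a hypothetical map $G$ (an assumption for illustration). The helpers `jacobian` and `matmul` are illustrative names; the Jacobians are estimated by central differences.

```python
def F(p):
    # Linear map from the example: x = 2u - v, y = 4u + 3v
    u, v = p
    return [2 * u - v, 4 * u + 3 * v]

def G(q):
    # Hypothetical map G: R^2 -> R^2 (not from the text)
    x, y = q
    return [x * y, x ** 2 - y]

def jacobian(M, a, h=1e-5):
    # Central-difference estimate: J[i][j] approximates the partial
    # derivative of the i-th component of M with respect to the j-th variable
    m, n = len(M(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        ap, am = list(a), list(a)
        ap[j] += h
        am[j] -= h
        Mp, Mm = M(ap), M(am)
        for i in range(m):
            J[i][j] = (Mp[i] - Mm[i]) / (2 * h)
    return J

def matmul(A, B):
    # Plain matrix product of nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = [1.0, -1.0]
lhs = jacobian(lambda p: G(F(p)), a)             # J_{G o F}(a)
rhs = matmul(jacobian(G, F(a)), jacobian(F, a))  # J_G(F(a)) J_F(a)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(lhs, rhs, max_diff)
```

Note that $J_G$ is evaluated at $F(a)$, not at $a$; evaluating it at the wrong point is a common source of mistakes when applying (Eq:MatChain).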

(After you learn more linear algebra, you will understand the following...)

A matrix represents a linear map, and the product of matrices corresponds to the composition of the corresponding linear maps. (Eq:MatChain) says that the linear approximation $J_{G \circ F}(a)$ of the composite map is equal to the composition of the linear approximations $J_G(F(a))$ and $J_F(a)$ of the maps. In short,

The linear approximation of the composition of maps is the composition of the linear approximations of the maps.

That is, composition and linear approximation commute.

