Multivariate chain rules



Next, we consider differentiating a composite function such as f(φ(t),ψ(t)) or f(φ(u,v),ψ(u,v)). In the case of univariate functions, we have the chain rule. That is,

$$\frac{d}{dx}g(f(x)) = g'(f(x))\,f'(x).$$

We have corresponding multivariate versions.

Theorem [Chain rule (1)]

Let $f(x,y)$ be a totally differentiable function on an open region $U \subset \mathbb{R}^2$. Let $x = \varphi(t)$, $y = \psi(t)$ be differentiable functions on an open interval $I$. Suppose that $(\varphi(t), \psi(t)) \in U$ for all $t \in I$. Then the function $z = f(\varphi(t), \psi(t))$ of $t$ is differentiable on $I$, and its derivative is given by

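$$\frac{d}{dt}f(\varphi(t),\psi(t)) = \frac{\partial f}{\partial x}(\varphi(t),\psi(t))\,\frac{d\varphi}{dt}(t) + \frac{\partial f}{\partial y}(\varphi(t),\psi(t))\,\frac{d\psi}{dt}(t). \tag{Eq:ChainRule1}$$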

Remark.  Eq. (Eq:ChainRule1) may be a little too cluttered and difficult to read. We could simplify the notation by using dependent variables:

$$\frac{dz}{dt}(t) = \frac{\partial z}{\partial x}(x,y)\,\frac{dx}{dt}(t) + \frac{\partial z}{\partial y}(x,y)\,\frac{dy}{dt}(t).$$

Or, even more simply,

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$

However, Eq. (Eq:ChainRule1) is the most precise expression, because it shows explicitly where each derivative is evaluated. □

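Proof. For an arbitrary $t_0 \in I$, let $x_0 = \varphi(t_0)$ and $y_0 = \psi(t_0)$. For an arbitrary real number $\delta \neq 0$ such that $t_0 + \delta \in I$, let $h(\delta) = \varphi(t_0+\delta) - \varphi(t_0)$ and $k(\delta) = \psi(t_0+\delta) - \psi(t_0)$. We have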

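$$\lim_{\delta \to 0}\frac{h(\delta)}{\delta} = \frac{d\varphi}{dt}(t_0), \qquad \lim_{\delta \to 0}\frac{k(\delta)}{\delta} = \frac{d\psi}{dt}(t_0).$$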

Since f(x,y) is totally differentiable,

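$$\begin{aligned}
f(\varphi(t_0+\delta),\psi(t_0+\delta)) - f(\varphi(t_0),\psi(t_0)) &= f(x_0+h, y_0+k) - f(x_0,y_0)\\
&= \frac{\partial f(x_0,y_0)}{\partial x}h + \frac{\partial f(x_0,y_0)}{\partial y}k + o\left(\sqrt{h^2+k^2}\right).
\end{aligned}$$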

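As $\delta \to 0$, we have $h \to 0$ and $k \to 0$ because $\varphi$ and $\psi$ are continuous, and

$$\frac{o\left(\sqrt{h^2+k^2}\right)}{\delta} = \frac{\sqrt{h^2+k^2}}{\delta}\cdot\frac{o\left(\sqrt{h^2+k^2}\right)}{\sqrt{h^2+k^2}} \to 0.$$

Indeed, the absolute value of the first factor, $\sqrt{(h/\delta)^2 + (k/\delta)^2}$, converges to $\sqrt{\left(\frac{d\varphi}{dt}(t_0)\right)^2 + \left(\frac{d\psi}{dt}(t_0)\right)^2}$ and is therefore bounded, while the second factor tends to 0. (If $h = k = 0$, the remainder term itself is 0, so the conclusion still holds.)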

Therefore,

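$$\lim_{\delta \to 0}\frac{f(\varphi(t_0+\delta),\psi(t_0+\delta)) - f(\varphi(t_0),\psi(t_0))}{\delta} = \frac{\partial f}{\partial x}(x_0,y_0)\,\frac{d\varphi}{dt}(t_0) + \frac{\partial f}{\partial y}(x_0,y_0)\,\frac{d\psi}{dt}(t_0). \tag{Eq:Dt0}$$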

This shows that the function z=f(φ(t),ψ(t)) is differentiable at t=t0 and its differential coefficient is given by the right-hand side of Eq. (Eq:Dt0). ■

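Example. If $z = f(x,y)$, $x = at + b$, $y = ct + d$, then

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = a\,f_x(at+b, ct+d) + c\,f_y(at+b, ct+d),$$

where $f_x$ and $f_y$ denote the partial derivatives of $f$ with respect to $x$ and $y$.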
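The formula in this example can be checked numerically. The following is a minimal sketch, not part of the original derivation: the concrete function f(x,y) = x²y + sin y, the constants a, b, c, d, and the point t0 are all arbitrary choices made for illustration. It compares the chain-rule value with a central finite difference of the composite function.

```python
import math

# Hypothetical concrete choice of f for illustration (not from the text).
def f(x, y):
    return x * x * y + math.sin(y)

# Partial derivatives of the f chosen above, computed by hand.
def f_x(x, y):
    return 2.0 * x * y

def f_y(x, y):
    return x * x + math.cos(y)

# Arbitrary constants for x = a*t + b, y = c*t + d (assumptions).
a, b, c, d = 2.0, 1.0, -3.0, 0.5

def z(t):
    """Composite function z(t) = f(a*t + b, c*t + d)."""
    return f(a * t + b, c * t + d)

def dz_dt_chain(t):
    """Right-hand side of the example: a*f_x + c*f_y at (a*t + b, c*t + d)."""
    x, y = a * t + b, c * t + d
    return a * f_x(x, y) + c * f_y(x, y)

def central_difference(func, t, step=1e-6):
    """Symmetric difference quotient approximating func'(t)."""
    return (func(t + step) - func(t - step)) / (2.0 * step)

t0 = 0.3
chain_value = dz_dt_chain(t0)
numeric_value = central_difference(z, t0)
print(chain_value, numeric_value)
```

The two printed numbers should agree up to the error of the finite-difference approximation; any other smooth f with hand-computed partial derivatives could be substituted.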

Theorem [Chain rule (2)]

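Let $z = f(x,y)$ be a totally differentiable function on an open region $U \subset \mathbb{R}^2$. Let $x = \varphi(u,v)$, $y = \psi(u,v)$ be functions on an open region $V \subset \mathbb{R}^2$ such that $(\varphi(u,v), \psi(u,v)) \in U$ for all $(u,v) \in V$. Suppose that the partial derivatives $\frac{\partial \varphi}{\partial u}(u,v)$, $\frac{\partial \varphi}{\partial v}(u,v)$, $\frac{\partial \psi}{\partial u}(u,v)$, $\frac{\partial \psi}{\partial v}(u,v)$ exist on $V$. Then the function $z = f(\varphi(u,v), \psi(u,v))$ on $V$ has partial derivatives with respect to $u$ and $v$ given by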

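$$\begin{aligned}
\frac{\partial}{\partial u}f(\varphi(u,v),\psi(u,v)) &= \frac{\partial f}{\partial x}(\varphi(u,v),\psi(u,v))\,\frac{\partial \varphi}{\partial u}(u,v) + \frac{\partial f}{\partial y}(\varphi(u,v),\psi(u,v))\,\frac{\partial \psi}{\partial u}(u,v),\\
\frac{\partial}{\partial v}f(\varphi(u,v),\psi(u,v)) &= \frac{\partial f}{\partial x}(\varphi(u,v),\psi(u,v))\,\frac{\partial \varphi}{\partial v}(u,v) + \frac{\partial f}{\partial y}(\varphi(u,v),\psi(u,v))\,\frac{\partial \psi}{\partial v}(u,v).
\end{aligned}$$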

Remark. In a more simplified notation, the above partial derivatives may be written as

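$$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}, \qquad \frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}.$$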

Proof. The partial differential coefficient

$$\frac{\partial z(u_0,v_0)}{\partial u}$$

at $(u,v) = (u_0,v_0)$ is the same as the differential coefficient of the "one-variable" function $z = f(\varphi(u,v_0), \psi(u,v_0))$ at $u = u_0$, where $v = v_0$ is held fixed. Hence we can apply the above Theorem [Chain rule (1)]. Noting that the differential coefficients of the one-variable functions $\varphi(u,v_0)$ and $\psi(u,v_0)$ at $u = u_0$ are $\frac{\partial \varphi(u_0,v_0)}{\partial u}$ and $\frac{\partial \psi(u_0,v_0)}{\partial u}$, respectively, we obtain the claimed result. The same argument applies to the partial derivative with respect to $v$. ■

Example. Let $f(x,y) = e^{x^2+y^2}$ and $\varphi(u,v) = u\cos v$, $\psi(u,v) = u\sin v$. Let $g(u,v) = f(\varphi(u,v), \psi(u,v))$. Find $g_u(u,v)$ and $g_v(u,v)$.

Let $x = u\cos v$ and $y = u\sin v$. Then $x^2 + y^2 = u^2$. Thus,

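$$\begin{aligned}
g_u(u,v) &= \frac{\partial}{\partial u}f(\varphi(u,v),\psi(u,v))\\
&= \frac{\partial f}{\partial x}(\varphi(u,v),\psi(u,v))\,\frac{\partial \varphi}{\partial u}(u,v) + \frac{\partial f}{\partial y}(\varphi(u,v),\psi(u,v))\,\frac{\partial \psi}{\partial u}(u,v)\\
&= \left[e^{u^2}\cdot 2u\cos v\right]\cos v + \left[e^{u^2}\cdot 2u\sin v\right]\sin v\\
&= 2u e^{u^2},\\
g_v(u,v) &= \frac{\partial}{\partial v}f(\varphi(u,v),\psi(u,v))\\
&= \frac{\partial f}{\partial x}(\varphi(u,v),\psi(u,v))\,\frac{\partial \varphi}{\partial v}(u,v) + \frac{\partial f}{\partial y}(\varphi(u,v),\psi(u,v))\,\frac{\partial \psi}{\partial v}(u,v)\\
&= \left[e^{u^2}\cdot 2u\cos v\right](-u\sin v) + \left[e^{u^2}\cdot 2u\sin v\right](u\cos v)\\
&= 0.
\end{aligned}$$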
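As a sanity check on this example, the closed forms 2u·exp(u²) and 0 can be compared with finite-difference approximations of the partial derivatives of g. This is only a sketch: the evaluation point (u0, v0), the step size, and the helper names `partial_u` and `partial_v` are assumptions introduced for illustration.

```python
import math

def f(x, y):
    """f(x, y) = exp(x^2 + y^2), as in the example."""
    return math.exp(x * x + y * y)

def phi(u, v):
    return u * math.cos(v)

def psi(u, v):
    return u * math.sin(v)

def g(u, v):
    """Composite function g(u, v) = f(phi(u, v), psi(u, v))."""
    return f(phi(u, v), psi(u, v))

def partial_u(func, u, v, step=1e-6):
    """Central difference in u with v held fixed (hypothetical helper)."""
    return (func(u + step, v) - func(u - step, v)) / (2.0 * step)

def partial_v(func, u, v, step=1e-6):
    """Central difference in v with u held fixed (hypothetical helper)."""
    return (func(u, v + step) - func(u, v - step)) / (2.0 * step)

# Arbitrary evaluation point (assumption).
u0, v0 = 0.7, 1.1

g_u_closed = 2.0 * u0 * math.exp(u0 * u0)  # closed form from the example
g_v_closed = 0.0                           # closed form from the example

g_u_numeric = partial_u(g, u0, v0)
g_v_numeric = partial_v(g, u0, v0)
print(g_u_closed, g_u_numeric)
print(g_v_closed, g_v_numeric)
```

Holding one variable fixed while differencing in the other mirrors the argument in the proof of Chain rule (2), where the partial derivative is treated as an ordinary derivative of a one-variable function.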




