Method of Lagrange multipliers

We have studied how to identify extreme values of two-variable functions f(x,y). In practice, we may have additional constraints. For example,

Find the extreme values of f(x,y) subject to the constraint g(x,y)=0.

We can use the method of Lagrange multipliers to solve this type of problem.



Let's consider the following example.

Problem. If $(x,y)$ moves on the unit circle $x^2+y^2=1$, find the extreme values of $f(x,y)=x^2+xy+y^2$. □

This problem can be restated as follows: 

Problem (restated). Let $g(x,y)=x^2+y^2-1$. Find the extreme values of $f(x,y)=x^2+xy+y^2$ subject to the constraint $g(x,y)=0$. □

Let's solve this problem using an "explicit" method.

Solution 1 (Explicit method). Consider the implicit function $y=\sqrt{1-x^2}$ of $g(x,y)=0$ on the open interval $(-1,1)$ and substitute it into $f(x,y)$ to obtain

$$h(x)=f(x,y(x))=x^2+x\sqrt{1-x^2}+\left(\sqrt{1-x^2}\right)^2=x\sqrt{1-x^2}+1.$$

Since

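$$h'(x)=\frac{1-2x^2}{\sqrt{1-x^2}},$$

$h(x)$ has a local minimum at $x=-\frac{1}{\sqrt{2}}$ and a local maximum at $x=\frac{1}{\sqrt{2}}$. Therefore, $f(x,y)$ has a local minimum value $\frac{1}{2}$ at the point $\left(-\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}\right)$, and a local maximum value $\frac{3}{2}$ at the point $\left(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}\right)$ on the unit circle.

Similarly, using the implicit function $y=-\sqrt{1-x^2}$ on the open interval $(-1,1)$, we can define $h(x)=-x\sqrt{1-x^2}+1$ and find that $f(x,y)$ has a local minimum value $\frac{1}{2}$ at $\left(\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}}\right)$ and a local maximum value $\frac{3}{2}$ at $\left(-\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}}\right)$.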

Next, we examine the neighborhoods of the points $(1,0)$ and $(-1,0)$, which are not covered by the two branches above. In these cases, we consider the implicit functions $x=\sqrt{1-y^2}$ (the branch passing through $(1,0)$) and $x=-\sqrt{1-y^2}$ (the branch passing through $(-1,0)$), and find that the same points as above give the local minima and maxima.

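In summary, $f(x,y)$ has a local minimum value $\frac{1}{2}$ at $(x,y)=\left(\pm\frac{1}{\sqrt{2}},\mp\frac{1}{\sqrt{2}}\right)$, and a local maximum value $\frac{3}{2}$ at $(x,y)=\left(\pm\frac{1}{\sqrt{2}},\pm\frac{1}{\sqrt{2}}\right)$. □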
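The result of Solution 1 can be sanity-checked numerically. The following is a minimal sketch, not part of the original solution: it assumes the parametrization $(x,y)=(\cos t,\sin t)$ of the unit circle and a hypothetical helper `scan_circle` that simply samples $f$ at equally spaced angles and reports the smallest and largest sampled values.

```python
import math

def f(x, y):
    """Objective function f(x, y) = x^2 + xy + y^2 from the problem."""
    return x * x + x * y + y * y

def scan_circle(n=3600):
    """Sample f at n equally spaced points of the unit circle.

    Returns two tuples (value, x, y): the smallest and the largest sample.
    This is a brute-force check, not a proof of extremality.
    """
    samples = []
    for k in range(n):
        t = 2 * math.pi * k / n
        x, y = math.cos(t), math.sin(t)
        samples.append((f(x, y), x, y))
    return min(samples), max(samples)

lo, hi = scan_circle()
print("smallest sample: f=%.4f at (%.4f, %.4f)" % lo)
print("largest sample:  f=%.4f at (%.4f, %.4f)" % hi)
```

The sampled minimum should sit where $x$ and $y$ have opposite signs and the sampled maximum where they have the same sign, in agreement with the summary above.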

The above solution is explicit and easy to understand. However, this method applies only when the implicit functions of g(x,y)=0 can be obtained explicitly. If that is not the case, the following method of Lagrange (undetermined) multipliers provides an effective approach. We first show how to solve the above problem using the method of Lagrange multipliers. We will prove the method in general later.

Solution 2 (Lagrange multiplier). First, let us define a new function

$$F(x,y,\lambda)=f(x,y)-\lambda g(x,y)=(x^2+xy+y^2)-\lambda(x^2+y^2-1)$$

where $\lambda$ is a new variable (undetermined constant) called the Lagrange multiplier. Note that we have $F(x,y,\lambda)=f(x,y)$ whenever the constraint $g(x,y)=0$ is satisfied. We now consider the extreme value problem of this function.

Next, we differentiate $F(x,y,\lambda)$ with respect to $x$ and $y$ and solve

$$F_x(x,y,\lambda)=2(1-\lambda)x+y=0, \tag{Eq:lagx}$$
$$F_y(x,y,\lambda)=x+2(1-\lambda)y=0. \tag{Eq:lagy}$$

Furthermore,

$$F_\lambda(x,y,\lambda)=-(x^2+y^2-1)=0 \tag{Eq:lambda}$$

should also be satisfied. The latter equation implies, in particular, that $(x,y)\neq(0,0)$ (i.e., $x$ and $y$ cannot be simultaneously equal to 0). However, from (Eq:lagx) and (Eq:lagy) above, if $x=0$, then $y=0$ and vice versa, which is a contradiction. Thus, both $x$ and $y$ are non-zero. Eliminating, for example, $y$ from (Eq:lagx) and (Eq:lagy) and rearranging gives

$$x(2\lambda-1)(2\lambda-3)=0.$$
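The rearrangement is not spelled out in the text; one way to carry it out (a sketch that solves (Eq:lagx) for $y$ and substitutes into (Eq:lagy)) is:

```latex
\begin{align*}
y &= -2(1-\lambda)x
  && \text{from (Eq:lagx)} \\
0 &= x + 2(1-\lambda)\bigl(-2(1-\lambda)x\bigr)
   = x\bigl[1-4(1-\lambda)^2\bigr]
  && \text{substituted into (Eq:lagy)} \\
  &= x\bigl[1-2(1-\lambda)\bigr]\bigl[1+2(1-\lambda)\bigr]
   = x(2\lambda-1)(3-2\lambda) \\
  &= -x(2\lambda-1)(2\lambda-3).
\end{align*}
```

Multiplying through by $-1$ gives the factored equation above.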

Since $x\neq 0$, we have $\lambda=\frac{1}{2},\frac{3}{2}$. For each of these values of $\lambda$, we solve the simultaneous equations (Eq:lagx) and (Eq:lagy) for $x$ and $y$:

  • For $\lambda=\frac{1}{2}$, we have $x=-y$.
  • For $\lambda=\frac{3}{2}$, we have $x=y$.

Combining with (Eq:lambda), we have

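  • For $\lambda=\frac{1}{2}$, we have $(x,y)=\left(\pm\frac{1}{\sqrt{2}},\mp\frac{1}{\sqrt{2}}\right)$.
  • For $\lambda=\frac{3}{2}$, we have $(x,y)=\left(\pm\frac{1}{\sqrt{2}},\pm\frac{1}{\sqrt{2}}\right)$.

In order to see if $f(x,y)$ indeed has extreme values at these points, let us examine its behavior in the neighborhood of each of these points $(a,b)=\left(\pm\frac{1}{\sqrt{2}},\mp\frac{1}{\sqrt{2}}\right),\left(\pm\frac{1}{\sqrt{2}},\pm\frac{1}{\sqrt{2}}\right)$.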

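Since $b\neq 0$, each point on the unit circle $g(x,y)=0$ near $(a,b)$ can be expressed as $(x,\varphi(x))$ where $y=\varphi(x)$ is an implicit function of $g(x,y)=0$. Thus, we need to solve the extreme value problem of
$$h(x)=f(x,\varphi(x)).$$
Differentiating the equation of constraint $g(x,\varphi(x))=0$, or
$$x^2+\{\varphi(x)\}^2-1=0$$
with respect to $x$, we have
$$2x+2\varphi(x)\varphi'(x)=0.$$
From this, we have
$$\varphi'(x)=-\frac{x}{\varphi(x)}.$$
Using the multivariate chain rule, we obtain
$$h'(x)=f_x(x,\varphi(x))+f_y(x,\varphi(x))\left(-\frac{x}{\varphi(x)}\right)=\frac{1-2x^2}{\varphi(x)},$$
$$h''(x)=\frac{-4x\varphi(x)-(1-2x^2)\varphi'(x)}{\{\varphi(x)\}^2}=\frac{-2x^3+\left[1-4\{\varphi(x)\}^2\right]x}{\{\varphi(x)\}^3}.$$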
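The simplification of the first derivative skips a few steps. As a sketch of the omitted algebra, using only $f_x=2x+y$ and $f_y=x+2y$ for this example together with the constraint $\{\varphi(x)\}^2=1-x^2$:

```latex
\begin{align*}
h'(x) &= \bigl(2x+\varphi(x)\bigr) + \bigl(x+2\varphi(x)\bigr)\left(-\frac{x}{\varphi(x)}\right) \\
      &= 2x + \varphi(x) - \frac{x^2}{\varphi(x)} - 2x \\
      &= \frac{\{\varphi(x)\}^2 - x^2}{\varphi(x)}
       = \frac{1-2x^2}{\varphi(x)}.
\end{align*}
```

The second derivative then follows from the quotient rule, with $\varphi'(x)=-x/\varphi(x)$ substituted once more.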

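If $(x,\varphi(x))=\left(\pm\frac{1}{\sqrt{2}},\mp\frac{1}{\sqrt{2}}\right)$ (corresponding to $\lambda=\frac{1}{2}$), we have $h'(x)=0$ and $h''(x)=4>0$, so $h(x)$ has a local minimum value $\frac{1}{2}$.

If $(x,\varphi(x))=\left(\pm\frac{1}{\sqrt{2}},\pm\frac{1}{\sqrt{2}}\right)$ (corresponding to $\lambda=\frac{3}{2}$), we have $h'(x)=0$ and $h''(x)=-4<0$, so $h(x)$ has a local maximum value $\frac{3}{2}$. □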
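The four candidates and their classification can be checked with a few lines of arithmetic. This is only an illustrative sketch: the helper names `grad_F`, `h_prime` and `h_second` are made up here, and they hard-code the formulas for $F_x$, $F_y$, $h'$ and $h''$ derived above for this particular $f$ and $g$.

```python
import math

def grad_F(x, y, lam):
    """(F_x, F_y) for F = f - lam*g with f = x^2+xy+y^2, g = x^2+y^2-1."""
    return (2 * (1 - lam) * x + y, x + 2 * (1 - lam) * y)

def h_prime(x, phi):
    """h'(x) = (1 - 2x^2) / phi(x), valid on the unit circle with phi != 0."""
    return (1 - 2 * x**2) / phi

def h_second(x, phi):
    """h''(x) = (-2x^3 + (1 - 4 phi^2) x) / phi^3, as derived in the text."""
    return (-2 * x**3 + (1 - 4 * phi**2) * x) / phi**3

r = 1 / math.sqrt(2)
# (x, y, lambda) for the four candidate points found in Solution 2.
candidates = [(r, -r, 0.5), (-r, r, 0.5), (r, r, 1.5), (-r, -r, 1.5)]

report = []
for x, y, lam in candidates:
    Fx, Fy = grad_F(x, y, lam)
    d1, d2 = h_prime(x, y), h_second(x, y)
    # Second-derivative test; d2 == 0 would be inconclusive but does not occur here.
    kind = "local minimum" if d2 > 0 else "local maximum"
    value = x * x + x * y + y * y
    report.append((lam, kind, value, Fx, Fy, d1, d2))
    print(f"lambda={lam}: ({x:+.4f}, {y:+.4f}) is a {kind}, f={value:.4f}, h''={d2:+.4f}")
```

Note that the code evaluates $h'$ and $h''$ using only the value $\varphi(x)=y$ at each candidate, never a formula for $\varphi$ itself.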

It should be stressed that, using the method of Lagrange multipliers, we did not need to know the explicit form of the implicit function $\varphi(x)$. That is, the implicit function remains, in fact, implicit. The mere existence of the implicit function $\varphi(x)$ is sufficient. Thus, this method is applicable even when the implicit function is hard, or even impossible, to find explicitly.

Remark. In general, constrained extreme value problems can be solved by the following procedure:
  1. Enumerate the possible extreme points by the method of Lagrange multipliers.
  2. For each possible extreme point, consider the implicit function in the neighborhood of the point and evaluate the first and second derivatives.

We now give the method of Lagrange multipliers as a theorem.

Theorem [Method of Lagrange multipliers]

Let $f(x,y)$ and $g(x,y)$ be functions of class $C^1$. With a new variable $\lambda$, let us define
$$F(x,y,\lambda)=f(x,y)-\lambda g(x,y).$$
Suppose the point (a,b) satisfies the following conditions:
  1. The function f(x,y) has an extreme value at (x,y)=(a,b) subject to the constraint g(x,y)=0 (in particular, g(a,b)=0);
  2. $(a,b)$ is a regular point of $g(x,y)=0$ (i.e., it is not the case that $g_x(a,b)=g_y(a,b)=0$).
Then, there exists $\alpha\in\mathbb{R}$ such that
$$F_x(a,b,\alpha)=F_y(a,b,\alpha)=F_\lambda(a,b,\alpha)=0.$$
Proof. By condition (1), $F_\lambda(a,b,\alpha)=-g(a,b)=0$. Thus, it suffices to show that there exists $\alpha\in\mathbb{R}$ such that
$$F_x(a,b,\alpha)=f_x(a,b)-\alpha g_x(a,b)=0, \tag{Eq:lagax}$$
$$F_y(a,b,\alpha)=f_y(a,b)-\alpha g_y(a,b)=0. \tag{Eq:lagay}$$

By condition (2), $g_x(a,b)\neq 0$ or $g_y(a,b)\neq 0$. Without loss of generality (what does this mean?), we may assume $g_y(a,b)\neq 0$. By the Implicit Function Theorem, there exists a function $y=\varphi(x)$ in a neighborhood of $x=a$ such that $b=\varphi(a)$ and $g(x,\varphi(x))=0$. In the neighborhood of $(a,b)$, any point satisfying $g(x,y)=0$ is of the form $(x,\varphi(x))$. Therefore, we need to solve the extreme value problem of the univariate function $f(x,\varphi(x))$ in the neighborhood of $x=a$. Note that
$$\varphi'(x)=-\frac{g_x(x,\varphi(x))}{g_y(x,\varphi(x))}.$$
Differentiating $f(x,\varphi(x))$ with respect to $x$, we have
$$f_x(x,\varphi(x))+f_y(x,\varphi(x))\varphi'(x)=f_x(x,\varphi(x))-f_y(x,\varphi(x))\frac{g_x(x,\varphi(x))}{g_y(x,\varphi(x))}.$$
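By assumption, $f(x,\varphi(x))$ has an extreme value at $x=a$, so its derivative vanishes there:
$$f_x(a,\varphi(a))-f_y(a,\varphi(a))\frac{g_x(a,\varphi(a))}{g_y(a,\varphi(a))}=0.$$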
But $b=\varphi(a)$ so that
$$f_x(a,b)-f_y(a,b)\frac{g_x(a,b)}{g_y(a,b)}=0.$$
It follows that
$$f_x(a,b)g_y(a,b)-f_y(a,b)g_x(a,b)=0.$$
Based on this, let us define
$$\alpha=\frac{f_y(a,b)}{g_y(a,b)}.$$
Clearly, this $\alpha$ satisfies (Eq:lagax) and (Eq:lagay). ■
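To connect the proof back to the worked example, the multiplier $\alpha=f_y(a,b)/g_y(a,b)$ can be evaluated at concrete points of the unit circle. The sketch below is illustrative only: the function `multiplier` is a hypothetical helper, and the gradients are hard-coded for $f=x^2+xy+y^2$ and $g=x^2+y^2-1$. It returns $\alpha$ together with the residuals of (Eq:lagax) and (Eq:lagay).

```python
import math

def grad_f(x, y):
    """(f_x, f_y) for f = x^2 + xy + y^2."""
    return (2 * x + y, x + 2 * y)

def grad_g(x, y):
    """(g_x, g_y) for g = x^2 + y^2 - 1."""
    return (2 * x, 2 * y)

def multiplier(a, b):
    """Return (alpha, residual_x, residual_y) with alpha = f_y / g_y.

    Assumes g_y(a, b) != 0, as in the proof; residual_y is zero by construction,
    so residual_x is the quantity that decides whether (a, b) is a candidate.
    """
    fx, fy = grad_f(a, b)
    gx, gy = grad_g(a, b)
    if gy == 0:
        raise ValueError("g_y(a, b) = 0: swap the roles of x and y")
    alpha = fy / gy
    return alpha, fx - alpha * gx, fy - alpha * gy

r = 1 / math.sqrt(2)
at_max = multiplier(r, r)          # constrained local maximum
at_min = multiplier(r, -r)         # constrained local minimum
elsewhere = multiplier(0.6, 0.8)   # on the circle, but not an extreme point
print(at_max, at_min, elsewhere)
```

At the two extreme points the value of $\alpha$ should coincide with the $\lambda$ found in Solution 2 and both residuals should vanish, whereas at the generic point $(0.6, 0.8)$ the residual of (Eq:lagax) is non-zero, so no multiplier exists there.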






